The recent increase in the amount of data generated, stored and analyzed by insurers to set their pricing and underwriting policies has created new needs, both from a regulatory point of view, with the recent implementation of the General Data Protection Regulation (GDPR) in the European framework, and with a view to offering new services on the market (cyber risk). This paper is therefore devoted to the development and analysis of actuarial methods within the security-by-default framework, a principle that the GDPR imposes on companies using personal data. The objective is to extend the elementary mathematical concepts and models used to build non-life insurance pricing models (simple linear regression and generalized linear models) so that they can be applied to data secured in accordance with regulatory requirements. We start by defining the framework of our study, specifying the regulatory and theoretical contexts of our problem. We then focus on developing an encryption procedure that performs a simple linear regression on encrypted data without ever decrypting them during the process. In other words, the regression is computed on a pre-encrypted database (here using the Efficient Integer Vector Homomorphic Encryption and Fan & Vercauteren schemes) without knowledge of the decryption keys, so that only the data owner can decrypt the results. In a second step, we turn to an alternative to data encryption: anonymization of the insured portfolio by aggregating policies with unsupervised learning methods (OPTICS, K-Means, etc.). For each cluster we obtain a new anonymous individual that is representative of its group. Our idea is then to price a motor civil liability insurance product on the data thus secured.
To analyze the performance of this anonymization process, we will compare the resulting prices with those obtained from the same pricing model fitted on non-anonymized data.
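The anonymize-then-compare idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the portfolio is synthetic, the only rating variable is a hypothetical driver age, the clustering is a plain one-dimensional k-means (rather than OPTICS), and the pricing model is reduced to a simple linear regression instead of a GLM. Each cluster is replaced by one representative (mean age, mean claim cost) weighted by the cluster size, and the line fitted on the representatives is compared with the line fitted on the raw policies.

```python
import random

def fit_line(xs, ys, ws=None):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    ws = ws or [1.0] * len(xs)
    w = sum(ws)
    mx = sum(wi * xi for wi, xi in zip(ws, xs)) / w
    my = sum(wi * yi for wi, yi in zip(ws, ys)) / w
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(ws, xs, ys))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(ws, xs))
    b = sxy / sxx
    return my - b * mx, b

def kmeans_1d(xs, k, iters=20):
    """Lloyd's algorithm on a single feature; returns one cluster label per point."""
    srt = sorted(xs)
    # Deterministic initialisation: centers placed at evenly spaced quantiles.
    centers = [srt[(2 * j + 1) * len(xs) // (2 * k)] for j in range(k)]
    labels = [0] * len(xs)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: abs(x - centers[j])) for x in xs]
        for j in range(k):
            members = [x for x, lab in zip(xs, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)
    return labels

def anonymize(xs, ys, labels):
    """Replace each non-empty cluster by (mean x, mean y, cluster size)."""
    reps = []
    for j in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == j]
        n = len(idx)
        reps.append((sum(xs[i] for i in idx) / n, sum(ys[i] for i in idx) / n, n))
    return reps

# Synthetic portfolio (hypothetical): claim cost decreases linearly with age, plus noise.
random.seed(0)
ages = [random.uniform(18, 80) for _ in range(2000)]
costs = [900 - 5 * age + random.gauss(0, 100) for age in ages]

# Reference model on the raw, non-anonymized policies.
a_raw, b_raw = fit_line(ages, costs)

# Same model on the anonymized portfolio: one weighted representative per cluster.
reps = anonymize(ages, costs, kmeans_1d(ages, k=20))
a_anon, b_anon = fit_line([r[0] for r in reps], [r[1] for r in reps],
                          [float(r[2]) for r in reps])

print(f"raw:        intercept={a_raw:.2f}, slope={b_raw:.3f}")
print(f"anonymized: intercept={a_anon:.2f}, slope={b_anon:.3f}")
```

Weighting each representative by its cluster size keeps the weighted means of the anonymized data equal to the means of the raw data, so any gap between the two fitted lines comes only from the within-cluster variation that aggregation discards. Fewer, larger clusters give stronger anonymization but a larger gap.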