The increase in healthcare costs is one of the most important global problems, with a strong impact on the insurance industry and on the health sector in particular, since by definition the insurer receives premiums from its clients in advance, and those premiums can generate future liabilities. Insurance premiums must increase in response to customers' growing need for access to healthcare, which creates problems for current pricing methods in situations where companies lack data. It is therefore necessary to evaluate the predictive capacity of the traditional statistical pricing models, Bootstrap and Generalized Linear Models, in scenarios where there is little experience, and to analyze the introduction of new methods into the pricing process. This article, based on the CRISP-DM methodology, introduces supervised machine learning techniques into the pricing process, applied to 6 years of healthcare cost data for 500,000 Company customers. The models include demographic, socioeconomic and lifestyle information, with the objective of identifying risk-differentiating variables and establishing patterns in the evolution of healthcare costs in the insurance industry and their impact on pricing. The models were trained on a sample of 300,000 customers and 5 years of data. Predictive capacity was evaluated on the remaining sample, and the results were compared with those of the traditional statistical models. These results represent an application of data science and big data in an increasingly dynamic and demanding sector, allowing the insurer to analyze and manage risk on the basis of more information and more robust models, safeguarding financial stability while responding to the client's needs.

JEL Classification: G22, I11, D80, E39, I19, C49

Keywords: Insurance Activity, Health Insurance, Risk, Pricing, Healthcare Costs, CRISP-DM, Supervised machine learning
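
The evaluation design described in the abstract (training on a subset of customers and years, then testing on the remaining sample) can be illustrated with a minimal sketch. It is not the article's implementation. The abstract does not say whether the remaining sample means the other customers, the sixth year, or both, so the sketch assumes that every record outside the training block is held out. The record fields (`customer_id`, `year`, `age_band`, `cost`), the synthetic data, and the band-average baseline are all hypothetical and stand in for the real models.

```python
import random
from statistics import mean

def split_records(records, train_ids, last_train_year):
    """Training rows: customers in train_ids, years up to last_train_year.
    Every other row is held out (an assumption, see the text above)."""
    train, holdout = [], []
    for rec in records:
        in_train = rec["customer_id"] in train_ids and rec["year"] <= last_train_year
        (train if in_train else holdout).append(rec)
    return train, holdout

def fit_band_means(train):
    """Illustrative baseline only: average annual cost per age band."""
    by_band = {}
    for rec in train:
        by_band.setdefault(rec["age_band"], []).append(rec["cost"])
    overall = mean(rec["cost"] for rec in train)
    return {band: mean(costs) for band, costs in by_band.items()}, overall

def mean_absolute_error(model, holdout):
    """Average absolute gap between observed and predicted cost on held-out rows."""
    band_means, overall = model
    return mean(
        abs(rec["cost"] - band_means.get(rec["age_band"], overall))
        for rec in holdout
    )

# Synthetic data: 50 hypothetical customers observed over 6 years.
rng = random.Random(0)
records = [
    {
        "customer_id": cid,
        "year": year,
        "age_band": cid % 3,
        "cost": 100.0 * (cid % 3 + 1) + rng.uniform(-10, 10),
    }
    for cid in range(50)
    for year in range(1, 7)
]

train_ids = set(range(30))  # 3/5 of customers, mirroring 300,000 of 500,000
train, holdout = split_records(records, train_ids, last_train_year=5)
model = fit_band_means(train)
mae = mean_absolute_error(model, holdout)
print(f"train rows: {len(train)}, holdout rows: {len(holdout)}, MAE: {mae:.2f}")
```

Splitting by customer as well as by year keeps each held-out customer's history out of training, which is one way to mimic pricing for portfolios with little experience. Any candidate model, whether a GLM or a supervised learner, could replace the band-average baseline and be scored with the same held-out error.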