Early warning model with machine learning for Turkish Insurance Sector


Thesis Type: Master's

Institution: Middle East Technical University (Orta Doğu Teknik Üniversitesi), Institute of Applied Mathematics, Department of Financial Mathematics, Turkey

Thesis Approval Date: 2019

Thesis Language: English

Student: GÜNAY BURAK KOÇER

Advisor: Sevtap Ayşe Kestel

Abstract:

Early warning models are needed so that relevant stakeholders do not make outcomes worse by ignoring risks. In the insurance sector, the central risk concerns a company's ability to meet its obligations and remain sustainable. In this study, an early warning model is built from ratios derived from the financial statements of insurance companies. The goal of the model is to identify risk areas, support the strengthening of companies' financial structure, and enable timely measures. Since the data consist of annual periods, only non-life insurance companies are included in the analysis. The annual balance sheets and income statements published by the Insurance Association of Turkey (IAT) and the annual reports on insurance and private pension activities published by the Republic of Turkey Ministry of Treasury and Finance (TRMTF) are examined company by company, and the ratios used in the model are calculated from them. The data set comprises 70 financial ratios obtained from the financial statements of all non-life insurance companies operating between 2011 and 2018. The ratios are classified under credit, liquidity, market, reinsurance, underwriting, technical provisions, reputational, operational, profitability, and capital risks. In the developed model, the values realized in 2018 are estimated with machine learning methods using the data from 2011-2017. Random Forest, Neural Networks, Gradient Boosting Machine, and Extreme Gradient Boosting are used as analysis methods, and Boruta is used for feature selection. The capital requirement ratio is chosen as the dependent variable. The other 69 ratios are the independent variables, and the Boruta method reduces this set to 22 independent variables. Analyses are conducted on both data sets, with 69 and 22 independent variables, and the results are compared.
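The time-based split described above (fit on 2011-2017, estimate 2018) can be sketched as follows. This is only a minimal illustration under stated assumptions: the column names, the two toy companies, and the two toy ratios are hypothetical and do not come from the thesis, and no learning method is fitted here.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical column names; the abstract does not list the actual ratio names.
TARGET = "capital_requirement_ratio"
META = ("company", "year")


def temporal_split(
    rows: Sequence[Dict[str, object]],
    train_years: Sequence[int] = tuple(range(2011, 2018)),
    test_year: int = 2018,
) -> Tuple[List[dict], List[dict]]:
    """Split company-year rows so the test year never appears in training."""
    train = [r for r in rows if r["year"] in train_years]
    test = [r for r in rows if r["year"] == test_year]
    return train, test


def to_xy(rows: Sequence[dict], features: Sequence[str]):
    """Return (feature matrix, target vector) as plain Python lists."""
    X = [[float(r[f]) for f in features] for r in rows]
    y = [float(r[TARGET]) for r in rows]
    return X, y


# Toy data (made up): two companies, two ratios, years 2011-2018.
rows = [
    {
        "company": name,
        "year": year,
        "liquidity_ratio": 1.0 + 0.01 * (year - 2011),
        "loss_ratio": 0.7,
        TARGET: 1.2 + 0.05 * i,
    }
    for i, name in enumerate(["A", "B"])
    for year in range(2011, 2019)
]

# Every column that is neither metadata nor the target is an independent variable.
features = [k for k in rows[0] if k not in META and k != TARGET]

train, test = temporal_split(rows)
X_train, y_train = to_xy(train, features)
X_test, y_test = to_xy(test, features)
print(len(X_train), len(X_test), features)
```

Splitting by year rather than at random keeps 2018 information out of the training data, which matches the forecasting setup the abstract describes.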
Furthermore, the 2018 values are predicted for the 38 non-life insurance companies in the data set, each company is assigned to a stage based on those values, and the predicted stages are compared with the actual stages. The highest accuracy, 87%, is achieved by the Random Forest and Gradient Boosting Machine methods. Predictive power decreases with 22 independent variables, but the results remain close to those obtained with 69. The machine learning models are then compared on capital adequacy classification. In this comparison, the best-performing method is Neural Networks, with 95% accuracy.
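The stage-matching evaluation above can be illustrated with a small sketch. The cut-off values, the number of stages, and the sample ratios below are hypothetical, since the abstract does not state how stages are defined; the code only shows how predicted and actual ratios might be mapped to stages and compared.

```python
from bisect import bisect_right
from typing import Sequence

# Hypothetical cut-offs on the capital requirement ratio (not from the thesis):
# ratio < 1.0 -> stage 0, 1.0 <= ratio < 1.15 -> stage 1, ratio >= 1.15 -> stage 2.
STAGE_CUTOFFS = (1.0, 1.15)


def to_stage(ratio: float, cutoffs: Sequence[float] = STAGE_CUTOFFS) -> int:
    """Map a ratio to the index of the interval it falls in."""
    return bisect_right(cutoffs, ratio)


def stage_accuracy(
    actual: Sequence[float],
    predicted: Sequence[float],
    cutoffs: Sequence[float] = STAGE_CUTOFFS,
) -> float:
    """Share of companies whose predicted stage equals the actual stage."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")
    hits = sum(
        to_stage(a, cutoffs) == to_stage(p, cutoffs)
        for a, p in zip(actual, predicted)
    )
    return hits / len(actual)


# Made-up 2018 ratios for four companies.
actual = [0.92, 1.05, 1.30, 1.10]
predicted = [0.97, 1.18, 1.25, 1.12]
print(stage_accuracy(actual, predicted))  # 0.75: three of four stages match
```

Comparing stages rather than raw ratios means a prediction counts as correct whenever it lands in the same band as the realized value, even if the numeric error is not zero.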