CEU eTD Collection (2025); Shaken Abylaikhan: Modelling Probability of Default Using Data on Hungarian Companies

CEU Electronic Theses and Dissertations, 2025
Author Shaken Abylaikhan
Title Modelling Probability of Default Using Data on Hungarian Companies
Summary This capstone project develops a machine learning framework to predict financial distress among Hungarian companies using firm-level accounting data from 2014 to 2023. Unlike traditional credit scoring models that rely on fixed formulas and rare default events, the project introduces a custom distress label based on four key financial ratios: profit to assets, EBITDA margin, revenue growth, and equity to assets. Firms in the bottom 30% of a standardized composite score are labeled as distressed.
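The labeling rule described above can be illustrated with a minimal sketch. The summary does not say how the four ratios are combined or whether standardization is done per year, so this sketch assumes an equal-weighted mean of z-scores computed over the whole sample. The field names and the `label_distress` helper are hypothetical and are not taken from the thesis.

```python
from statistics import mean, pstdev

# Hypothetical field names for the four ratios named in the summary.
RATIOS = ("profit_to_assets", "ebitda_margin", "revenue_growth", "equity_to_assets")


def label_distress(firms, share=0.30):
    """Return a 0/1 distress flag per firm, in input order.

    Assumption: the composite score is the equal-weighted mean of the
    z-scores of the four ratios; the lowest `share` of firms is flagged.
    """
    n = len(firms)
    z = {}
    for ratio in RATIOS:
        values = [firm[ratio] for firm in firms]
        mu, sigma = mean(values), pstdev(values)
        # A ratio with no variation carries no ranking information.
        z[ratio] = [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]

    scores = [mean(z[ratio][i] for ratio in RATIOS) for i in range(n)]

    # Rank firms from weakest to strongest composite score.
    order = sorted(range(n), key=lambda i: scores[i])
    flagged = set(order[: round(share * n)])
    return [1 if i in flagged else 0 for i in range(n)]
```

With ten firms, the three with the lowest composite scores would be labeled as distressed. A per-year standardization, if that is what the thesis uses, would apply the same function to each year's firms separately.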
The model training and evaluation follow a strict out-of-sample design, with the training period covering 2014–2019 and the testing period set to 2022–2023. The study applies logistic regression, random forest, XGBoost, and a soft voting ensemble to predict distress, using non-leaky features such as liquidity, liabilities, and firm age. Evaluation metrics include ROC AUC, precision, recall, and F1-score.
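The out-of-sample design and the soft voting step can be sketched as follows. This is an illustration under stated assumptions, not the thesis code: the row layout (a `year` key), the function names, and the 0.5 classification threshold are all hypothetical, and the per-model probabilities are taken as given rather than produced by fitted models.

```python
def temporal_split(rows, train_years=(2014, 2019), test_years=(2022, 2023)):
    """Split firm-year rows by calendar year (bounds inclusive).

    Rows from years outside both windows end up in neither set.
    """
    train = [r for r in rows if train_years[0] <= r["year"] <= train_years[1]]
    test = [r for r in rows if test_years[0] <= r["year"] <= test_years[1]]
    return train, test


def soft_vote(prob_lists, weights=None):
    """Combine models by (weighted) averaging of their predicted probabilities.

    `prob_lists` holds one list of distress probabilities per model,
    all aligned on the same observations.
    """
    if weights is None:
        weights = [1.0] * len(prob_lists)
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, probs)) / total
        for probs in zip(*prob_lists)
    ]


def precision_recall_f1(y_true, y_prob, threshold=0.5):
    """Precision, recall, and F1 for the distressed class at a given threshold."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

In use, each of the three models would be fitted on the 2014 to 2019 rows only, their predicted probabilities for the 2022 to 2023 rows would be passed to `soft_vote`, and the averaged probabilities would then be scored with `precision_recall_f1`.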
The final model, deployed in an interactive dashboard, enables financial analysts to assess distress probabilities in real time. This work demonstrates how modern data science methods can improve credit risk assessment, especially for small and medium-sized enterprises in Central and Eastern Europe.
Supervisor Schindele Ibolya
Department Economics MSc
Full text https://www.etd.ceu.edu/2025/shaken_abylaikhan.pdf

© 2007-2025, Central European University