Title:
Comparative analysis of boosting algorithms for predicting personal default.
Authors:
Nguyen, Nhat (AUTHOR) nhatnm@hub.edu.vn; Ngo, Duy (AUTHOR) ngohoangkhanhduy.work@gmail.com
Source:
Cogent Economics & Finance. Dec2025, Vol. 13 Issue 1, p1-20. 20p.
Database:
Business Source Premier

Further Information

Accurately predicting personal default risk is crucial for financial institutions to manage credit risk effectively. This study compares the performance of four boosting algorithms (AdaBoost, XGBoost, LightGBM, and CatBoost) in predicting personal default. The dataset comprises 7,542 individual customers of Vietnamese commercial banks and financial institutions, collected between 2014 and 2022, with 12 features describing the borrowers' financial and demographic characteristics. All customer-related information was fully anonymized and encrypted during data collection to ensure compliance with research ethics. The predictive models are evaluated on six criteria: Accuracy, Precision, Sensitivity, Specificity, F1 score, and AUC. The results indicate that the LightGBM model performs best, demonstrating the ability to efficiently handle large and complex datasets. Additionally, the study identifies the five most significant factors influencing personal default risk: Monthly Liability, Credit Balance, Credit History Length, Max Credit Limit, and Yearly Income. However, the limited size and scope of the dataset may reduce the generalizability of the results to other regions. These findings provide insights that can help financial institutions strengthen their credit risk management strategies.

IMPACT STATEMENT: In credit granting, forecasting the probability of customer default plays an important role: it helps banks and financial institutions control credit risk and build loan portfolios more efficiently. This study evaluates the performance of four prominent boosting algorithms (AdaBoost, XGBoost, LightGBM, and CatBoost) on a dataset of individual customers in the Vietnamese market. The results indicate that LightGBM outperforms the other algorithms in predicting the probability of default. Additionally, the study identifies the five most important features affecting credit risk: Monthly Liability, Credit Balance, Credit History Length, Max Credit Limit, and Yearly Income. This research offers practical insights for banks and financial institutions developing and refining credit rating models, helping them minimize risk and make more effective lending decisions. In addition, evidence that boosting algorithms are effective in emerging financial markets such as Vietnam may encourage wider adoption of advanced machine learning tools in the lending process, fostering transparency and benefiting both financial institutions and borrowers. [ABSTRACT FROM AUTHOR]
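
For readers unfamiliar with the six evaluation criteria named in the abstract, the following is a minimal Python sketch of how they are commonly computed from true labels and predicted default probabilities. It is not the authors' code. The function names (roc_auc, evaluate), the coding of 1 as "default", the 0.5 decision threshold, and the toy data are all assumptions made for illustration; none of them come from the study.

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the share of (defaulter, non-defaulter) pairs in which the
    defaulter receives the higher score; tied pairs count as one half."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluate(y_true: Sequence[int], y_score: Sequence[float],
             threshold: float = 0.5) -> Dict[str, float]:
    """Compute the six criteria; 1 = default, 0 = no default (assumed coding)."""
    # Turn predicted probabilities into hard labels at the assumed threshold.
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    precision = _ratio(tp, tp + fp)
    sensitivity = _ratio(tp, tp + fn)  # also called recall
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": _ratio(tn, tn + fp),
        "f1": _ratio(2 * precision * sensitivity, precision + sensitivity),
        "auc": roc_auc(y_true, y_score),
    }


# Toy data, invented purely for illustration (not from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
metrics = evaluate(y_true, y_score)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Of the six criteria, only AUC is independent of the decision threshold, which is one reason it is often reported alongside threshold-based measures such as Sensitivity and Specificity when classes are imbalanced, as default data often are.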

Copyright of Cogent Economics & Finance is the property of Taylor & Francis Ltd and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)