Estimating the Probability of Default on Loans Granted by Bank Melli: A Comparison of Machine Learning and Econometric Approaches

Taleblou, Reza; Kamali, Mirali; Mohajeri, Parisa

doi:10.22054/ijer.2025.84878.1350

Article type: Research article

Authors

¹ Associate Professor, Department of Theoretical Economics, Allameh Tabataba'i University, Tehran, Iran

² PhD Student in Economics, Semnan University, Semnan, Iran


Abstract

This study examines 56,965 loans granted between 1398 and 1403 (Solar Hijri calendar) by the north Tehran branches of Bank Melli Iran in order to estimate the probability of loan default. Three models were used to predict customers' credit behavior: logistic regression, random forest, and extreme gradient boosting (XGBoost). The inputs comprised 29 variables in three main groups: loan contract characteristics (amount, repayment period, collateral type, etc.), borrower characteristics (age, occupation, credit history, etc.), and branch characteristics (province, branch type, etc.). Preprocessing included removing outliers, categorizing text fields, and deriving age and grace period from the available data. The models were evaluated in two configurations: baseline and optimized (with hyperparameter tuning). The results show that the machine learning models outperform the traditional method. ROC-AUC was estimated at 99.73 percent for extreme gradient boosting and 99.68 percent for random forest, whereas it was only 75.34 percent for logistic regression. The mean AUC difference between the machine learning models and logistic regression was about 0.243, and in all cases statistical tests and 95 percent confidence intervals confirmed that this difference is significant. The findings support the reliable superiority of machine learning methods in predicting loan default.
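
The AUC comparison described in the abstract can be illustrated with a minimal sketch. It computes ROC-AUC from its rank interpretation (the probability that a randomly chosen defaulted loan receives a higher score than a randomly chosen non-defaulted one) and then builds a 95 percent confidence interval for the AUC difference between two models. The abstract does not say which interval method the authors used, so the percentile bootstrap here is an assumption, as are the function names `roc_auc` and `bootstrap_auc_difference` and the toy labels and scores, which are invented for illustration and are not the study's data.

```python
import random


def roc_auc(labels, scores):
    """ROC-AUC as P(score of a defaulter > score of a non-defaulter); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_difference(labels, scores_a, scores_b, n_boot=1000, seed=0):
    """Percentile-bootstrap 95% interval for AUC(model A) - AUC(model B).

    Both models are scored on the same resampled loans (paired bootstrap).
    Resamples that contain only one class are skipped and redrawn.
    """
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue
        a = [scores_a[i] for i in idx]
        b = [scores_b[i] for i in idx]
        diffs.append(roc_auc(y, a) - roc_auc(y, b))
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    return lower, upper


# Hypothetical toy data: 1 = default, 0 = repaid.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores_strong = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]   # separates the classes
scores_weak = [0.1, 0.4, 0.35, 0.8, 0.3, 0.5, 0.7, 0.9]    # overlaps the classes

auc_strong = roc_auc(labels, scores_strong)   # 1.0
auc_weak = roc_auc(labels, scores_weak)       # 0.6875
ci_low, ci_high = bootstrap_auc_difference(labels, scores_strong, scores_weak)
print(auc_strong, auc_weak, (ci_low, ci_high))
```

If the whole interval lies above zero, the difference would be read as significant at the 5 percent level under this method. The pairwise AUC loop is quadratic in the sample size, so on a dataset the size of the study's a rank-based or library implementation would be the practical choice.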

Keywords

Subjects

Financial Economics


Iranian Journal of Economic Research

Volume 30, Issue 103 - Serial Number 103
Tir 1404 (Solar Hijri)
Pages 1-41
