Template-Type: ReDIF-Paper 1.0
Author-Name: Denis Shibitov
Author-Email: ShibitovDS@cbr.ru
Author-Workplace-Name: Bank of Russia, Russian Federation
Author-Name: Mariam Mamedli
Author-Email: MamedliMO@cbr.ru
Author-Workplace-Name: Bank of Russia, Russian Federation
Title: The finer points of model comparison in machine learning: forecasting based on Russian banks’ data
Abstract: We evaluate the forecasting ability of machine learning models to predict bank license withdrawal and the violation of statutory capital and liquidity requirements (capital adequacy ratio N1.0, common equity Tier 1 adequacy ratio N1.1, Tier 1 capital adequacy ratio N1.2, N2 instant and N3 current liquidity). On the basis of 35 series from the accounting reports of Russian banks, we form two data sets of 69 and 721 variables and use them to build random forest and gradient boosting models along with neural networks and a stacking model for different forecasting horizons (1, 2, 3, 6, 9 months). Based on the data from February 2014 to October 2018, we show that these models with fine-tuned architectures can successfully compete with logistic regression, which is usually applied to this task. Stacking and random forest generally have the best forecasting performance compared with the other models. We evaluate models with commonly used performance metrics (ROC-AUC and F1) and show that, depending on the task, the F1-score could be better at defining the model’s performance. Comparison of the results depending on the metrics applied and the types of cross-validation used illustrates the importance of choosing the appropriate metric for performance evaluation and a cross-validation procedure that accounts for the characteristics of the data set and the task under consideration. The developed approach shows the advantages of non-linear methods for bank regulation tasks and provides guidelines for the application of machine learning algorithms to these tasks.
Length: 50 pages
Creation-Date: 2019-08
Revision-Date:
Publication-Status:
File-URL: http://cbr.ru/Content/Document/File/87572/wp43_e.pdf
File-Format: Application/pdf
File-Function:
Number: wps43
Classification-JEL: C45, C53, C52, C5.
Keywords: machine learning, random forest, neural networks, gradient boosting, forecasting, bank supervision
Handle: RePEc:bkr:wpaper:wps43