In the modern financial landscape, there is increasing interest in using unconventional data sources to predict the default of Small and Medium Enterprises (SMEs). SMEs play a pivotal role in the EU economy, constituting 99.8% of all enterprises in the EU-28 non-financial business sector (European Commission, 2019). So far, credit scoring models have been built using only balance sheet indicators, although it has been demonstrated that smaller firms do not always comply with financial transparency requirements. In this work, we explore the use of corporate websites as a further source of information for detecting SME default (Crosato, Domenech, & Liberati, 2021). Working with unconventional data raises several challenges in terms of data retrieval, selection and cleaning. On the other hand, it helps to overcome the loss of timeliness of balance sheet indicators, which are generally published with a delay of two years with respect to their reference time. Our study relies on a sample of about 900 Spanish SMEs whose websites were scraped to obtain online indicators based on the HTML code, as in Blazquez and Domenech (2018). The online indicators, merged with balance sheet data, were then analyzed using Machine Learning algorithms (such as Random Forest and LS-SVM) and Logistic Regression (as a benchmark), with both sets of information as input variables. Results show an improvement in misclassification rate when the accounting information is augmented with website features. We also provide insights into the characteristics of company websites that are related to firms' financial health. Finally, we show that online indicators are able to predict the future status of some atypical firms whose accounting indicators would have led to the wrong prediction.
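As a rough illustration of what "online indicators based on the HTML code" could look like, the sketch below counts a few tag types in a page and merges them with balance sheet figures. The indicator names (`links`, `images`, `scripts`, `forms`, `mobile_ready`, `html_length`), the `web_` prefix, and the helper functions are all hypothetical; they are not the indicators defined in Blazquez and Domenech (2018), and the classifiers themselves are not reproduced here. Only the Python standard library is used.

```python
from html.parser import HTMLParser


class IndicatorParser(HTMLParser):
    """Collects simple tag counts from a page (hypothetical indicators)."""

    def __init__(self):
        super().__init__()
        self.counts = {"links": 0, "images": 0, "scripts": 0, "forms": 0}
        self.has_viewport_meta = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.counts["links"] += 1  # only anchors that point somewhere
        elif tag == "img":
            self.counts["images"] += 1
        elif tag == "script":
            self.counts["scripts"] += 1
        elif tag == "form":
            self.counts["forms"] += 1
        elif tag == "meta" and (attrs.get("name") or "").lower() == "viewport":
            self.has_viewport_meta = True  # crude proxy for a mobile-ready site


def html_indicators(html):
    """Return a dict of illustrative website features for one HTML document."""
    parser = IndicatorParser()
    parser.feed(html)
    parser.close()
    features = dict(parser.counts)
    features["mobile_ready"] = int(parser.has_viewport_meta)
    features["html_length"] = len(html)
    return features


def merge_features(balance_sheet, online):
    """Combine accounting and website features into one input record."""
    return {**balance_sheet, **{f"web_{k}": v for k, v in online.items()}}


def misclassification_rate(y_true, y_pred):
    """Share of firms whose predicted status differs from the observed one."""
    errors = sum(t != p for t, p in zip(y_true, y_pred))
    return errors / len(y_true)


sample_html = (
    '<html><head><meta name="viewport" content="width=device-width">'
    '<script src="a.js"></script></head><body>'
    '<a href="/contact">Contact</a><a>no href</a>'
    '<img src="logo.png"><form></form></body></html>'
)
# The balance sheet values below are made up for the example.
record = merge_features({"leverage": 0.62, "roa": 0.04}, html_indicators(sample_html))
```

A record such as `record` could then be fed, alongside records for other firms, to any classifier, and `misclassification_rate` compared between a model trained on accounting features only and one trained on the merged features.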

Crosato, L., Domenech, J., Liberati, C. (2023). Improving SMEs default prediction: combining conventional and unconventional data. In Games & Decisions in Risk & Reliability 7. Provisional Book of Abstracts.

Improving SMEs default prediction: combining conventional and unconventional data.

Liberati, C
2023

Type: abstract + slide
Keywords: Default Prediction, SME, Kernel Discriminant, Website Data
Language: English
Conference: Workshop on Games and Decisions in Reliability and Risk (GDRR), 2023
Published in: Games & Decisions in Risk & Reliability 7. Provisional Book of Abstracts, 2023
URL: https://www.icmat.es/congresos/2023/GDRR/Book_of_Abstracts_GDRR.pdf
Use this identifier to cite or link to this document: https://hdl.handle.net/10281/417584