1. Introduction
Predicting business failure represents a topic of paramount significance and has garnered increasing attention in recent decades. According to the argument posited by Borchert et al. (2023) [
1], the prognostication of business failure offers researchers and stakeholders a highly pertinent tool for assessing a company's fiscal well-being. In concordance with Pereira et al. (2012) [
2], the financial health of enterprises has evolved into one of the most pressing societal concerns for economic agents. Consequently, many models for business failure prediction have been developed to distinguish between failed and non-failed entities. Therefore, this subject necessitates ongoing study and adaptation to the evolving economic milieu to attain ever more precise results.
In a succinct definition, business failure can be construed as the inability of a company to meet all its financial obligations, potentially culminating in bankruptcy. Business failure prediction models are devised to guide companies toward recognizing impending failure through indicators, facilitating proactive measures.
Numerous studies have been undertaken to engender novel models for the prediction of business failure and enhance each model's predictive capacity. The seminal work of Beaver (1966) [
3] is noteworthy as the pioneering exploration of business failure prediction models employing financial ratios for model estimation. Subsequently, further models have been introduced, such as the multiple discriminant analysis models by Altman (1968) [
4], and Deakin (1972) [
5], the logit model proposed by Ohlson (1980) [
6], probit by Hoetker (2007) [
7], and more recently, artificial intelligence-based models, including neural networks by Altman et al. (1994) [
8] and Neves & Vieira (2006) [
9], decision trees as advocated by Gepp et al. (2010) [
10] and Pereira et al. (2010) [
11], and support vector machines, as presented by Alaka et al. (2018) [
12] and Shetty et al. (2022) [
13], in addition to the genetic algorithm proposed by Gordini (2014) [
14].
Similarly, the transport sector assumes a position of paramount importance and indispensability within human needs, facilitating the transboundary movement of individuals and goods. Enterprises within the freight transport sector hold substantial weight in the business landscapes of Portugal, Spain, France, and Italy. According to data from the Bank of Portugal, in 2021, there were 21,497 transport companies in Portugal, encompassing road transport, maritime transport, and air transport entities.
Given the significance of the transport sector, this research aims to juxtapose statistical business failure models with artificial intelligence-based models within the transport sector during the temporal scope of 2014 to 2021, spanning the countries of Portugal, Spain, France, and Italy. The selection of this timeframe is motivated by the emergence of the COVID-19 pandemic in 2020, which has exerted a significant impact on all businesses, resulting in decreased turnover and corporate profits. Consequently, this timeframe enables the assessment of the pandemic's effect on the potential occurrence of organizational failure. The selection of these countries is underpinned by their pivotal roles in transnational trade through transportation channels, coupled with the substantial volume of freight traffic exchanged among them.
The principal objective of this research is to scrutinize and compare the efficacy of statistical business failure prediction models vis-à-vis artificial intelligence-based models within the transport sector to ascertain the superior model. In alignment with the perspective articulated by J. Pereira et al. (2014) [
15], artificial intelligence techniques employed for business failure prediction are deemed more efficient and exhibit a higher accuracy rate. Thus, this study endeavors to evaluate whether, indeed, artificial intelligence-based models manifest superior precision.
As the prediction of business success stands as a pivotal factor for organizations, particularly in the wake of the global financial crisis of 2008, as underscored by Shi & Li (2019) [
16], the relevance of this study is manifest. Furthermore, it offers an opportunity to deepen the knowledge base established by previous research endeavors on this subject matter.
Concerning the research inquiries, the study addresses the following: i) What are the most salient indicators for predicting business failure? ii) Do artificial intelligence-based business failure prediction models surpass statistical models in performance? This research engaged a sample of 4,866 companies and a set of 15 financial ratios employed for predictive modeling. All data utilized in this study were sourced from the Bureau Van Dijk - ORBIS database.
2. Literature review
The concept of failure, specifically business failure, has been the subject of diverse definitions and scholarly exploration due to the ongoing concern with prognosticating organizational demise. Grounded in the insights of Walsh & Cunningham (2016) [
17], the inception of research into business failure dates back to the 19th century, but it has witnessed intensified and comprehensive investigation in recent years. Beaver (1966) [
3] defines failure as an organization's inability to fulfill its financial obligations, signifying its incapacity to meet creditor obligations, distribute dividends to shareholders, or avoid bankruptcy. Furthermore, Beaver et al. (2010) [
18] characterize failure as the non-fulfillment of financial obligations, encompassing financial difficulties, which entails the failure to meet obligations on time and, on the other hand, the organization's bankruptcy. Conversely, following the perspective advanced by Altman & Hotchkiss (2006) [
19], a company can find itself in a state of economic failure for an extended period without defaulting on its obligations. In this view, the quality of managerial stewardship by the board of directors emerges as the primary determinant of failure. In concordance with Amankwah-Amoah et al. (2022) [
20], business failure materializes when a company cannot recover from a period of decline, ultimately leading to its collapse. In a more general sense, business failure can be delineated as the cessation of operations due to an inability to adapt to external changes (Amankwah-Amoah & Wang, 2019 [
21]). However, as J. M. Pereira et al. (2007) [
22] noted, it is essential to recognize that no singular concept of business failure exists, encompassing a spectrum ranging from legal bankruptcy to insolvency, suspension of payments, or persistent financial losses.
In alignment with the diversity of failure definitions, an array of models and methodologies exists for predicting business failure, commencing with the seminal univariate analysis model introduced by Beaver (1966) [
3] and extending to the multivariate discriminant analysis, logit, and probit models, as expounded by J. M. Pereira et al. (2007) [
22]. Moreover, models rooted in artificial intelligence techniques have emerged in recent years. Corroborating the assertion of Korol & Spyridou (2020) [
23], establishing a financial early warning system assumes pivotal importance for organizations, leveraging reasonable forecasting models that empower stakeholders to evaluate financial risks capable of shaping organizational success or failure. It is imperative to recognize that each of the discussed models has advantages and drawbacks. Consequently, there is no universally superior technique, and selecting an appropriate model hinges on individual circumstances and one's conception of business failure, in alignment with J. Pereira et al. (2014) [
15].
As noted by J. M. Pereira et al. (2016) [
24], the development of business failure models gained momentum in the 1960s, catalyzed by the imperative to scrutinize business failure and the ensuing economic and social ramifications for all stakeholders. The initial techniques employed encompassed statistical methodologies applied to companies' financial data, involving utilizing a set of financial ratios [
25]. Pioneering contributions, such as those of Beaver (1966) [
3] and Altman (1968) [
4], played pivotal roles in this analytical domain. Regarding the most frequently employed technique for predicting business failure, multiple discriminant analysis is the prevailing approach in the corpus of analyzed studies (Alaka et al., 2018 [
12]). [
26,
27]
Statistical techniques have historically served as the prevailing method for predicting business failure. These techniques, founded on a predefined threshold, classify companies as failures if their scores fall below the specified threshold or as non-failures if they do not. Nevertheless, as posited by Ooghe & Spaenjers (2009) [
28], statistical techniques may yield specific errors, imposing costs on companies, such as type I errors (where failed companies are erroneously classified as non-failed) or type II errors (the converse situation). J. M. Pereira et al. (2016) [
24] note that initial research sought to ascertain whether datasets contained sufficient information to forecast impending insolvency or aimed to discern the most effective predictive models. Subsequently, with more comprehensive models, such as artificial intelligence-based techniques, a quest for enhanced accuracy and reduced error rates in business failure prediction emerged.
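The threshold rule and the two error types described above can be illustrated with a minimal sketch. The scores, the cut-off of 2.0, and the function names are hypothetical and are not taken from any of the cited models; the class coding (0 for failed, 1 for non-failed) is chosen only for the illustration.

```python
from typing import List, Tuple

FAILED, NON_FAILED = 0, 1  # illustrative class coding, assumed for this sketch


def classify_by_threshold(scores: List[float], threshold: float) -> List[int]:
    """Label a company as failed when its score falls below the threshold."""
    return [FAILED if s < threshold else NON_FAILED for s in scores]


def error_counts(actual: List[int], predicted: List[int]) -> Tuple[int, int]:
    """Return (type_i, type_ii) following the convention used in the text.

    Type I: a failed company is classified as non-failed.
    Type II: a non-failed company is classified as failed.
    """
    type_i = sum(1 for a, p in zip(actual, predicted)
                 if a == FAILED and p == NON_FAILED)
    type_ii = sum(1 for a, p in zip(actual, predicted)
                  if a == NON_FAILED and p == FAILED)
    return type_i, type_ii


# Hypothetical discriminant scores and true outcomes for six companies.
scores = [0.8, 1.5, 2.9, 3.4, 1.9, 2.2]
actual = [FAILED, FAILED, NON_FAILED, NON_FAILED, NON_FAILED, FAILED]

predicted = classify_by_threshold(scores, threshold=2.0)
type_i, type_ii = error_counts(actual, predicted)
print(predicted, type_i, type_ii)
```

Moving the threshold trades one error type for the other, which is why the relative cost of each error matters when a cut-off is chosen.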
Following Yeh et al. (2010) [
29], the limitations of early business failure prediction models are twofold. Firstly, these models exclusively relied on financial ratios as independent variables. Secondly, they overlooked the significance of a company's managerial effectiveness as a key variable for classifying failures. While widely adopted, statistical techniques often entail restrictive assumptions such as linearity, normality, and independence among variables (Wu, 2010 [
30]). To circumvent these limitations, alternative techniques rooted in artificial intelligence have been introduced.
According to Shetty et al. (2022) [
13], the 1990s ushered in a new phase in the evolution of business failure prediction models, introducing innovative methods, particularly artificial intelligence algorithms, including neural networks (Špiler et al., 2022 [
31]) and decision trees. These artificial intelligence-based techniques offer promising alternatives to traditional statistical models, addressing their principal shortcomings. [
32]
Neural Networks (NN), as defined by Neves & Vieira (2006) [
9], represent a prominent artificial intelligence-based method for business failure prediction. These networks, inspired by the architecture of the human brain, can learn directly from examples without prior knowledge of specific problems. A neural network comprises interconnected processing units, each with a calculation function [
33]. The learning process involves iterative adjustments to minimize errors until the network attains equilibrium, resolving the problem (Altman et al., 1994 [
8]).
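As a minimal sketch of "interconnected processing units, each with a calculation function", the following shows only the forward computation of a tiny network with one hidden layer and a two-class softmax output. The weights, layer sizes, and ReLU activation are illustrative assumptions; the iterative weight adjustment (training) is deliberately left out.

```python
import math
from typing import Callable, List


def relu(x: float) -> float:
    """Rectified linear activation, used here as an assumed unit function."""
    return max(0.0, x)


def identity(x: float) -> float:
    return x


def dense(inputs: List[float], weights: List[List[float]],
          biases: List[float], activation: Callable[[float], float]) -> List[float]:
    """One layer: every unit applies its activation to a weighted sum plus bias."""
    return [activation(sum(w * x for w, x in zip(unit_w, inputs)) + b)
            for unit_w, b in zip(weights, biases)]


def softmax(values: List[float]) -> List[float]:
    """Turn raw output scores into class probabilities that sum to one."""
    m = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Two hypothetical standardized ratios for a single company.
inputs = [0.5, -1.0]

# Hypothetical weights: 3 hidden units, then 2 output units.
hidden = dense(inputs, [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, 0.0], relu)
logits = dense(hidden, [[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]], [0.0, 0.0], identity)
probs = softmax(logits)
print(hidden, logits, probs)
```

Training would repeatedly adjust the weights and biases to reduce the prediction error on labelled examples, which is the learning process the paragraph refers to.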
Neural networks offer several advantages over statistical models. They remove the need for a pre-established functional relationship among variables, as they acquire knowledge directly through the learning process. Furthermore, the collective behavior of multiple units, rather than individual units, contributes to their efficiency (Altman et al., 1994 [
8]). Despite their advantages, neural networks are relatively slow learners and may yield complex, challenging-to-interpret results (Altman et al., 1994 [
8]). Understanding the final rules neural networks acquire can also be challenging (K.-S. Shin & Lee, 2002 [
34]). Numerous studies have compared the performance of neural networks with statistical techniques, revealing that neural networks may exhibit superior predictive capacity in specific scenarios (Altman et al., 1994 [
8]; Gámez et al., 2016 [
35]). It has been suggested that combining neural networks with multiple discriminant analyses may yield more accurate and comprehensible results (Altman et al., 1994 [
8]). Noh (2023) [
36] compared the bankruptcy prediction performance of the LSTM (long short-term memory), LR (logistic regression), k-NN (k-nearest neighbors), DT (decision tree), and RF (random forest) models. According to the author, the results of this study provide useful information for selecting a suitable bankruptcy prediction model when the dataset has relatively few bankrupt companies.
Decision Trees (DT) constitute another machine-learning technique for business failure prediction. According to J. M. Pereira et al. (2007) [
22], decision trees map a hierarchy of classes or values based on conditional logic rules, leading to a classification. The main goal is to uncover relationships and dependencies between the variables, usually presented in the form of rules (e.g., "X → Y") (J. M. Pereira et al., 2010 [
11]).
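To make the idea of conditional "X → Y" rules concrete, the sketch below writes a small decision tree by hand as nested conditions. The ratio names echo indicators used later in this study, but the thresholds and the tree structure are purely hypothetical and were not learned from the data.

```python
from typing import Dict


def classify(company: Dict[str, float]) -> str:
    """Walk a hand-written two-level tree; each path is one 'X -> Y' rule."""
    if company["solvency_ratio"] < 10.0:
        # Rule: low solvency and low liquidity -> failed
        if company["current_ratio"] < 1.0:
            return "failed"
        # Rule: low solvency but adequate liquidity -> non-failed
        return "non-failed"
    # Rule: adequate solvency but strongly negative margin -> failed
    if company["profit_margin"] < -5.0:
        return "failed"
    # Rule: adequate solvency and acceptable margin -> non-failed
    return "non-failed"


example = {"solvency_ratio": 5.0, "current_ratio": 0.8, "profit_margin": 2.0}
print(classify(example))
```

A fitted tree has the same shape; the difference is that the splitting variables and thresholds are chosen by the learning algorithm rather than by hand.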
Decision trees offer versatility and a high comprehension rate, facilitating the identification of critical factors for accurate company classification (J. M. Pereira et al., 2007 [
22]). Moreover, they do not necessitate the transformation of variables or the imposition of constraints and can incorporate the costs of incorrect classification, ultimately reducing financial burdens (Gepp et al., 2010 [
10]).
However, decision trees have some drawbacks, including the arbitrary assignment of prior probabilities, rendering them less precise than statistical models. Additionally, they merely indicate the relative importance of variables, unlike statistical models that provide detailed significance levels (Gepp et al., 2010 [
10]). Challenges include creating decision trees, which can be time-consuming, difficulty handling incomplete information, and the possibility of unexpected values (J. M. Pereira et al., 2007 [
22]).
Support Vector Machines (SVM) are another artificial intelligence-based method for business failure prediction. These machines employ a linear model to create an optimal separator for binary classification. The observations closest to this separator, referred to as support vectors, define the outcome, classifying companies as failures or non-failures (Alaka et al., 2018 [
12]). SVMs excel in minimizing structural risk, offer a higher predictive capacity, and are adept at handling overlapping data (Yeh et al., 2010 [
29]; Shetty et al., 2022 [
13]).
SVMs are lauded for their precision and stability in predicting business failure. Their simplicity facilitates integration with traditional statistical techniques, leveraging the strengths of both approaches (Min & Lee, 2005 [
37]). Kernel-based support vector machines (K-SVM) enhance classification accuracy and outperform other prediction models (Shaw & Routray, 2017 [
38]). [
39]
Genetic Algorithms (GA), a stochastic search technique inspired by natural genetics and evolution, also contribute to business failure prediction. GAs transform complex problems into simpler ones that can be treated as discriminant functions (Holland, 1975, as cited by Gordini, 2014 [
14]). GAs excel in optimizing objective functions subject to rigid and flexible constraints and can explore non-linear solution spaces without prior information about the model (K.-S. Shin & Lee, 2002 [
34]; C.-H. Wu et al., 2007 [
40]). GAs entail four key stages: initialization, selection, crossover, and mutation. In the initialization phase, a population of genetic structures, or chromosomes, is distributed in the solution space. The best-performing chromosomes are selected and copied to the next generation, gradually occupying a more significant portion of.
3. Methodology
The principal aim of this research project is to undertake a comparative analysis of the performance of statistical business failure prediction models against artificial intelligence-based prediction models. The methodological framework is structured around two primary approaches: firstly, statistical-based methods, specifically logistic regression (LR), and secondly, artificial intelligence-based methods, including neural networks (NN), support vector machines (SVM), and decision trees (DT).
3.1. Why the transport sector?
The transport sector holds a paramount role in the daily lives of individuals, facilitating the movement of people, goods, and services between nations. It serves as a fundamental component of European businesses and global supply chains [
41]. Moreover, it is instrumental for international and national trade, contributing to economic development by ensuring efficient, safe, and cost-effective goods transport [
42]. As part of the national economy, this sector constitutes a substantial portion of the GDP [
43]. As noted by the European Commission [
44], approximately 5% of the European Union's GDP is attributed to the transport sector, employing over 10 million individuals across Europe. In a country like Portugal, the transport sector employed around 136,000 people in 2021, with a majority engaged in road haulage. [
45,
46]
The sector's significance is further accentuated by its contribution to economic growth. As the economy expands, the transport sector also prospers. It is a catalyst for economic development and exhibits a pro-cyclical nature [
43].
The continuous growth of freight transport within the European Union is propelled by expanding global trade and economic practices like economies of scale and just-in-time deliveries [
42]. Nevertheless, the sector confronts pressing environmental concerns, accounting for a substantial share of greenhouse gas emissions in the European Union. The European Green Deal has set the ambitious target of reducing emissions by 90% [
44]. This necessitates the adoption of sustainable and environmentally friendly transportation methods. [
47,
48]
The transport sector confronts several challenges, including noise, road accidents, congestion, and fluctuating fuel prices due to geopolitical events [
41]. These challenges pose potential threats to the sector's development. In conclusion, the transport sector encompasses land, maritime, and air transport, which are pivotal for exchanging goods between countries. While land transport is dominant in cross-border transport, it is also the most polluting mode. On the other hand, rail and maritime transport offer more environmentally friendly alternatives. [
49]
The dynamic nature of the transport sector is shaped by global economic integration and evolving market dynamics. However, the sector faces numerous challenges, particularly regarding congestion, governance, and environmental concerns. The choice of transportation mode is influenced by factors such as cost, environmental impact, and safety. A financially robust transportation sector plays a pivotal role in fostering economic growth, societal well-being, and sustainability. Its vitality is a linchpin of economic prosperity, enabling the efficient flow of goods and services, enhancing trade, and facilitating access to essential resources. This, in turn, bolsters economic development, creates employment opportunities, and fortifies the overall quality of life for a society's inhabitants. A robust transportation sector also often contributes to adopting sustainable practices, reducing environmental impacts and promoting long-term resilience. A financially sound transport sector is a cornerstone of economic, social, and environmental progress.
3.2. Data selection and preparation
The empirical investigation in this research centers on predicting a company's success or failure through a supervised learning framework. This is accomplished by constructing a dataset comprising various financial indicators of companies and their corresponding success/failure status. These indicators serve as the independent explanatory variables in building predictive models. These models are then evaluated for their ability to predict success or failure on previously unseen data.
The predictive performance of each model is assessed through commonly used classification metrics in machine learning, which include precision, recall, F1 score, and accuracy. It's essential to note that all data is sourced from the Bureau Van Dijk - ORBIS database and encompasses accounting data from companies from 2014 to 2021.
Without a theoretical framework guiding the selection of explanatory variables for business failure, this empirical research adopts financial indicators as the independent variables. These financial ratios were chosen based on their prevalence and significance in prior studies. Financial ratios are selected for modeling due to their capacity to reveal a company's efficiency and allow for comparing relationships between various accounting figures. Furthermore, analyzing a company's financial information is crucial for assessing its financial health and prospects for sustainability. Consequently, these ratios play a pivotal role in predicting the success or failure of organizations. [
4,
50,
51]
A diverse range of financial ratios is available for business failure analysis. These can be grouped into categories such as return on investment, financial leverage, capital turnover, short-term liquidity, cash position, inventory turnover, and accounts receivable turnover [
50]. However, this research focuses on financial structure ratios, profitability, solvency, liquidity, and indebtedness, as these are the most commonly used and of significant interest to researchers [
51]. In this particular dataset, 15 financial indicators were chosen as independent variables: ROE using P/L before tax; ROA using P/L before tax; ROE using Net income; ROA using Net income; Profit margin; EBITDA margin; EBIT margin; Cash flow / Operating revenue; Net assets turnover; Current ratio; Liquidity ratio; Solvency ratio (Asset based); Gearing; Costs of employees / Operating revenue. The selection of the independent variables was based on the literature (Bellovary, Giacomino, and Akers, 2007 [
52]; Kušter et al. 2023 [
53]).
The dataset includes a total of 5,854 companies from four different countries: Italy (1,580 companies), Spain (1,949 companies), Portugal (1,216 companies), and France (1,109 companies). Within the dataset, 3,869 companies are classified as successful (66%), while 1,985 companies are categorized as failed (34%). The independent variables consist of financial indicators and the country of origin. Although the country of origin was initially considered an independent variable, its inclusion had minimal impact on results, leading to its removal.
The dataset structure allows for the construction of discriminant functions for both the statistical models and artificial intelligence-based models. These functions facilitate the assessment of whether the companies are failures or not, and enable a comparison of predictive capabilities between the two model categories.
The initial dataset exhibited an overall completeness rate of approximately 95.9%. To maximize the number of usable data points and standardize the dataset, one financial indicator, the solvency ratio based on liabilities, was removed from the analysis, as it was the indicator most often missing.
Subsequently, the final dataset comprises 4,866 companies, with 592 from Italy, 1,949 from Spain, 1,216 from Portugal, and 1,109 from France. The sample is further divided into 1,985 failed companies and 2,881 non-failed companies. To facilitate model development and evaluation, the dataset was randomly split into training and test sets, with an 80%-20% stratified division, maintaining the ratio of successful to failed companies in both sets. The same training/testing split was consistently applied to all models developed. Finally, the dataset was normalized through standardization, subtracting the mean and dividing by the standard deviation for each independent variable within the training set.
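The stratified 80%-20% split and the standardization step can be sketched as follows, using only the class counts reported above (1,985 failed and 2,881 non-failed companies). The feature values are synthetic placeholders, the class coding (0 for failed, 1 for non-failed) and the function names are hypothetical, and the software actually used in the study is not specified in the text. The mean and standard deviation are estimated on the training set only and then applied to both sets.

```python
import random
import statistics
from typing import List, Sequence, Tuple


def stratified_split(labels: Sequence[int], test_fraction: float = 0.2,
                     seed: int = 0) -> Tuple[List[int], List[int]]:
    """Return (train_idx, test_idx), drawing test_fraction of each class."""
    rng = random.Random(seed)
    train_idx: List[int] = []
    test_idx: List[int] = []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def fit_standardizer(rows: List[List[float]]) -> Tuple[List[float], List[float]]:
    """Estimate per-column mean and standard deviation from the given rows."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) or 1.0 for c in columns]  # guard constant columns
    return means, stds


def standardize(rows: List[List[float]], means: List[float],
                stds: List[float]) -> List[List[float]]:
    """Subtract the mean and divide by the standard deviation, column by column."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


# Class counts from the final dataset: 0 = failed, 1 = non-failed (assumed coding).
labels = [0] * 1985 + [1] * 2881
# Synthetic two-column feature matrix, for illustration only.
features = [[float(i % 7), float((i * 3) % 11)] for i in range(len(labels))]

train_idx, test_idx = stratified_split(labels, test_fraction=0.2, seed=0)
means, stds = fit_standardizer([features[i] for i in train_idx])
train_z = standardize([features[i] for i in train_idx], means, stds)
test_z = standardize([features[i] for i in test_idx], means, stds)
print(len(train_idx), len(test_idx))
```

Fitting the standardizer on the training rows alone keeps information about the test set out of model development, which is the usual reason for this design.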
4. Results presentation and discussion
The financial indicators selected above were used to build various machine learning models to predict the success or failure of companies and make it possible to compare statistical models with models based on artificial intelligence. The methods implemented were: Linear Support Vector Machines (L-SVM), Kernel Support Vector Machines (K-SVM), K-Nearest Neighbours, Logistic Regression (LR), Decision Trees (DT), Random Forests (RF), Extremely Random Forests (ERF), AdaBoost and Neural Networks (NN), using software.
For the global models, and similar to what was done in da Silva et al. (2023) [
54], we tried various combinations of parameters for each method and then used the best combination to build the rest of the models. In the case of the L-SVM model, we tried various weights, different losses and different numbers of iterations, and for the decision trees we tried different depths and different criteria for measuring the quality of a split at each node. As for the neural networks, we tried different numbers of hidden layers and different numbers of neurons in each layer [
54].
As mentioned, the models were trained and tested on companies in Italy, Spain, Portugal, and France. The results of the models chosen for each type of classifier are summarised in
Table 1 below.
The results presented in
Table 1 show the average and standard deviation of the metrics for 10 runs of each method. Most of the methods are fairly stable over different runs, with the exception of the neural networks, which have a standard deviation of more than 0.5%. This can be explained by the different random initialisation of the network parameters in the different runs. Even so, this is still reasonably stable behaviour.
The precision, recall and F-score metrics were calculated for each class. Thus, given the imbalance in the data set, i.e. the imbalance between non-failed and failed companies, the metrics were calculated separately for failed companies (C0) and non-failed companies (C1).
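The per-class metrics can be computed as in the following minimal sketch, which follows the standard definitions of precision, recall and F-score. The ten labels and predictions are made up for illustration and are not results from this study; class 0 stands for failed companies (C0) and class 1 for non-failed companies (C1).

```python
from typing import Dict, List


def per_class_metrics(actual: List[int], predicted: List[int], cls: int) -> Dict[str, float]:
    """Precision, recall and F-score, treating `cls` as the positive class."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == cls and p == cls)
    fp = sum(1 for a, p in zip(actual, predicted) if a != cls and p == cls)
    fn = sum(1 for a, p in zip(actual, predicted) if a == cls and p != cls)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f_score": f_score}


def accuracy(actual: List[int], predicted: List[int]) -> float:
    """Share of companies whose class was predicted correctly."""
    return sum(1 for a, p in zip(actual, predicted) if a == p) / len(actual)


# Hypothetical outcomes: 0 = failed (C0), 1 = non-failed (C1).
actual = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
predicted = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0]

c0 = per_class_metrics(actual, predicted, cls=0)
c1 = per_class_metrics(actual, predicted, cls=1)
acc = accuracy(actual, predicted)
print(c0, c1, acc)
```

Reporting the metrics per class shows whether a model's overall accuracy is carried mainly by the larger class.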
In line with the above, the L-SVMs were developed based on balanced class weights, with C = 10, 10,000 iterations and a hinge loss. The K-SVMs were trained with an RBF kernel, with C = 10, balanced class weights and a scaled kernel coefficient. The K-NN was developed based on 5 neighbours, the ball tree algorithm, uniform weights and the Manhattan metric. Logistic regression was trained using the L-BFGS solver, balanced class weights and no regularisation. The DTs were developed with a tree depth of 6 using all features, an entropy criterion and uniform class weights. The RF was developed with 1,000 estimators using log2 of the number of features for each estimator, balanced class weights and the Gini index criterion. The ERF was trained with 100 estimators using all the features, uniform class weights and the entropy criterion, as in the decision trees. AdaBoost was trained with 10 estimators and the SAMME.R algorithm using a learning rate of 0.5. Finally, the NNs were developed with 3 hidden layers and a softmax output, 100 neurons in the first layer and 10 neurons in the second and third layers, a sparse categorical cross entropy loss without class weights, the Adam optimiser at a rate of 10, a batch size of 64 and 100 training epochs.
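The two split criteria mentioned for the tree-based models, entropy and the Gini index, can be illustrated with a short sketch based on their standard definitions. The label lists and the candidate split are hypothetical, and the function names are chosen for this example only.

```python
import math
from collections import Counter
from typing import Callable, List


def gini(labels: List[int]) -> float:
    """Gini index: one minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels: List[int]) -> float:
    """Shannon entropy of the class proportions, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def weighted_impurity(left: List[int], right: List[int],
                      measure: Callable[[List[int]], float]) -> float:
    """Impurity after a split: child impurities weighted by child size."""
    n = len(left) + len(right)
    return len(left) / n * measure(left) + len(right) / n * measure(right)


# A hypothetical node with four failed (0) and four non-failed (1) companies.
parent = [0, 0, 0, 0, 1, 1, 1, 1]
# A hypothetical candidate split of that node.
left, right = [0, 0, 0, 1], [0, 1, 1, 1]

gini_gain = gini(parent) - weighted_impurity(left, right, gini)
entropy_gain = entropy(parent) - weighted_impurity(left, right, entropy)
print(gini_gain, entropy_gain)
```

At each node, a tree keeps the candidate split with the largest reduction in the chosen impurity measure.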
The results of applying these models to the 4,866 companies selected show that most of the methods achieved levels of precision and accuracy on the order of 71% to 73%, with the best precision being obtained by the ERF method.
For failed companies, represented in the table by the C0 metrics, the best F-score was also obtained by the ERF model, which showed the best balance between precision and recall. However, in this case the best precision was obtained using AdaBoost and the best recall was achieved using K-SVM.
In turn, for non-failed companies, represented by the C1 metrics in Table 1, the best F-score was achieved using AdaBoost, which also recorded the highest recall in this case. The best precision was achieved using the ERF.
So, to summarise, we can say that the most suitable models for this study are ERF and AdaBoost, as they give the best results in terms of both precision and recall.
5. Conclusions
The increasing focus on predicting business failure has made it a pertinent subject, potentially serving as a valuable tool for companies in forecasting bankruptcy situations (Shi & Li, 2019 [
16]). This study's significance lies in examining whether artificial intelligence-based models could offer a promising approach in bridging gaps within statistical models, thereby enhancing the accuracy of business failure predictions.
The principal objective of this study was to compare the efficacy of statistical business failure prediction models with artificial intelligence-based models within the transport sector over the time span from 2014 to 2021. The study aimed to ascertain which of these two approaches is more efficient. For this purpose, data on financial ratios, encompassing a total of 15 financial indicators, was obtained from the Bureau Van Dijk - ORBIS database. These indicators were instrumental in assessing the financial health of companies and constructing prediction models. The dataset comprised 4,866 companies distributed across four countries: Italy (592 companies), Spain (1,949 companies), Portugal (1,216 companies), and France (1,109 companies), with a mix of both failed (1,985 companies) and non-failed (2,881 companies) businesses.
Nine models were created using software, including Linear Support Vector Machines (L-SVM), Kernel Support Vector Machines (K-SVM), K-Nearest Neighbors, Logistic Regression (LR), Decision Trees (DT), Random Forests (RF), Extremely Random Forests (ERF), AdaBoost, and Neural Networks (NN). These models were applied to the financial data of the analyzed companies to evaluate their predictive capabilities.
The results revealed that most models exhibited high precision and accuracy, ranging from 71% to 73%, with the ERF model outperforming others in both predictive capacity and accuracy. It was also observed that artificial intelligence-based models outperformed statistical models in predicting business failure, with particular emphasis on the AdaBoost and ERF models.
In terms of the influence of the country of origin as a relevant variable for classifying companies, it was determined that it did not significantly impact the classification, as the results remained consistent. Regarding the study's limitations, it is confined to the transport sector and encompasses data from only four countries, although this dataset proved sufficient to fulfill the research objectives. Additionally, as this research deals with predictive models, it is essential to acknowledge the inherent uncertainty of predicting future events. Another limitation is the reliance on financial indicators provided by the companies themselves, which introduces the risk of data manipulation or alteration.
Future research endeavors in this field may explore a narrower set of statistical and artificial intelligence-based models for an in-depth comparative analysis. Alternatively, the study can delve into using additional financial data or conduct a detailed examination of financial ratios to detect signs of accounting manipulation before model testing and development. Another avenue for future research could involve assessing whether companies should be alerted to potential failure situations, enabling them to take preventive measures.