1. Introduction
Most fires are caused by humans. In rural and forest fires, natural causes of ignition and spread are rare, while direct or indirect human factors are far more prevalent [1,2]. Rural and forested areas, especially those intertwined or intermixed with human settlements, are therefore under the increasing influence of human activities. Where settlements lie close to natural and human-derived biomass, the main cause of forest, rural, and biomass fires is largely human. In the United States, fires caused by lightning strikes have been observed to increase by approximately 12% due to global warming [3]. Even with this increase, lightning remains far behind human causes, which account for approximately 80% of all fires [4]. Fires in areas where wildland and human settlements blend also endanger the people living there and their properties [5]. To minimize these bilateral risks, it is very important that city planning experts and related institutions take fire-related studies into account, because as cities expand towards natural areas, they also transform the land to provide the resources they need (water, energy, transportation, etc.), for example by opening roads, extending power lines, building dams, laying pipelines, and opening plantation areas.
In the risk assessment of rural, forest, and vegetation fires, the selected criteria generally include slope, distance to roads, elevation, distance to water bodies, aspect, type of vegetation, land use, population, climate features, distance to settlements, and fuel type [6,7,8,9]. Factors such as slope, elevation, aspect, distance to water bodies, and climate features are directly related to geography. Factors such as population, distance to settlements, distance to roads, and land use are directly human-induced, while vegetation, fuel type, and even climate features can be considered indirectly related to humans. Vegetation does not develop solely through natural processes; in many places it is destroyed by humans or its qualities are changed through agricultural and forestry activities, so it is under anthropogenic domination. Humans thus partly determine the type of fuel available to fires in rural and forested areas. The negative human impact on climate is not limited to the carbon released into the air by industrial activities. For example, urbanization is known to affect microclimate features where it is dense and spread over large areas, as does a dam built in a region, in proportion to the width of the dam. As urban areas expand, the natural water cycle changes in the areas they cover, and significant differences occur in the quality and quantity of sunlight reflected and absorbed by the land, and also in wind speed [10]. In addition, anthropogenic heat fluxes and air pollution lead first to local microclimate changes and then to permanent climate changes in the region. The effect of dams on the microclimate is very important in humid and semi-humid regions. Dams directly affect the amount of evaporation in the region, depending on heat, and they also indirectly affect the microclimate because they improve the vegetation on their shores [11].
The criteria commonly used in modeling studies on fire risk assessment for forest and rural area fires are given in Table 1, based on a review of relevant research [7,8,9,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34].
Rural areas and forests are known to be under significant encroachment and pressure due to rapid population growth, which has led to an increase in forest and rural area fires. Despite recent developments in firefighting technologies, more efficient prevention activities, reforestation and improvement efforts, and heightened awareness of fires, a report by the Food and Agriculture Organization of the United Nations observes that the reduction of forests has not been prevented [35]. These global forest losses should be attributed partly to fires and partly to the loss of forest qualities through land use. Regardless of the reason, however, the decrease in forests and vegetation means less absorption of sunlight and atmospheric carbon, which clearly contributes to global warming. Rising global warming brings more drought, particularly in the Northern Hemisphere areas close to the equator, and causes low-humidity conditions that make forests more susceptible to fire; once a fire starts, it spreads faster and becomes more difficult to extinguish. This increases the importance of risk assessment studies for taking precautions against fires, planning, and early detection.
Fire risk is defined as the probability of a fire spreading and the damage it will cause [8,9]. At the end of the 20th century, the first remote infrared scanners were used in detection applications to locate forest fires [36]. Today, the revisit interval and wide-area imaging capability of satellite data make it possible to gain important information about various fires. Technologies such as remote sensing and geographic information systems contribute significantly to fire prevention by providing capabilities for data collection, analysis, and mapping [8]. To model fire risk, it is necessary to identify the factors that influence the probability of fire [37]. Altitude, aspect, slope, weather conditions, vegetation, and human factors are the main factors affecting the occurrence and spread of fire in a region [38]. In addition, many criteria such as distance to roads, distance to settlements, and land use/cover are used in risk models [9].
In this study, the forest and rural areas of the European part of Istanbul and the settlements intertwined with these areas were chosen as the research field. In a significant part of the area within the city's boundaries, settlements are spread over rural areas and areas of agricultural activity. The city was selected for research because it has the highest population in the country and because of its geopolitical importance, history, natural resources, and economic importance. Istanbul is also the richest city in terms of academic studies. Its population of approximately 15 million, larger than that of many countries in the world, is among the most important factors motivating the examination of risks in the region. The population density is 2500 people per square kilometer [39]. Istanbul has been an important city throughout history. Although it later lost its capital status, it has never lost its character as a strategic city in terms of economic, political, and commercial importance. Rapid population growth, infrastructure needs, and emerging migration problems make this city a center of attraction in Turkey [40]. In recent years the city has responded to change by growing, but this growth has been uncontrolled. Urban growth trends have emerged, such as the conversion of fields into building clusters and the concentration of investment in the city center, but the growth could not be controlled and could not be planned for [41]. Istanbul is also one of the important financial centers of the world. Among the centers of Eastern Europe and Central Asia, it ranks 77th in The Global Financial Centers Index (GFCI) 32 ranking. With a Gross Domestic Product of 232 billion dollars, it is larger than many national economies. It houses half of the country's industrial assets. Its share in economic parameters such as the country's businesses and employment ranges from 30% to 50%. For these reasons, its economic importance gives Istanbul a high disaster vulnerability [42].
The historical importance of the city stems from its 2500-year-old heritage. Historical and cultural areas such as the Historic Peninsula, the Golden Horn, and the Bosphorus make the city very valuable. The city was chosen as the European Capital of Culture in 2010. Its significant place in world heritage is one of the factors directing the study to this area [43].
Istanbul has experienced many destructive disasters over time. Today the scientific community frequently mentions an expected earthquake, and studies on pre- and post-disaster situations are being conducted at city scale. This risk is one of the important reasons directing the study to this area, since fire is expected to occur along with an earthquake [44]. However, this study is concerned not only with urban fires but also with the fire risk created by the natural areas within and around the city.
In Turkey, fires tend to increase due to the effects of global warming and uncontrolled urbanization. The year 2021 went down in history as a very bad year for Turkey in terms of fire. Large fires have occurred in 11 provinces in the last 50 years. According to the records of the General Directorate of Forestry (OGM), 117,734 fires have occurred from 1937 to the present, and a very large forest area of 1,851,476 hectares has been lost. This corresponds to approximately 15.73 ha per fire. In the last 10 years, the four provinces with the most fires have been Muğla (2716), Antalya (2446), Izmir (1649), and Istanbul (1493), the subject of this study. These data are based on the OGM 2022 records [45].
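The per-fire average quoted from the OGM totals can be checked with a single division. The snippet below is only an illustrative sketch; the variable names are chosen here and are not part of the OGM records.

```python
# Totals quoted in the text from the OGM records (1937 to the present).
total_fires = 117_734
burned_area_ha = 1_851_476

# Mean burned area per recorded fire, in hectares.
mean_area_per_fire_ha = burned_area_ha / total_fires

print(f"{mean_area_per_fire_ha:.2f} ha per fire")  # prints: 15.73 ha per fire
```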
Fire risk maps of a region are created by combining the factors that can cause fire [18]. Methods used to create forest fire risk maps include geographic information systems (GIS), the analytic hierarchy process (AHP), fuzzy logic, goal programming (GP), artificial neural networks (ANN), random forest, logistic regression, and other machine learning methods [9].
This study uses machine learning, which has recently begun to be applied to fire forecasting, fire risk assessment, hazard susceptibility mapping, and fire behavior modeling [46,47,48]. Classification with machine learning has recently been used in many areas. Many studies in the literature show that machine learning methods solve classification problems efficiently, with high accuracy and minimal deviation [49,50,51,52,53,54].
Lu et al. [49] developed a classification model to predict stadium fire risk, using fire risk data from smart stadiums. They reported that the Gradient Boosting model performed best, with an F1 score of 81.9% and an accuracy of 93.2%.
Pang et al. [50] collected data on fire hotspots, meteorological conditions, land, vegetation, and socioeconomic conditions from different sources for their study on forest fire prediction. Using these data, they built an artificial neural network, a radial basis function network, a support-vector machine, and a random forest model to determine the thirteen main causes of forest fires in China. They reported that the prediction accuracies of the four models were between 75.8% and 89.2%, and that the AUC values were between 0.840 and 0.960.
Kalantar et al. [55] predicted forest fire susceptibility from remote sensing data, using resampling algorithms in machine learning models. They stated that the boosted regression tree model performed best, with the highest AUC value of 0.91. They also emphasized that resampling improved the prediction performance of all models.
3. Results and Discussion
This research presents a model of the probability of fire occurrence specific to the European side of Istanbul. The region has diverse land use, including rural-urban and natural-urban amalgamations as well as agricultural areas, farms, and cultivated fields, so predicting fire risk is essential. Because shifting the city northward is being considered, to distance it from the fault line passing through the Marmara Sea in view of the expected earthquake, incorporating fire risk into new settlement plans is crucial for developing sustainable residential areas.
The literature emphasizes that machine learning algorithms are quite successful because of their ability to learn and model from data, and that they typically yield better results than traditional statistical approaches [62]. In this study, the results obtained from the Random Forest (RF), Extreme Gradient Boosting (XGB), and Light Gradient Boosting (LGB) models are compared with the existing data. The quality parameters used in model validation are given in Table 6. In terms of Accuracy and AUC, which are among the most important validation parameters, the RF model gave the best result of the three models (Accuracy 0.9293, AUC 0.7528). The LGB model gave the worst Accuracy (0.9107) and the XGB model the worst AUC (0.7409), although the differences between the models are small. However, the RF model also has by far the lowest Recall (0.0346) and F1 score (0.0622), whereas the XGB and LGB models are close to each other on these two parameters.
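To make the relationship between these validation parameters concrete, the sketch below computes accuracy, precision, recall, and F1 from the four cells of a binary confusion matrix. The counts are hypothetical: they are chosen only to mimic a test set in which about 7% of the samples are fires (the mean of Fire Status in Table 3) and are not the results of this study. The function name `classification_metrics` is likewise an assumption made for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 for a binary confusion matrix."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical test set: 1000 samples, 70 of them fires (7%).
# (a) A model that never predicts "fire".
never_fire = classification_metrics(tp=0, fp=0, fn=70, tn=930)
# (b) A cautious model that flags only a handful of samples as "fire".
cautious = classification_metrics(tp=3, fp=4, fn=67, tn=926)

for name, metrics in (("never_fire", never_fire), ("cautious", cautious)):
    print(name, {key: round(value, 4) for key, value in metrics.items()})
```

Under these assumed counts, a model that never predicts fire already reaches an accuracy of 0.93 with a recall of zero, which suggests that on imbalanced data accuracy is best read together with recall and F1.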
Rodriguez and Riva (2014) used machine learning models to assess human-induced wildfires in Spain between 1988 and 2007 and to predict fire risk. Considered as a whole, the country does not consist entirely of natural areas and forests but includes regions with different characteristics, such as the Wildland-Urban Interface (WUI), Rural-Urban Interface (RUI), and Wildland-Agricultural Interface (WAI). Their study considered factors such as population density, energy lines, railways, and agricultural vehicle density [63]. Except for railways and agricultural vehicle density, the factors and the manner of field examination are quite similar to those of the present study. Their study found the RF model to be the best model, with an AUC value of 0.746. In the present study, the RF model also provided the best result, with an AUC value of 0.753.
The models were validated with 10-fold cross-validation, and the average of the fold results was taken into account. The k-fold validation results for all three models are given in Table 7. The table shows that the standard deviation (Std) values are low for all three models: at most 0.0077 for Accuracy and at most 0.0669 for AUC.
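The Mean and Std rows of Table 7 can be reproduced from the per-fold values. The sketch below does this for the AUC of the RF model using Python's standard `statistics` module. It assumes that the reported Std is the population standard deviation (`pstdev`), which is what matches the tabulated value; the variable names are chosen here for illustration.

```python
import statistics

# Per-fold AUC values of the RF model, copied from Table 7 (folds 0 to 9).
rf_auc_folds = [0.6373, 0.7425, 0.7586, 0.8424, 0.8203,
                0.7975, 0.8259, 0.7240, 0.6611, 0.7064]

rf_auc_mean = statistics.mean(rf_auc_folds)
# Population standard deviation (divides by n, not n - 1).
rf_auc_std = statistics.pstdev(rf_auc_folds)

print(f"mean AUC = {rf_auc_mean:.4f}, std = {rf_auc_std:.4f}")
print(f"fold range = {min(rf_auc_folds):.4f} to {max(rf_auc_folds):.4f}")
```

The fold range printed on the last line is a useful companion to the standard deviation, since it shows how far the best and worst folds lie from the mean.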
The fire risk prediction capacity of the classification models was tested using ROC analysis. For the test data, the AUCs of the ROC curves for the RF, XGB, and LGB models are 0.72, 0.70, and 0.69, respectively (Figure 17, Figure 18 and Figure 19).
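For readers less familiar with the AUC, it can be interpreted as the probability that a randomly chosen fire sample receives a higher predicted score than a randomly chosen non-fire sample. The following minimal sketch computes it directly from that pairwise definition. The labels and scores are made-up illustrative values, not data from this study, and the function name `roc_auc` is an assumption.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (fire, non-fire) pairs ranked correctly.

    Ties between a fire score and a non-fire score count as half a win.
    """
    fire_scores = [s for y, s in zip(labels, scores) if y == 1]
    non_fire_scores = [s for y, s in zip(labels, scores) if y == 0]
    if not fire_scores or not non_fire_scores:
        raise ValueError("both classes must be present to compute the AUC")

    wins = 0.0
    for fire in fire_scores:
        for non_fire in non_fire_scores:
            if fire > non_fire:
                wins += 1.0
            elif fire == non_fire:
                wins += 0.5
    return wins / (len(fire_scores) * len(non_fire_scores))


# Illustrative data: 1 = fire, 0 = non-fire, with predicted fire probabilities.
example_labels = [0, 0, 1, 1]
example_scores = [0.10, 0.40, 0.35, 0.80]
example_auc = roc_auc(example_labels, example_scores)
print(example_auc)
```

The pairwise loop is quadratic in the number of samples, so it is meant only to illustrate the definition; library implementations typically use a sorting-based computation instead.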
The RF and XGB models ranked population as the most important factor (Figure 20 and Figure 21). The LGB model, on the other hand, ranked proximity to energy lines first (Figure 22). Proximity to energy lines was the second most important factor in the RF model. In the XGB model, the second most important factor was distance to settlements, with an importance value close to that of energy lines in the RF model. In the LGB model, the second most important factor was elevation, with a value close to that of the most important factor, proximity to energy lines.
In the RF and XGB classifications, the most important factor was population and the second most important factors were energy lines and distance to settlements, respectively; these factors are commonly found as initiators of fires. In the LGB model, by contrast, the most important factor, energy lines, is an initiator, but the second most important factor, elevation, is not among the initiating effects in fires. As a topographic feature, elevation has a direct effect on temperature, humidity, and wind. As elevation increases, the probability of precipitation usually increases, which reduces fire intensity [64]. The LGB model thus placed a factor that has a negative relationship with fire probability second in its importance ranking. This factor ranks third in the RF model, where it does not reach the value it has in the LGB model's importance graph. In general, the importance rankings of the RF and XGB models are closer to each other. The RF model can generally provide high accuracy, and it achieves this accuracy with fewer factor variables. Fewer parameters mean easier calibration [63,65]. Nevertheless, since no single model can always generate correct predictions, it should be evaluated together with other models.
Human population and the structures that develop with it, such as energy lines, roads, and settlements, were identified as highly important factors for rural and forest fires, and these factors are clearly also important parameters in city planning. On this basis, the model developed in this study for the European side of Istanbul can be used to evaluate fire risk when planning settlements that tend to expand towards natural areas, and it can be safely used in decision-making processes in urban planning. For urban planning design, this study proposes scaling back, as far as possible, the approach of setting construction and infrastructure capacities to meet a given population size. It shows that the dominant influence of population and its related factors (DS, DP, DR) in planning studies should be reduced, and that these factors should be treated as parameters kept under control.
Figure 1. Geographic location of Istanbul (produced using QGIS).
Figure 2. Collected data and their sources.
Figure 3. GIS Operating Flow Chart.
Figure 4. OSM map platform.
Figure 5. Digital Surface Model (DSM), (DEM).
Figure 8. Distance to Settlements.
Figure 9. Distance to roads.
Figure 10. Distance to water areas.
Figure 11. Distance to powerlines.
Figure 13. Fire locations (between 2000–2021).
Figure 14. Fire and relationship maps.
Figure 15. Schematic view of machine learning-based modeling.
Figure 16. K-fold cross-validation diagram.
Figure 17. ROC Curves for RF Classifier.
Figure 18. ROC Curves for XGB Classifier.
Figure 19. ROC Curves for LGB Classifier.
Figure 20. Feature importance of RF Classifier.
Figure 21. Feature importance of XGB Classifier.
Figure 22. Feature importance of LGB Classifier.
Table 1. Criteria Used in Fire Risk Assessment.
Slope | Species composition | Topographic Wetness Index (TWI)
Aspect | Development Stage | Canadian Forest Fire Weather Index (FWI)
Elevation | Solar Radiation | Distance to Fire Response Teams
Distance to settlement | Fire regimes (TSF-FR) | Distance to Fire Watch Towers
Distance to road | Tree Species Composition | Visibility from Fire Watch Towers
Distance to water | Topo-morphology | Distance from the Anti-poaching Camp Shed
Population | Land Use | Distance to Previous Fire Points
Precipitation | Stand type | Topographic Position Index (TPI)
Vegetation Density | Stand age | Tree Stages
Temperature | Stand canopy density | Fuel Type
Vegetation type | Distance to fields | Humidity
Distance from Agricultural Land | Forest Cover | Forest Type
Wind speed | Tree Species | Bare soil index
Stand Crown closure | Land Surface Temperature | Distance to Tourist Spots
Table 2. Factors used in the literature studies.
Factors / Reference number | 12 | 13 | 14 | 15 | 16 | 17 | 8 | 18 | 19 | 7 | 20 | 9 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 |
Slope | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x |
Aspect | x |   |   | x |   | x | x |   | x | x | x | x | x | x | x | x | x | x | x | x | x |   | x | x | x | x |
Elevation | x |   | x | x | x | x |   | x | x | x |   | x | x | x | x | x | x |   |   | x |   | x | x | x | x | x |
Distance to settlements | x | x |   |   | x | x | x | x | x | x | x | x | x |   | x | x | x | x |   | x |   | x | x | x | x |   |
Distance to roads | x | x |   | x |   | x | x | x | x |   | x | x | x |   | x | x | x | x |   | x |   | x | x |   | x |   |
Distance to water bodies |   |   | x |   | x |   |   | x | x |   | x |   |   |   | x |   |   | x |   |   |   |   |   | x | x |   |
Land use |   |   |   | x | x |   | x | x |   |   |   |   |   | x |   |   |   |   |   | x |   | x |   | x |   | x |
Precipitation |   |   | x |   | x |   |   | x |   |   |   | x |   | x | x |   |   |   |   | x |   |   |   |   |   | x |
Vegetation density |   |   |   |   |   |   |   |   |   | x |   | x |   |   |   | x | x |   |   |   | x |   |   |   |   |   |
Temperature |   |   | x |   |   |   |   |   |   |   |   |   |   | x | x |   |   |   |   | x |   |   |   |   |   | x |
Plant type |   | x |   |   |   | x |   |   |   | x |   |   |   |   |   |   |   |   |   |   |   |   | x | x |   |   |
Distance from agricultural land |   |   | x |   |   |   |   |   |   | x |   |   |   |   | x | x |   |   |   |   |   |   |   |   |   |   |
Wind speed |   |   |   |   |   |   |   |   |   |   | x |   | x | x |   |   |   |   |   |   |   |   |   |   |   | x |
Stand Crown Closure |   |   |   |   |   |   |   |   |   | x | x |   |   | x | x |   |   |   | x |   |   |   |   |   |   |   |
Population |   |   |   |   |   |   |   |   |   | x |   |   |   |   | x |   |   |   |   |   |   |   |   |   |   |   |
Topographic Wetness Index |   |   |   |   |   |   | x |   |   |   |   | x |   |   | x |   |   |   |   |   |   |   |   |   |   |   |
Canadian Forest Fire Weather Index (FWI) |   |   |   |   |   |   |   |   |   | x |   |   |   |   |   |   | x |   |   |   |   |   |   |   |   |   |
Tree Stages |   |   |   |   |   |   |   |   |   |   | x |   |   |   |   |   |   |   | x |   |   |   |   |   |   |   |
Fuel Type |   |   |   |   |   |   |   |   |   |   |   |   | x |   |   |   | x |   |   |   |   |   |   |   |   |   |
Humidity |   |   |   |   |   |   |   |   | x |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   | x |
Forest type | x |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
Distance to tourist places |   |   |   |   |   |   | x |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
Distance from the anti-poaching Camp Shed |   |   |   |   |   |   | x |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
Distance to fields |   |   |   |   |   |   |   |   | x |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
Forest cover |   |   |   |   |   |   |   |   | x |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
Distance to previous fire points |   |   |   |   |   |   |   |   |   | x |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
Tree species |   |   |   |   |   |   |   |   |   |   | x |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
Topographic Position Index (TPI) |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
Land surface temperature |   |   |   |   |   |   |   |   |   |   |   | x |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
Bare soil index |   |   |   |   |   |   |   |   |   |   |   |   | x |   |   |   |   |   |   |   |   |   |   |   |   |   |
Species composition |   |   |   |   |   |   |   |   |   |   |   |   |   | x |   |   |   |   |   |   |   |   |   |   |   |   |
Development stage |   |   |   |   |   |   |   |   |   |   |   |   |   |   | x |   |   |   |   |   |   |   |   |   |   |   |
Solar radiation |   |   |   |   |   |   |   |   |   |   |   |   |   |   | x |   |   |   |   |   |   |   |   |   |   |   |
Fire regimes (TSF-FR) |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   | x |   |   |   |   |   |   |   |   |   |
Tree species composition |   |   |   |   |   |   |   |   |   |   |   |   |   |   | x |   |   |   | x |   |   |   |   |   |   |   |
Topo-morphology |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   | x |   |   |   |   |   |
Soil use |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   | x |   |   |   |   |   |
Distance to fire response teams |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   | x |   |   |
Distance to fire watch towers |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   | x |   |   |
Visibility from fire watch towers |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   | x |   |   |
Stand type |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   | x |   |
Stand age |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   | x |   |
Stand canopy density |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   | x |   |
Human Index |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   | x |   |   |   |   |   |
Table 3. Statistical parameters of the dataset.
Variable | Count | Mean | Std | Min | 25% | 50% | 75% | Max
Slope (SL) (°) | 3455 | 6.34 | 6.54 | 0.00 | 1.30 | 4.28 | 9.09 | 48.60
Aspect (AS) (°) | 3455 | 134.16 | 109.03 | -1.00 | 0.00 | 119.48 | 234.88 | 356.55
Digital elevation model (DEM) (m) | 3455 | 114.56 | 73.68 | 1.00 | 57.00 | 104.00 | 164.00 | 428.00
Distance to powerlines (DP) (m) | 3455 | 4175.25 | 3678.12 | 0.00 | 1315.08 | 3196.81 | 6098.97 | 21801.60
Population (PO) (person) | 3455 | 16.13 | 50.24 | 0.03 | 0.20 | 0.94 | 5.38 | 470.96
Distance to roads (DR) (m) | 3455 | 145.61 | 183.30 | 0.00 | 22.36 | 80.62 | 206.03 | 2046.85
Distance to water areas (DW) (m) | 3455 | 1939.42 | 1396.11 | 0.00 | 853.52 | 1672.00 | 2773.01 | 8547.64
Distance to settlements (DS) (m) | 3455 | 543.93 | 799.76 | 0.00 | 0.00 | 257.10 | 730.31 | 5734.47
Fire Status (FS) (-) | 3455 | 0.07 | 0.26 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00
Table 4. Data set features.
Platform | Data | Source | Resolution
OSM | Road | http://overpass-turbo.eu |
OSM | Water Areas | http://overpass-turbo.eu |
OSM | Power Line | http://overpass-turbo.eu |
USGS | SRTM | http://earthexplorer.usgs.gov/ | 90 m
ArcGIS | Land Cover | https://livingatlas.arcgis.com/landcoverexplorer | 10 m - 2021
GEE | WorldPop | ee.ImageCollection("WorldPop/GP/100m/pop") | 92.7 m
FIRMS | MODIS+Aqua Terra Thermal Anomalies (Fire Locations) | https://firms.modaps.eodis.nasa.gov/ | 1 km
Table 5. Classification confusion matrix.
 | Predicted: Fire (class 1) | Predicted: Non-Fire (class 0)
Actual: Fire (class 1) | True Positive (TP) | False Negative (FN)
Actual: Non-Fire (class 0) | False Positive (FP) | True Negative (TN)
Table 6. Evaluation parameters for classification models.
Model | Accuracy | AUC | Recall | Precision | F1
Random Forest (RF) | 0.9293 | 0.7528 | 0.0346 | 0.4000 | 0.0622
Extreme Gradient Boosting (XGB) | 0.9189 | 0.7409 | 0.1546 | 0.3570 | 0.2119
Light Gradient Boosting (LGB) | 0.9107 | 0.7508 | 0.1732 | 0.3454 | 0.2138
Table 7. K-fold validation accuracy and AUC predictions for classification models.
Fold | RF Accuracy | RF AUC | XGB Accuracy | XGB AUC | LGB Accuracy | LGB AUC
0 | 0.9298 | 0.6373 | 0.9050 | 0.7278 | 0.9174 | 0.7244
1 | 0.9339 | 0.7425 | 0.9174 | 0.7746 | 0.9132 | 0.7569
2 | 0.9298 | 0.7586 | 0.9132 | 0.7762 | 0.8967 | 0.7710
3 | 0.9421 | 0.8424 | 0.9091 | 0.8434 | 0.9050 | 0.8165
4 | 0.9256 | 0.8203 | 0.9215 | 0.7619 | 0.9091 | 0.7793
5 | 0.9215 | 0.7975 | 0.9050 | 0.7371 | 0.9050 | 0.7470
6 | 0.9256 | 0.8259 | 0.9132 | 0.7956 | 0.9256 | 0.8058
7 | 0.9256 | 0.7240 | 0.9174 | 0.7612 | 0.9132 | 0.7269
8 | 0.9253 | 0.6611 | 0.9170 | 0.6799 | 0.9087 | 0.6893
9 | 0.9253 | 0.7064 | 0.9087 | 0.6197 | 0.9170 | 0.6657
Mean | 0.9285 | 0.7516 | 0.9127 | 0.7478 | 0.9111 | 0.7483
Std | 0.0056 | 0.0669 | 0.0054 | 0.0590 | 0.0077 | 0.0456