Preprint
Article

Comparison of Pixel-Based Classification Algorithms Using Landsat-8 OLI and Sentinel-2 MSI for Land Use/Land Cover Mapping in a Heterogeneous Landscape

This version is not peer-reviewed

Submitted:

14 July 2023

Posted:

17 July 2023

Abstract
Satellite-based classification performance remains a challenge for the research community in the field of land use/land cover mapping. Here we investigated the performance of supervised per-pixel classification under different scenarios, based on single-date and seasonal multispectral data combinations from different sensors (Landsat-8 OLI and Sentinel-2 MSI). For Landsat, seasonal spectral indices (EVI and NDMI) were included. A typical Mediterranean watershed with a complex landscape comprising various forest and wetland ecosystems, crops, artificial surfaces, and lake water was selected to test our approach. All available geospatial data from national databases (Forest Map, LPIS, Natura2000 habitats, cadastral parcels, etc.) were used as ancillary data for classification training and validation. We examined and compared the performance of the ML, RF, KNN and SVM classifiers under different scenarios for land use/land cover mapping, according to the Copernicus Land Cover nomenclature. In total, eight land use/land cover classes were identified with Landsat-8 OLI and nine with Sentinel-2 MSI at an acceptable overall accuracy of over 85%. A comparison of the overall classification accuracies shows that the best Sentinel-2 overall accuracy was slightly higher than that of Landsat-8 (96.68% vs. 93.02%). The best-performing algorithm was ML for Sentinel-2 and KNN for Landsat-8. However, the machine-learning algorithms produced similar results regardless of the type of sensor. We conclude that the best classification performance is achieved using seasonal multispectral data. Future research should be oriented towards integrating time-series multispectral data from different sensors and geospatial ancillary data for land use/land cover mapping.
Subject: Environmental and Earth Sciences  -   Remote Sensing

1. Introduction

Land cover represents the characteristics of the earth's surface shaped by various natural agents or anthropogenic interventions. From an earth-observation perspective, the term “land cover” defines the land types (i.e. vegetation, water bodies, crops, built-up areas, etc.) which can be detected from a distance. Land cover is a critical variable for earth surface studies since it can change over time [1]. In contrast, the term “land use” refers to the way a particular piece of land is used, including the associated economic purpose of this use [2]. Both concepts are interrelated. For example, a land cover type such as a forest may support a series of land uses (e.g. timber production, recreation, rangeland, etc.), while a land use such as agroforestry may include a series of land cover types (e.g. forests, plantations, annual crops, etc.). In this research, the two terms are used in a complementary way to depict all kinds of existing land cover or land use types within the study area.
Land use/land cover information is essential for the management and monitoring of natural resources, modeling, spatial planning, land administration and sound decision-making. Satellite-based classification provides spatially explicit land use/land cover information and map generation at global, national or regional scales. Medium-resolution imagery (Landsat-like, 10–30 m), however, is better suited to detecting most human–nature interactions [3]. The opening of the Landsat archive in 2008 [4] and the launch of Sentinel-2 in 2015 provided optical multispectral imagery at medium-high resolution. The free and open access policy on this imagery promoted the development of new products and applications across space and time, especially in the domain of land use/land cover mapping. This data policy, combined with the increase in computing power and the concurrent reduction of costs, has facilitated large-area mapping and expanded the number of users worldwide [5].
Due to the complexity of land use/land cover characterization, several studies have focused mainly on methods for mapping a single land cover type (i.e. forests, wetlands, forest fires, agriculture, urban areas or water), using either Landsat or Sentinel imagery: for example, [6,7] for forests, [8] for urban areas, [9,10] for croplands, and [11,12] for wetlands. However, multiple-class characterization is required for simultaneous and spatially exhaustive mapping [13]. Thus, effective and efficient methods are required for satellite imagery classification to provide meaningful information regarding all land use/land cover types within a specific area.
A variety of classification approaches (unsupervised, supervised, parametric, non-parametric, object-oriented) have been developed and applied to derive land cover information with different degrees of success. Per-pixel classification approaches remain the most popular in the analysis of satellite-derived imagery [14]. Here, we used supervised per-pixel classification to map multiple land use/land cover types.
In supervised approaches, reference data are required to characterize the variability of land cover across space and time and serve as the reference dataset for training and validating classification models. Suitable reference data are a fundamental requirement in supervised image classification [15]. We use existing authoritative geospatial datasets of higher accuracy as a pool for training and validation. The reference datasets span forestlands, cultivated fields, discontinuous urban fabric, built-up areas, and wetland habitat types. The classification scheme of land cover classes is based on the Copernicus CORINE Land Cover (CLC) nomenclature [16]. Based on the CLC2018 land use/land cover distribution, a stratified random sampling scheme is deployed to train the classifiers and assess classification accuracy. Classification accuracy depends on the satellite imagery, the classification algorithm used, and the nature of the training data [17].
Four popular classifiers, maximum likelihood (ML), random forest (RF), k-nearest neighbors (KNN) and support vector machine (SVM), were selected, and their implementations in Erdas Imagine 2020 were used to run the experiments. Descriptions and analyses of these classifiers can be found in the literature: for example, [18] for Bayesian classifiers, [19] for SVM, [20] for KNN, and [21] for Random Forests.
The maximum likelihood method is included in our research because of its wide application and use in commercial image-processing software [22]. The machine learning algorithms above, on the other hand, have gained great attention for classifying land use/land cover types in the last decade.
In a recent evaluation, SVM and KNN performed similarly in per-pixel classification of 26 Landsat TM image blocks (10 km × 10 km), with Naïve Bayes (a maximum likelihood variant) being the exception [23]. In a peri-urban and rural area of Vietnam with heterogeneous land cover, SVM produced the highest overall accuracy (OA) using Sentinel-2 MSI, followed by RF and KNN. However, all three classifiers showed a similar and high OA (over 93.85%) when the training sample size was large enough (>750 pixels/class) [24].
In this study, we test the above-mentioned classifiers to derive land use/land cover information. We explore classification performance under six different scenarios. The study investigates the performance of these classifiers using Landsat-8 OLI and Sentinel-2 MSI across a heterogeneous Mediterranean watershed, based on the same available land cover reference, training and validation data per sensor. For Landsat, spectral indices (EVI and NDMI) were included. These indices have been reported in the literature to improve land use/land cover classification accuracy [25]. We evaluate classification performance for an area with a complex landscape and investigate how single-date and seasonal optical multispectral data affect land use/land cover classification accuracy.

2. Study Area

The test site is the Mygdonia basin, which is located in Northern Greece. Its watershed covers an area of 190,285 ha. It lies east of the city of Thessaloniki (40°40′56.49″N, 23°18′21.15″E, WGS84) at a distance of 41.6 km (Figure 1). The watershed is surrounded by mountains in the north (Mount Krousia) and in the south (Mount Cholomontas), by hills in the west, and by Rentina Gorge and Mount Kerdylia in the east. Its elevation ranges from 35 m to 1,129 m. At the center of the watershed, there are two lakes (Koronia and Volvi). The watershed drains through seasonal and intermittent streams into these lakes.
The two lakes, along with their surrounding wetlands, have been listed as a Wetland of International Importance under the Ramsar Convention since 1975 (GR005: 16,388 ha). Along with the valley of Rentina Gorge, they were designated as Special Conservation Zones within the Natura2000 network (GR1220001 and GR1220003: 28,734.90 ha) in 2017. These protected sites constitute a unique complex of interconnected natural ecosystems of lakes, seasonal streams, channels, riparian forests, shrubs, wet meadows and fields.
At the southernmost end of the watershed, on the slopes of Mount Cholomontas, there is a portion of another protected area (GR1270001). It has an area of 15,651.14 ha dominated by beech, oak and pine forests.
Non-irrigated arable lands are distributed across the watershed up to the productive forests in the south. Intensively cultivated lands, mainly irrigated, surround the wetland ecosystem. According to CLC2018, 49.28% of the watershed is under agricultural use, while forestlands occupy 42.04%, water 5.46%, discontinuous urban fabric 1.31%, wetlands 1.17%, and developed areas only 0.33%. Most of the land under agricultural use is cropland (93,770 ha), while perennial crops such as fruit and olive tree plantations and vineyards account for only 0.65%. Irrigated lands cover 14.27% and non-irrigated lands 50.97% of croplands. Approximately 30.98% of forestlands are broadleaf forests, 2.73% pine forests, 12.04% mixed forests, 36.80% shrubs and 10.10% transitional woodlands (Figure 2).
The climate is considered temperate (Csa-Mediterranean mainland) with warm and dry summers and cool winters. The mean temperature is 22.6 °C in summer and about 4 °C in winter. The mean annual rainfall is 593 mm according to the records of the last 40 years.

3. Materials and Methods

3.1. Satellite Imagery

Landsat-8 Operational Land Imager (OLI) surface reflectance (C2L2) data were obtained from the United States Geological Survey website [26]. Two scenes (path/row: 183/032 and 184/032) were required to cover the study area entirely. Following a search, cloud-free (<10%) scenes were carefully selected for the summer and winter seasons. Acquisition dates were 01 July 2018 and 22 June 2018 for the summer (dry) season and 28 January 2020 and 17 February 2019 for the winter season. A mosaic containing six bands (blue, green, red, near infrared (NIR), and shortwave infrared (SWIR 1, SWIR 2)) was created at the study area limits.
Sentinel-2 (L2A) MSI imagery was downloaded from the Sentinels Scientific Data Hub [27]. Each product consists of 100 km × 100 km orthorectified granules or tiles. Four cloud-free (<10%) granules, sensed in summer 2018, were required to cover the study area entirely and were selected (Table 1).
The 13 spectral bands of Sentinel-2A span from the visible to the SWIR spectrum, at 10 m, 20 m and 60 m spatial resolutions. The bands at 60 m spatial resolution are dedicated primarily to detecting atmospheric features and have therefore been excluded from the analysis [28]. A mosaic of ten bands (2–8, 8A, 11 and 12) was created at the watershed limits. Nearest neighbor interpolation was employed to resample the 20 m bands to 10 m. This process has been shown to perform satisfactorily compared to other approaches [29]. Both Landsat-8 OLI and Sentinel-2A image scenes are spatially registered to the Universal Transverse Mercator (UTM)/World Geodetic System 1984 (WGS84) projection.
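As an illustration of the resampling step, the following minimal sketch shows nearest neighbor resampling of a 20 m band to a 10 m grid. It assumes the two grids are aligned and the resolution ratio is exactly 2; the function name `upsample_nearest` and the toy reflectance values are hypothetical and do not come from the software used in this study.

```python
def upsample_nearest(band, factor=2):
    """Nearest neighbor upsampling of a 2D band stored as a list of rows.

    Every coarse pixel is repeated `factor` times along both axes, so no
    new reflectance values are created (unlike bilinear or cubic methods).
    """
    fine = []
    for row in band:
        # Repeat each value along the column direction.
        wide = [value for value in row for _ in range(factor)]
        # Repeat the widened row along the row direction.
        fine.extend(list(wide) for _ in range(factor))
    return fine


# Toy 2x2 band at 20 m (hypothetical reflectances) resampled to 4x4 at 10 m.
band_20m = [[0.10, 0.20],
            [0.30, 0.40]]
band_10m = upsample_nearest(band_20m, factor=2)
```

Because values are only duplicated, the spectral content of the 20 m bands is preserved, which is one reason this approach is often preferred before per-pixel classification.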

3.2. Land Cover Reference Data

A series of land cover reference data were retrieved from existing national databases (Table 2).
Land use/land cover types in 2018 for the entire watershed were obtained from the European Copernicus Program (CORINE Land Cover product, CLC2018). In Europe, CORINE Land Cover (CLC) provides harmonized and comprehensive maps of land cover and land use change at European level [30]. The program was established by the European Commission (EC) in 1990 to facilitate policy making at European level. The most recent CLC2018 comprises 44 thematic classes at the third level, with a minimum mapping unit (MMU) of 25 ha for areal features and 5 ha for changes. It is an excellent tool for strategic analysis and planning at European level. However, CLC’s thematic content comprises a mixture of land cover and land use classes. In addition, its MMU serves the needs of the European Union well but is not suited to detailed national or local land use/land cover mapping [31].
Information regarding plantations and vineyards, either irrigated or non-irrigated, was retrieved from the Land Parcel Identification System (LPIS). However, these data refer only to parcels for which farmers have made individual subsidy claims and receive European Union aid [32]. Therefore, they do not represent all cultivated fields within the watershed.
Information on habitat types was acquired from the national large-scale Natura2000 database. We retrieved spatially explicit information on the habitat types and vegetation species dominating the wetland.
Forestlands were retrieved from the Forest Map national program. The Forest Map is a very-high-resolution diagram at the scale of 1:5,000, depicting forests and non-forests according to the current legislative framework of Greece [33]. Furthermore, we obtained the available forest management plans from the Hellenic Forest Service to retrieve information on (co)dominant forest species at the stand level and on land use types within managed forests. However, the available plans cover only 26,857 ha of forest (28% of forestlands, according to the Forest Map). Forestlands outside the plans are mainly unmanaged, vary in structure and crown cover, are distributed across the watershed, and comprise degraded broadleaf forests (mixed or not), evergreen shrubs, and reforested pine forests.

3.3. Sampling Design

A stratified random sampling design was adopted. The Copernicus CLC2018 product was used as the basis for distributing sampling units across all identified classes. Based on these classes, the sample size was estimated to be 2,356 for a required thematic accuracy of 85%. The samples were distributed randomly, in proportion to CLC2018 class area (Table 3).
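The proportional distribution of samples can be sketched as follows. This is only an illustration: the class shares are the aggregated CLC2018 percentages quoted in the Study Area section, not the class breakdown of Table 3, and the largest-remainder rounding in the hypothetical helper `allocate_proportional` is an assumption about how fractional samples could be handled.

```python
def allocate_proportional(total, shares):
    """Distribute `total` samples across strata in proportion to `shares`.

    Uses largest-remainder rounding so the allocations sum exactly to `total`.
    """
    weight_sum = sum(shares.values())
    exact = {name: total * share / weight_sum for name, share in shares.items()}
    alloc = {name: int(value) for name, value in exact.items()}
    leftover = total - sum(alloc.values())
    # Give the remaining samples to the strata with the largest fractions.
    by_remainder = sorted(exact, key=lambda n: exact[n] - alloc[n], reverse=True)
    for name in by_remainder[:leftover]:
        alloc[name] += 1
    return alloc


# Aggregated CLC2018 area shares (%) as reported for the watershed.
clc_shares = {
    "agricultural": 49.28,
    "forestlands": 42.04,
    "water": 5.46,
    "discontinuous urban fabric": 1.31,
    "wetlands": 1.17,
    "developed areas": 0.33,
}
allocation = allocate_proportional(2356, clc_shares)
```

A purely proportional scheme leaves rare strata with very few samples, which is consistent with the difficulty reported later for small classes.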

3.4. Training Data

Based on the above sample distribution, sampling plots were defined at the pixel spatial resolution (30 m × 30 m) of Landsat imagery. Each random point was located at the center of the respective pixel using a gridded fishnet on Landsat-8. Each plot was divided into 3 × 3 pixels to coincide with the Sentinel-2A spatial resolution (10 m).
Land cover reference data were processed to generate the following seven thematic datasets with the highest accuracy. Based on the Forest Map, we excluded non-forest areas and created a dataset exclusively for forests. In areas where the forest dataset overlapped with forest management plans, we extracted information on forests (broadleaf, needleleaf or shrubs) at the stand level, based on the dominant species. In wetlands, we excluded forestlands based on the above forest dataset. We then created a natural habitats dataset (inland marshes, shrubs, wet meadows and high reeds), excluding all other land use/land cover types, based on their unique 4-digit Natura2000 database codes. Discontinuous urban fabric areas (small towns and villages) were extracted from urban zones provided by the Forest Map. A dataset of roads and built-up areas was generated by processing the cadastral database. Plantation trees (olive, fruit and forest ones) and vineyards were extracted from the LPIS database. The last dataset consists of crops, either irrigated or non-irrigated.
In addition, high-resolution Google Earth imagery (2019) was used for visual interpretation of each plot, based on physiognomic attributes (color, shape, size, pattern and texture). This was the available orthoimagery closest in time to the satellite acquisition dates.
Based on the above interpretation, cross-referenced with each of the thematic datasets, each plot was assigned a unique land use/land cover type. Thus, a large, consistent database was generated for the selection of training data.
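The subdivision of a 30 m plot into 3 × 3 cells of 10 m can be sketched as below. The sketch assumes projected coordinates in meters (e.g. UTM) and that the 10 m cells nest exactly inside the 30 m pixel, which in practice depends on how the two grids are aligned; `subpixel_centers` and the example coordinates are hypothetical.

```python
def subpixel_centers(x_center, y_center, coarse=30.0, fine=10.0):
    """Return the centers of the fine cells nested in one coarse pixel.

    Cells are listed row by row, from the north-west to the south-east corner.
    """
    n = int(round(coarse / fine))        # cells per side (3 for 30 m / 10 m)
    offset = (coarse - fine) / 2.0       # distance from plot center to outer cell centers
    return [
        (x_center - offset + col * fine, y_center + offset - row * fine)
        for row in range(n)
        for col in range(n)
    ]


# Hypothetical plot center in UTM coordinates (meters).
centers = subpixel_centers(500015.0, 4500015.0)
```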

3.4. Classification

Four classification methods were applied, one parametric (ML) and three machine learning classifiers, KNN, RF and SVM. All procedures in this study were implemented using the Erdas Imagine 2020 commercial software.
We tested the utility of single-date (summer 2018) spectral bands and of combined summer-winter spectral bands of Landsat-8 OLI and Sentinel-2A MSI as data input, developing six different scenarios. Two of them involve the use of spectral indices (EVI and NDMI) with single-date and seasonal Landsat-8 OLI spectral bands. EVI is sensitive to intra-annual vegetation variations, while NDMI is sensitive to moisture content. Both were used to discriminate different types of vegetation and irrigated fields. We acknowledge that many other combinations of spectral and temporal features or approaches could be used; we decided to limit our analysis to the aforementioned features.
In the initial phase, we tested numerous iterations of classifications with different combinations and numbers of land use/land cover classes on both types of imagery. However, classification performance was unacceptable (<85%) with more than ten classes in both types of imagery. Land cover classification accuracy is affected by the number of classes identified: overall classification accuracy decreases as the number of classes increases [34].
Therefore, we adopted a classification scheme of 9 classes using Sentinel-2A and 8 classes using Landsat-8 OLI (Table 4). Rare and small-sized classes were either grouped to form new classes or merged into existing, sufficiently large classes. For example, vineyards, fruit and olive trees were merged into non-irrigated arable land. Mineral extraction sites, discontinuous urban fabric, and industrial/commercial classes form a new class named “Artificial Surfaces”. Classes such as land principally occupied by agriculture and complex cultivation patterns were removed, because they are generic land use/land cover types that include several other land types.
During the process, we selected 3,535 training polygons for Landsat-8 OLI and 2,753 for Sentinel-2A MSI classification. We defined the set of training polygons by randomly selecting 70% of the sample points; these data were then used to train the supervised classification algorithms. The remaining 30% of the samples were used for classification validation (Table 4).
Training polygons were manually generated at random locations of sample plots, taking into account that cover types should be spectrally homogeneous. For this reason, in many cases we were forced to generate training polygons away from sample locations. We avoided long, thin training polygons. Small polygons tend to be prone to edge effects. Moreover, we selected more training polygons in areas where land cover reference data were missing or in highly heterogeneous areas, in order to increase classification accuracy. Generating training data in areas where land cover reference data are missing proved to be an issue; their selection was based on our expert knowledge of the study area in relation to the spectral data.
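For reference, the two indices and the way they extend the per-pixel feature vector can be sketched as follows. The formulas are the standard definitions of EVI and NDMI; the sketch assumes surface reflectance already scaled to the 0–1 range, and the band names, helper functions and sample pixel values are hypothetical rather than taken from the processing chain used here.

```python
BANDS = ("blue", "green", "red", "nir", "swir1", "swir2")


def evi(blue, red, nir):
    """Enhanced Vegetation Index (standard coefficients G=2.5, C1=6, C2=7.5, L=1)."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


def ndmi(nir, swir1):
    """Normalized Difference Moisture Index."""
    return (nir - swir1) / (nir + swir1)


def date_features(pixel, with_indices):
    """Six OLI bands for one date, optionally followed by EVI and NDMI."""
    values = [pixel[name] for name in BANDS]
    if with_indices:
        values.append(evi(pixel["blue"], pixel["red"], pixel["nir"]))
        values.append(ndmi(pixel["nir"], pixel["swir1"]))
    return values


def feature_vector(summer, winter=None, with_indices=False):
    """Stack features for single-date or summer-winter Landsat-8 scenarios."""
    values = date_features(summer, with_indices)
    if winter is not None:
        values += date_features(winter, with_indices)
    return values


# Hypothetical reflectances of one vegetated pixel in summer and winter.
summer = {"blue": 0.05, "green": 0.08, "red": 0.08, "nir": 0.40, "swir1": 0.20, "swir2": 0.10}
winter = {"blue": 0.06, "green": 0.07, "red": 0.10, "nir": 0.25, "swir1": 0.22, "swir2": 0.15}
features = feature_vector(summer, winter, with_indices=True)
```

Under these assumptions the four Landsat-8 scenarios correspond to 6, 8, 12 and 16 features per pixel, respectively.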
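A 70/30 random split of this kind can be sketched as below. The fixed seed, the helper name `split_samples` and the integer sample identifiers are assumptions for illustration only; the sketch does not reproduce how the split was carried out in the software used.

```python
import random


def split_samples(samples, train_fraction=0.7, seed=42):
    """Randomly split samples into training and validation subsets."""
    rng = random.Random(seed)          # fixed seed for a reproducible split
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical sample identifiers.
sample_ids = list(range(100))
train_ids, validation_ids = split_samples(sample_ids)
```

Keeping the two subsets disjoint is what allows the validation points to give an unbiased estimate of map accuracy.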

3.5. Accuracy Assessment

All classifiers were tested on the entire watershed based on the same training and validation data per sensor. We evaluated classification performance using the Overall Accuracy (OA), Producers’ Accuracy (PA), Users’ Accuracy (UA) and Kappa Coefficient (KC). For accuracy assessment, we selected 1,260 validation points for Landsat-8 OLI and 782 points for Sentinel-2A MSI.
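The four measures can be computed directly from a confusion matrix. The sketch below uses the counts of Table 13 (scenario 4, Landsat-8 KNN), with rows taken as classified pixels and columns as reference pixels. Treating the per-class KC as the conditional (user's) kappa is an assumption on our part that happens to reproduce the tabulated values; the function name `accuracy_metrics` is hypothetical.

```python
MATRIX = [  # rows: classified, columns: reference (Table 13)
    [88, 0, 0, 6, 0, 0, 0, 0],       # Artificial surfaces
    [0, 172, 7, 0, 0, 2, 4, 18],     # Broadleaf forest
    [0, 2, 49, 0, 0, 0, 0, 0],       # Needleleaf forest
    [3, 1, 0, 350, 0, 0, 0, 8],      # Non-irrigated arable land
    [0, 0, 0, 0, 56, 0, 0, 0],       # Permanent freshwater lakes
    [0, 2, 0, 0, 0, 78, 0, 2],       # Permanently irrigated land
    [0, 0, 0, 0, 0, 0, 24, 0],       # High reeds
    [1, 12, 5, 12, 0, 2, 1, 355],    # Shrubs
]


def accuracy_metrics(matrix):
    """Return OA, per-class PA and UA, overall kappa and conditional kappa."""
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    row_tot = [sum(row) for row in matrix]
    col_tot = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    diag = [matrix[i][i] for i in range(k)]
    oa = sum(diag) / n
    pa = [diag[j] / col_tot[j] for j in range(k)]   # producer's accuracy
    ua = [diag[i] / row_tot[i] for i in range(k)]   # user's accuracy
    # Chance agreement expected from the row and column marginals.
    pe = sum(r * c for r, c in zip(row_tot, col_tot)) / n ** 2
    kappa = (oa - pe) / (1 - pe)
    # Conditional kappa per mapped class (assumed definition of per-class KC).
    cond = [(ua[i] - col_tot[i] / n) / (1 - col_tot[i] / n) for i in range(k)]
    return {"oa": oa, "pa": pa, "ua": ua, "kappa": kappa, "cond_kappa": cond}


metrics = accuracy_metrics(MATRIX)
```

With these counts the overall accuracy is 1,172/1,260, i.e. about 93.02%, in agreement with the value reported for KNN in scenario 4.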

3.5.1. Landsat-8 Scenarios

In scenario 1, we used single-date Landsat-8 imagery acquired in summer. The ML classifier produced a slightly higher overall accuracy (91.67%) than the machine learning classifiers (Table 5). In terms of class accuracy, the best results for ML were achieved for water bodies, non-irrigated arable land and shrubs. However, needleleaf forest and high reeds have the lowest user's accuracy and K-coefficient. Needleleaf forest is confused with broadleaf forest in mixed-forest areas. Moreover, unsuccessfully reforested pine forests are confused with tall sclerophyllous vegetation (evergreen shrubs).
Very dense high reeds in wetlands are confused with broadleaf forest on the hillsides, which shows a similar spectral signature. The low producer's accuracy reported for the high reeds class is also an issue for all machine learning classifiers.
In scenario 2, we used single-date Landsat-8 OLI imagery acquired in summer, combined with the EVI and NDMI indices. Again, the ML classifier produced the highest overall accuracy (90.95%), ahead of the machine learning classifiers (Table 6). The contribution of the indices to the classification was limited. The performance of all machine learning classifiers is similar and close to the ML overall accuracy.
In scenario 3, we used multi-date Landsat-8 OLI imagery (summer and winter). For the first time, all classifiers have an overall accuracy slightly over 90% (Table 7). In this scenario, KNN produced the highest overall accuracy (91.90%), followed by ML, SVM and RF. Accuracy improved in all classes except needleleaf forests and high reeds.
In scenario 4, we used multi-date Landsat-8 OLI imagery (summer and winter) combined with the respective EVI and NDMI indices. The overall accuracy of all classifiers improved, with similar results among them. The KNN classifier produced the highest overall accuracy (93.02%) (Table 8).

3.5.2. Sentinel-2A

In scenario 1, we used single-date Sentinel-2A imagery. The RF classifier produced a slightly higher OA (93.86%) than ML and KNN, which have similar results (Table 9). The SVM classifier has the lowest overall accuracy. The best results were achieved for water bodies, broadleaved forests, needleleaf forests and non-irrigated arable lands. The lowest class producer's accuracy is observed for roads with SVM and for high reeds with KNN. However, the accuracy of the high reeds class was low with SVM as well.
In scenario 2, we used multi-date Sentinel-2A imagery (summer and winter). The ML classifier produced the highest overall accuracy (96.68%). The machine learning classifiers produced lower but similar results (Table 10). We observed that the roads class is confused with the artificial surfaces class, especially within urban areas. In addition, the shrubs class is partially confused with non-irrigated lands where small fields are surrounded by shrubs.

4. Results

Table 11 shows the obtained overall accuracy per classifier and sensor in each developed scenario. In reference to Landsat-8 OLI classification, KNN was the best classifier in scenario 4, achieving the highest OA=93.02%. In this case, KC reached the highest value (0.9227). Under the same scenario, the ML classifier reached the second highest OA=92.46%, followed by SVM with OA=92.30%, and RF with OA=91.67%.
In scenario 4, all classifiers achieved their highest OA among all scenarios, with minor variations (<1%) due to sufficient training data. Thus, we concluded that the use of multi-date seasonal multispectral Landsat-8 OLI data combined with spectral indices increases the performance and overall accuracy of classification for land use/land cover mapping. However, the contribution of the spectral indices (EVI and NDMI) to classification performance was not significant (+1%) in any scenario. The resulting classified maps are presented in Figure 3.
For the Sentinel-2A MSI classification, the ML classifier has the highest OA (96.68%) under scenario 2, when intra-annual seasonal multispectral data are used (Figure 4). The OA of ML is higher than that of RF (OA=93.86%), which ranked first in scenario 1. Among the machine learning classifiers in scenario 2, SVM produced the highest accuracy (OA=92.84%), followed by KNN (OA=92.71%) and RF (OA=91.82%). We observed a major difference between the performance of ML and that of the machine learning classifiers: the OA of ML is higher by roughly 4 to 5 percentage points.

5. Conclusions

The aim of this work was to analyze the performance and accuracy of different classifiers (ML, KNN, RF and SVM) and to evaluate Landsat-8 OLI and Sentinel-2A MSI imagery for the identification and mapping of land use/land cover types in a highly heterogeneous Mediterranean site.
Based on the results, the overall classification accuracies achieved for both types of satellite imagery were acceptable (>85%), and the performance of the selected machine learning classifiers was quite similar, with no statistically significant differences, in all scenarios. The best performing classifier is ML using seasonal bi-temporal Sentinel-2A imagery (OA 96.68%). This OA is higher than that of the best performing classifier for Landsat-8, KNN (93.02%), using seasonal bi-temporal Landsat-8 OLI imagery combined with the spectral indices EVI and NDMI. Following a visual assessment of the map classified by the ML classifier, we found that it overestimates artificial surfaces (Figure 4). Artificial surfaces are confused with high-reflectance bare soils or open areas with no vegetation cover across the watershed. However, this finding is not reflected in the respective confusion matrix (Table 12).
As for the performance of the machine learning methods, we believe that more training data are required for the Sentinel-2A classification. Machine learning methods require enough training samples to make optimum decisions [23]. However, the high spatial variability and spatial structure of the study area (small-sized areas and sparsely distributed land cover classes) affect the selection of proper training data.
In terms of class user’s accuracy, the lowest accuracy was observed for High reeds (80%), followed by Needleleaf forests (88.37%), according to the confusion matrix for Sentinel-2 scenario 2 (Table 12). High reeds are commonly confused with shrubs in the wetlands, where they form mixed associations with shrubs (Tamarix sp.). In areas of unsuccessful reforestation with pine trees, needleleaf forests are confused with evergreen shrubs.
For Landsat-8 (scenario 4), the lowest class user's accuracy was observed for Broadleaf forests (84.73%). This can be explained by the mixed formations of low broadleaf trees and shrubs that span the watershed. The lowest producer's accuracies were observed for Needleleaf forests (80.33%) and High reeds (82.76%).
We would recommend the KNN classifier with Landsat-8 imagery for land use/land cover mapping. However, we advise caution in applying the ML classifier to Sentinel-2 imagery. The machine learning algorithms are stable and produce similar results in all scenarios.
This paper presents a methodology for testing different classifiers for land use/land cover mapping of a highly heterogeneous and complex landscape. It also includes a procedure for processing available geospatial databases and for preparing multi-source training data. Integrating intra-annual temporal-spectral data into the classification produces land use/land cover maps of high accuracy. This study represents an important step toward multiple-class land use/land cover mapping using spectral-temporal Landsat-8 or Sentinel-2 features by providing a quantitative assessment of classification accuracy. Our work contributes to the evaluation of classification algorithms for updating the Copernicus Land Cover product. It documents that major classes at the 3rd level of the Copernicus nomenclature, such as urban fabric, roads, irrigated and non-irrigated lands, broadleaf or needleleaf forests, shrubs, water bodies, sufficiently large streams and wetland vegetation, can be classified with high accuracy based on seasonal multispectral data.
We believe that more research is required in the domain of land use/land cover mapping. Future research should be oriented towards the development of novel methods by integrating ancillary geospatial data or by integrating time-series spectral-temporal data into a classification model for land use/land cover mapping.
Table 13. Confusion matrix for scenario 4 (Landsat-8 KNN classification).
| Class | Artificial surfaces | Broadleaf forest | Needleleaf forest | Non-irrigated arable land | Permanent freshwater lakes | Permanently irrigated land | High reeds | Shrubs | Total | PA (%) | UA (%) | KC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Artificial surfaces | 88 | 0 | 0 | 6 | 0 | 0 | 0 | 0 | 94 | 95.65 | 93.62 | 0.9311 |
| Broadleaf forest | 0 | 172 | 7 | 0 | 0 | 2 | 4 | 18 | 203 | 91.01 | 84.73 | 0.8203 |
| Needleleaf forest | 0 | 2 | 49 | 0 | 0 | 0 | 0 | 0 | 51 | 80.33 | 96.08 | 0.9588 |
| Non-irrigated arable land | 3 | 1 | 0 | 350 | 0 | 0 | 0 | 8 | 362 | 95.11 | 96.69 | 0.9532 |
| Permanent freshwater lakes | 0 | 0 | 0 | 0 | 56 | 0 | 0 | 0 | 56 | 100.0 | 100.0 | 1.0000 |
| Permanently irrigated land | 0 | 2 | 0 | 0 | 0 | 78 | 0 | 2 | 82 | 95.12 | 95.12 | 0.9478 |
| High reeds | 0 | 0 | 0 | 0 | 0 | 0 | 24 | 0 | 24 | 82.76 | 100.0 | 1.0000 |
| Shrubs | 1 | 12 | 5 | 12 | 0 | 2 | 1 | 355 | 388 | 92.69 | 91.49 | 0.8778 |
| Total | 92 | 189 | 61 | 368 | 56 | 82 | 29 | 383 | 1260 | 93.02 (OA) | | 0.9109 |

Acknowledgments

The research was carried out within the facilities of Hellenic Cadastre. We thank the Hellenic Cadastre and the Hellenic Forest Service for providing access to the national databases used in this study.

References

  1. Di Gregorio, A., M. Henry, E. Donegan, Y. Finegold, J. Latham, I. Jonckheere and R. Cumani, 2016. Classification Concepts, Land Cover Classification System: Software version 3, FAO, Rome, 29p.
  2. Kosmidou, V., Petrou, Z., Bunce, R.G.H., Mücher, C.A., Jongman, R.H.G., Bogers, M.M.B., Lucas, R.M., Tomaselli, V., Blonda, P., Padoa-Schioppa, E., Manakos, I., Petrou, M., 2014. Harmonization of the Land Cover Classification System (LCCS) with the General Habitat Categories (GHC) classification system. Ecol. Indic. 36, 290–300. [CrossRef]
  3. Chen, J., Chen, J., Liao, A., Cao, X., Chen, L., Chen, X., He, C., Han, G., Peng, S., Lu, M., Zhang, W., Tong, X., & Mills, J. (2015). Global land cover mapping at 30 m resolution: A POK-based operational approach. ISPRS J. Photogram. Rem. Sens., 103, 7–27. [CrossRef]
  4. Woodcock, C.E., Allen, R.G., Anderson, M., Belward, A., Bindschadler, R., Cohen, W., Feng, G., Goward, N.S., Helder, D., Helmer, E., Nemani, R., Oreopoulos, L., Schott, J., Thenkabail, S.P., Vermote, F.E., Vogelmann, J., Wulder, A.M. and Wynne, R. Free access to Landsat imagery, Science, 2008, 320 (5879). [CrossRef]
  5. Wulder, M.A., Coops, N.C., Roy, D., White, J.C., and Hermosilla, T. Land Cover 2.0. Int. J. Rem. Sens., 2018. [CrossRef]
  6. Chrysafis, I., Mallinis, G., Gitas, I., Tsakiri-Strati, M. Estimating Mediterranean forest parameters using multi seasonal Landsat 8 OLI imagery and an ensemble learning method. Rem. Sens. Environ., 2017, 199, 154–166. [CrossRef]
  7. Pasquarella, V.J., Holden, C.E., Woodcock, C.E. Improved mapping of forest type using spectral-temporal Landsat features. Rem. Sens. Environ., 2018, 210, 193–207. [CrossRef]
  8. Gounaridis, D., Koukoulas, S. Urban land cover thematic disaggregation, employing datasets from multiple sources and Random Forests modeling, Int. J. Appl. Earth Obs. Geoinf., 2016, 51, 1–10. [CrossRef]
  9. Graesser, J., Ramankutty, N. Detection of cropland field parcels from Landsat imagery, Remote Sens. Environ., 2017, 201, 165–180. [CrossRef]
  10. Belgiu, M., Csillik, O. Sentinel-2 cropland mapping using pixel-based and object-based time-weighted dynamic time warping analysis. Remote Sens. Environ., 2018, 204, 509–523. [CrossRef]
  11. Bhatnagar, S., Gill, L., Regan, S., Naughton, O., Johnston, P., Waldren, S., Ghosh, B. Mapping Vegetation Communities Inside Wetlands Using Sentinel-2 Imagery in Ireland. Int. J. Appl. Earth Obs. Geoinf., 2020, 88, 102083. [CrossRef]
  12. Chatziantoniou, A., Petropoulos, G.P., Psomiadis, E. Co-Orbital Sentinel 1 and 2 for LULC mapping with emphasis on wetlands in a mediterranean setting based on machine learning, Remote Sens., 2017, 9. [CrossRef]
  13. Gómez, C., White, J.C., Wulder, M.A. Optical remotely sensed time series data for land cover classification: A review, ISPRS J. Photogramm. Remote Sens., 2016, 116, 55–72. [CrossRef]
  14. Maxwell, A.E., Warner, T.A., Fang, F. Implementation of machine-learning classification in remote sensing: An applied review, Int. J. Remote Sens., 2018. [CrossRef]
  15. Foody, G. M., Mahesh, P., Rocchini, D., Garzon-Lopez, X.C., Bastin, L. The Sensitivity of Mapping Methods to Reference Data Quality: Training Supervised Image Classifications with Imperfect Reference Data, ISPRS Intern. J. Geo-Inf., 2016, 5, 11, 199. [CrossRef]
  16. Aune-Lundberg, L., Strand, G.H. The content and accuracy of the CORINE Land Cover dataset for Norway. Int. J. Appl. Earth Obs. Geoinf., 2021, 96, 102266. [CrossRef]
  17. Stehman, S. V., and Foody, M.G. Key issues in rigorous accuracy assessment of land cover products, Rem. Sens. Environ., 2019, 231, 111199. [CrossRef]
  18. Jensen, R.J. Thematic information extraction: pattern recognition. In Book Introductory Digital Image Processing, 3rd ed; Clarke, C.K. ed, Pearson Prentice Hall; NJ, USA, 2005, pp. 337-379.
  19. Mountrakis, G., Im, J., Ogole, C. Support vector machines in remote sensing: A review. ISPRS J. Photogramm. Remote Sens., 2011. [CrossRef]
  20. Chirici, G., Mura, M., Mcinerney, D., Py, N., Tomppo, E.O., Waser, L.T., Travaglini, D., Mcroberts, R.E. A meta-analysis and review of the literature on the k-Nearest Neighbors technique for forestry applications that use remotely sensed data. Remote Sens. Environ., 2016, 176, 282–294. [CrossRef]
  21. Belgiu, M., Dragut, L. Random forest in remote sensing: a review of applications and future directions. ISPRS J. Photogram. Remote Sens., 2016, 114, 24–31. [CrossRef]
  22. Yu, L., Liang, L., Wang, J., Zhao, Y., Cheng, Q., Hu, L., Liu, S., Yu, Liang, Wang, X., Zhu, P., Li, Xueyan, Xu, Y., Li, C., Fu, W., Li, Xuecao, Li, W., Liu, C., Cong, N., Zhang, H., Sun, F., Bi, X., Xin, Q., Li, D., Yan, D., Zhu, Z., Goodchild, M.F., Gong, P. Meta-discoveries from a synthesis of satellite-based land-cover mapping research, Int. J. Remote Sens., 2014. [CrossRef]
  23. Heydari, S.S., Mountrakis, G. Effect of classifier selection, reference sample size, reference class distribution and scene heterogeneity in per-pixel classification accuracy using 26 Landsat sites, Remote Sens. Environ., 2018, 204. [CrossRef]
  24. Noi, T.P., Kappas, M. Comparison of Random Forest, k-Nearest Neighbor, and Support Vector Machine Classifiers for Land Cover Classification Using Sentinel-2 Imagery, Sensors 2017, 18. [CrossRef]
  25. Chaves, M.E.D., Picoli, M.C.A., Sanches, I.D. Recent applications of Landsat 8/OLI and Sentinel-2/MSI for land use and land cover mapping: A systematic review, Remote Sens., 2020, 12. [CrossRef]
  26. Landsat-8 C2L2 images courtesy of the U.S. Geological Survey, http://earthexplorer.usgs.gov.
  27. Copernicus Sentinel data 2018 for Sentinel data, European Space Agency–ESA, produced from ESA remote sensing data, https://scihub.copernicus.eu/dhus/#/home.
  28. Drusch, M., Del Bello, U., Carlier, S., Colin, O., Fernandez, V., Gascon, F., Hoersch, B., Isola, C., Laberinti, P., Martimort, P., Meygret, A., Spoto, F., Sy, O., Marchese, F., Bargellini, P. Sentinel-2: ESA’s Optical High-Resolution Mission for GMES Operational Services, J. Remote. Sens. Environ., 2012, 120, 25–36. [CrossRef]
  29. Zheng, H., Du, P., Chen, J., Xia, J., Li, E., Xu, Z., Li, X., Yokoya, N. Performance evaluation of downscaling Sentinel-2 imagery for land use and land cover classification by spectral-spatial features, Remote Sens., 2017, 9(12), 1274. [CrossRef]
  30. Büttner, G. Corine Land Cover and Land Change Products. In Book Land Use and Land Cover Mapping in Europe: Practices and Trends, I. Manakos and M. Braun, eds.; Springer, NY, USA, 2014; 18, p. 55-75.
  31. Kuntz, S., Schmeer, E., Jochum, M., Smith, G. Towards an European land cover monitoring service and high-resolution layers. In Book Remote Sensing and Digital Image Processing, Freek D. van der Meer, ed; Springer, NY, USA, 2014; 18, p. 43–52.
  32. Sagris, V., Devos, W. Core Conceptual Model for Land Parcel Identification System (LCM): GeoCAP Technical Specification, Version 1.1., 2009, Office for Official Publications of the European Communities, Luxembourg.
  33. Vogiatzis, M. Cadastral Mapping of Forestlands in Greece, Photogram. Eng. Remote Sens., 2008, 74:39-46. [CrossRef]
  34. Thinh, T.V., Duong, C.P., Nasahara, N.K., Tadono, T. How does land use/land cover map’s accuracy depend on number of classification classes?, SOLA, 2019, 15:28-31. [CrossRef]
Figure 1. Location of study area (in blue) over VHR orthoimage of 2015.
Figure 2. CLC2018 Land use/land cover types.
Figure 3. Land use/land cover map 2018 (KNN, Landsat-8 OLI).
Figure 4. Land use/land cover map 2018 (ML, Sentinel-2a).
Table 1. Landsat-8 OLI & Sentinel-2a MSI scenes.

| Satellite | Date | Granule |
| --- | --- | --- |
| Landsat-8 OLI | 06-07.2018 | LC08_L2SP_183032_20180701_20200831_02_T1 |
| | | LC08_L2SP_184032_20180622_20200831_02_T1 |
| | 28.01.2020/17.02.2019 | LC08_L2SP_183032_20200128_20200823_02_T1 |
| | | LC08_L2SP_184032_20190217_20200829_02_T1 |
| Sentinel-2a MSI | 03.07.2018 | L2A_T34TFK_A015820_20180703T092224 |
| | | L2A_T34TFL_A015820_20180703T092224 |
| | | L2A_T34TGK_A015820_20180703T092224 |
| | | L2A_T34TGL_A015820_20180703T092224 |
| | 05.12.2017 | L2A_T34TFK_A012817_20171205T092648 |
| | | L2A_T34TFL_A012817_20171205T092648 |
| | 12.12.2017 | L2A_T34TGK_A012917_20171212T091748 |
| | | L2A_T34TGL_A012917_20171212T091748 |
Table 2. Geospatial land cover reference data.

| Datasets | Source | Date | Scale | Data provider |
| --- | --- | --- | --- | --- |
| Agricultural fields | LPIS | 2018 | 1:5.000 | HC¹ |
| Habitats | Natura2000 | 2017 | 1:5.000 | HC |
| Urban zones | Forest Map | 2021 | 1:5.000 | HC |
| Forest/Non forest lands | Forest Map | 2021 | 1:5.000 | HC |
| Forest Stands | Forest Management Plans | 2007-2018 | 1:20.000 | HFS² |
| Built-up areas and roads | Cadastral database | 2021 | 1:1.000 | HC |

¹ Hellenic Cadastre, ² Hellenic Forest Service.
Table 3. Sample distribution per CLC2018 class.

| No | Code | Class | Area (ha) | # Sample points |
| --- | --- | --- | --- | --- |
| 1 | 112 | Discontinuous urban fabric | 2,501.82 | 37 |
| 2 | 121 | Industrial or commercial zones | 443.74 | 6 |
| 3 | 122 | Road and rail networks and associated land | 752.93 | 11 |
| 4 | 131 | Mineral extraction sites | 161.03 | 2 |
| 5 | 211 | Non-irrigated arable land | 47,798.64 | 699 |
| 6 | 212 | Permanently irrigated arable land | 13,381.79 | 196 |
| 7 | 221 | Vineyards | 129.21 | 2 |
| 8 | 222 | Fruit trees | 71.24 | 1 |
| 9 | 223 | Olive trees | 404.69 | 6 |
| 10 | 231 | Pastures | 1,092.00 | 16 |
| 11 | 242 | Complex cultivation patterns | 5,083.44 | 74 |
| 12 | 243 | Land principally occupied by agriculture | 25,811.16 | 378 |
| 13 | 311 | Broadleaf forest | 24,783.86 | 362 |
| 14 | 312 | Coniferous forest | 2,180.58 | 31 |
| 15 | 313 | Mixed forest | 9,630.05 | 141 |
| 16 | 321 | Natural grasslands | 5,878.48 | 85 |
| 17 | 323 | Sclerophyllous vegetation | 29,441.81 | 431 |
| 18 | 324 | Transitional woodland/shrub | 8,082.38 | 117 |
| 19 | 331 | Beaches, dunes, sand | 572.60 | 8 |
| 20 | 411 | Inland marshes | 1,660.55 | 24 |
| 21 | 512 | Water bodies | 10,397.98 | 152 |
| Total | | | 190,285.09 | 2,356 |
Table 4. Description of training data and control points.

| Satellite imagery | Classes | Training data | No of pixels | % of total pixels | Control points |
| --- | --- | --- | --- | --- | --- |
| Landsat-8 OLI | Artificial surfaces | 170 | 3,771 | 0.18% | 92 |
| | Non-irrigated arable land | 650 | 21,726 | 1.03% | 368 |
| | Permanently irrigated land | 177 | 7,433 | 0.35% | 82 |
| | Broadleaf forest | 490 | 16,169 | 0.76% | 189 |
| | Needleleaf forest | 98 | 1,486 | 0.07% | 61 |
| | Shrubs | 473 | 16,664 | 0.79% | 383 |
| | Reeds | 130 | 497 | 0.02% | 56 |
| | Permanent freshwater lakes | 87 | 980 | 0.05% | 29 |
| | Total | 2,275 | 68,726 | 3.25% | 1,260 |
| | Study Area | | 2,116,028 | | |
| Sentinel-2 MSI | Artificial surfaces | 83 | 6,088 | 0.03% | 41 |
| | Non-irrigated arable land | 594 | 112,992 | 0.59% | 254 |
| | Permanently irrigated land | 139 | 12,449 | 0.07% | 45 |
| | Broadleaf forest | 459 | 104,017 | 0.55% | 113 |
| | Needleleaf forest | 89 | 2,637 | 0.01% | 40 |
| | Shrubs | 385 | 29,562 | 0.16% | 195 |
| | Reeds | 62 | 6,739 | 0.04% | 17 |
| | Roads | 30 | 1,017 | 0.01% | 23 |
| | Permanent freshwater lakes | 130 | 116,113 | 0.61% | 54 |
| | Total | 1,971 | 391,614 | 2.06% | 782 |
| | Study Area | | 19,050,300 | | |
Table 5. Landsat-8 scenario 1 classification accuracy metrics.

| Classifier | Classes | PA | UA | KC |
| --- | --- | --- | --- | --- |
| ML | Artificial surfaces | 92.39% | 83.33% | 0.8202 |
| | Broadleaf forest | 84.66% | 93.02% | 0.9179 |
| | Needleleaf forest | 93.44% | 76.00% | 0.7478 |
| | Non-irrigated arable land | 93.21% | 96.08% | 0.9446 |
| | Permanent freshwater lakes | 96.43% | 100.00% | 1.0000 |
| | Permanently irrigated land | 93.90% | 96.25% | 0.9599 |
| | High reeds | 100.00% | 64.44% | 0.6361 |
| | Shrubs | 91.38% | 93.33% | 0.9042 |
| | Overall accuracy | 91.67% | | 0.8946 |
| KNN | Artificial surfaces | 82.61% | 90.48% | 0.8973 |
| | Broadleaf forest | 89.95% | 79.07% | 0.7538 |
| | Needleleaf forest | 81.97% | 86.21% | 0.8551 |
| | Non-irrigated arable land | 91.58% | 93.09% | 0.9024 |
| | Permanent freshwater lakes | 100.00% | 100.00% | 1.0000 |
| | Permanently irrigated land | 86.59% | 92.21% | 0.9167 |
| | High reeds | 34.48% | 100.00% | 1.0000 |
| | Shrubs | 91.91% | 88.44% | 0.8339 |
| | Overall accuracy | 89.05% | | 0.8598 |
| RF | Artificial surfaces | 81.52% | 88.24% | 0.8731 |
| | Broadleaf forest | 89.42% | 87.11% | 0.8484 |
| | Needleleaf forest | 85.25% | 89.66% | 0.8913 |
| | Non-irrigated arable land | 92.12% | 91.37% | 0.8782 |
| | Permanent freshwater lakes | 100.00% | 100.00% | 1.0000 |
| | Permanently irrigated land | 89.02% | 86.90% | 0.8599 |
| | High reeds | 75.86% | 84.62% | 0.8425 |
| | Shrubs | 89.82% | 89.12% | 0.8437 |
| | Overall accuracy | 89.68% | | 0.8684 |
| SVM | Artificial surfaces | 85.87% | 96.34% | 0.9605 |
| | Broadleaf forest | 91.53% | 77.58% | 0.7362 |
| | Needleleaf forest | 63.93% | 90.70% | 0.9022 |
| | Non-irrigated arable land | 97.01% | 91.54% | 0.8805 |
| | Permanent freshwater lakes | 100.00% | 100.00% | 1.0000 |
| | Permanently irrigated land | 89.02% | 93.59% | 0.9314 |
| | High reeds | 0.00% | 0.00% | -0.0236 |
| | Shrubs | 87.99% | 87.08% | 0.8144 |
| | Overall accuracy | 88.41% | | 0.8509 |
Table 6. Landsat-8 scenario 2 classification accuracy metrics.

| Classifier | Classes | PA | UA | KC |
| --- | --- | --- | --- | --- |
| ML | Artificial surfaces | 94.57% | 82.08% | 0.8066 |
| | Broadleaf forest | 84.13% | 91.91% | 0.9048 |
| | Needleleaf forest | 95.08% | 73.42% | 0.7207 |
| | Non-irrigated arable land | 90.22% | 96.51% | 0.9507 |
| | Permanent freshwater lakes | 98.21% | 100.00% | 1.0000 |
| | Permanently irrigated land | 95.12% | 92.86% | 0.9236 |
| | Reeds | 96.55% | 73.68% | 0.7306 |
| | Shrubs | 91.12% | 91.60% | 0.8793 |
| | Overall accuracy | 90.95% | | 0.8857 |
| KNN | Artificial surfaces | 82.61% | 91.57% | 0.9090 |
| | Broadleaf forest | 89.42% | 80.09% | 0.7658 |
| | Needleleaf forest | 83.61% | 83.61% | 0.8277 |
| | Non-irrigated arable land | 91.03% | 93.31% | 0.9056 |
| | Permanent freshwater lakes | 100.00% | 100.00% | 1.0000 |
| | Permanently irrigated land | 87.80% | 92.31% | 0.9177 |
| | Reeds | 31.03% | 100.00% | 1.0000 |
| | Shrubs | 92.69% | 88.09% | 0.8289 |
| | Overall accuracy | 89.13% | | 0.8608 |
| RF | Artificial surfaces | 82.61% | 92.68% | 0.9211 |
| | Broadleaf forest | 88.89% | 87.05% | 0.8476 |
| | Needleleaf forest | 85.25% | 89.66% | 0.8913 |
| | Non-irrigated arable land | 93.21% | 91.47% | 0.8795 |
| | Permanent freshwater lakes | 100.00% | 100.00% | 1.0000 |
| | Permanently irrigated land | 89.02% | 86.90% | 0.8599 |
| | Reeds | 72.41% | 87.50% | 0.8721 |
| | Shrubs | 91.38% | 90.21% | 0.8593 |
| | Overall accuracy | 90.40% | | 0.8773 |
| SVM | Artificial surfaces | 85.87% | 95.18% | 0.9480 |
| | Broadleaf forest | 92.06% | 84.47% | 0.8172 |
| | Needleleaf forest | 73.77% | 88.24% | 0.8764 |
| | Non-irrigated arable land | 96.74% | 91.75% | 0.8835 |
| | Permanent freshwater lakes | 100.00% | 100.00% | 1.0000 |
| | Permanently irrigated land | 90.24% | 93.67% | 0.9323 |
| | Reeds | 0.00% | 0.00% | -0.0236 |
| | Shrubs | 89.30% | 86.36% | 0.8041 |
| | Overall accuracy | 89.37% | | 0.8632 |
Table 7. Landsat-8 scenario 3 classification accuracy metrics.

| Classifier | Classes | PA | UA | KC |
| --- | --- | --- | --- | --- |
| ML | Artificial surfaces | 96.74% | 80.18% | 0.7862 |
| | Broadleaf forest | 83.07% | 92.90% | 0.9165 |
| | Needleleaf forest | 98.36% | 66.67% | 0.6497 |
| | Non-irrigated arable land | 93.75% | 98.57% | 0.9798 |
| | Permanent freshwater lakes | 98.21% | 100.00% | 1.0000 |
| | Permanently irrigated land | 98.78% | 94.19% | 0.9378 |
| | High reeds | 100.00% | 74.36% | 0.7375 |
| | Shrubs | 88.77% | 94.44% | 0.9202 |
| | Overall accuracy | 91.75% | | 0.8962 |
| KNN | Artificial surfaces | 93.48% | 92.47% | 0.9188 |
| | Broadleaf forest | 91.53% | 80.84% | 0.7746 |
| | Needleleaf forest | 78.69% | 96.00% | 0.9580 |
| | Non-irrigated arable land | 94.57% | 96.67% | 0.9529 |
| | Permanent freshwater lakes | 100.00% | 100.00% | 1.0000 |
| | Permanently irrigated land | 95.12% | 96.30% | 0.9604 |
| | High reeds | 68.97% | 86.96% | 0.8665 |
| | Shrubs | 91.12% | 91.12% | 0.8725 |
| | Overall accuracy | 91.90% | | 0.8968 |
| RF | Artificial surfaces | 78.26% | 90.00% | 0.8921 |
| | Broadleaf forest | 87.30% | 89.19% | 0.8728 |
| | Needleleaf forest | 85.25% | 92.86% | 0.9249 |
| | Non-irrigated arable land | 95.38% | 87.97% | 0.8301 |
| | Permanent freshwater lakes | 100.00% | 98.25% | 0.9816 |
| | Permanently irrigated land | 96.34% | 87.78% | 0.8693 |
| | High reeds | 65.52% | 79.17% | 0.7868 |
| | Shrubs | 87.99% | 91.33% | 0.8754 |
| | Overall accuracy | 89.76% | | 0.8692 |
| SVM | Artificial surfaces | 82.61% | 93.83% | 0.9334 |
| | Broadleaf forest | 84.13% | 86.41% | 0.8402 |
| | Needleleaf forest | 78.69% | 92.31% | 0.9192 |
| | Non-irrigated arable land | 96.20% | 92.91% | 0.8999 |
| | Permanent freshwater lakes | 100.00% | 100.00% | 1.0000 |
| | Permanently irrigated land | 96.34% | 84.04% | 0.8293 |
| | High reeds | 79.31% | 88.46% | 0.8819 |
| | Shrubs | 90.34% | 89.64% | 0.8511 |
| | Overall accuracy | 90.56% | | 0.8793 |
Table 8. Landsat-8 scenario 4 classification accuracy metrics.

| Classifier | Classes | PA | UA | KC |
| --- | --- | --- | --- | --- |
| ML | Artificial surfaces | 100.00% | 80.70% | 0.7918 |
| | Broadleaf forest | 83.60% | 94.61% | 0.9366 |
| | Needleleaf forest | 96.72% | 72.84% | 0.7146 |
| | Non-irrigated arable land | 92.66% | 98.27% | 0.9756 |
| | Permanent freshwater lakes | 98.21% | 100.00% | 1.0000 |
| | Permanently irrigated land | 98.78% | 92.05% | 0.9149 |
| | High reeds | 100.00% | 78.38% | 0.7787 |
| | Shrubs | 91.38% | 94.34% | 0.9187 |
| | Overall accuracy | 92.46% | | 0.9050 |
| KNN | Artificial surfaces | 95.65% | 93.62% | 0.9311 |
| | Broadleaf forest | 91.01% | 84.73% | 0.8203 |
| | Needleleaf forest | 80.33% | 96.08% | 0.9588 |
| | Non-irrigated arable land | 95.11% | 96.69% | 0.9532 |
| | Permanent freshwater lakes | 100.00% | 100.00% | 1.0000 |
| | Permanently irrigated land | 95.12% | 95.12% | 0.9478 |
| | High reeds | 82.76% | 100.00% | 1.0000 |
| | Shrubs | 92.69% | 91.49% | 0.8778 |
| | Overall accuracy | 93.02% | | 0.9109 |
| RF | Artificial surfaces | 78.26% | 93.51% | 0.9300 |
| | Broadleaf forest | 87.83% | 90.71% | 0.8907 |
| | Needleleaf forest | 85.25% | 91.23% | 0.9078 |
| | Non-irrigated arable land | 96.20% | 90.77% | 0.8696 |
| | Permanent freshwater lakes | 100.00% | 100.00% | 1.0000 |
| | Permanently irrigated land | 97.56% | 89.89% | 0.8918 |
| | High reeds | 93.10% | 90.00% | 0.8976 |
| | Shrubs | 90.86% | 92.06% | 0.8860 |
| | Overall accuracy | 91.67% | | 0.8936 |
| SVM | Artificial surfaces | 88.04% | 96.43% | 0.9615 |
| | Broadleaf forest | 85.71% | 90.50% | 0.8883 |
| | Needleleaf forest | 81.97% | 92.59% | 0.9222 |
| | Non-irrigated arable land | 96.20% | 94.15% | 0.9174 |
| | Permanent freshwater lakes | 100.00% | 100.00% | 1.0000 |
| | Permanently irrigated land | 97.56% | 90.91% | 0.9028 |
| | High reeds | 96.55% | 87.50% | 0.8721 |
| | Shrubs | 91.91% | 90.03% | 0.8567 |
| | Overall accuracy | 92.30% | | 0.9017 |
Table 9. Sentinel-2a scenario 1 classification accuracy metrics.

| Classifier | Classes | PA | UA | KC |
| --- | --- | --- | --- | --- |
| ML | Artificial surfaces | 92.68% | 88.37% | 0.8773 |
| | Broadleaf forest | 94.69% | 99.07% | 0.9892 |
| | Needleleaf forest | 97.50% | 79.59% | 0.7849 |
| | Non-irrigated arable land | 89.76% | 97.85% | 0.9682 |
| | Water bodies | 96.30% | 100.00% | 1.0000 |
| | Permanently irrigated land | 100.00% | 93.75% | 0.9337 |
| | High reeds | 94.12% | 94.12% | 0.9399 |
| | Roads | 100.00% | 79.31% | 0.7868 |
| | Shrubs | 93.85% | 90.15% | 0.8687 |
| | Overall accuracy | 93.48% | | 0.9188 |
| KNN | Artificial surfaces | 75.61% | 100.00% | 1.0000 |
| | Broadleaf forest | 96.46% | 95.61% | 0.9487 |
| | Needleleaf forest | 95.00% | 84.44% | 0.8361 |
| | Non-irrigated arable land | 96.06% | 92.78% | 0.8930 |
| | Water bodies | 100.00% | 100.00% | 1.0000 |
| | Permanently irrigated land | 97.78% | 93.62% | 0.9323 |
| | High reeds | 52.94% | 81.82% | 0.8141 |
| | Roads | 86.96% | 86.96% | 0.8656 |
| | Shrubs | 93.33% | 93.81% | 0.9176 |
| | Overall accuracy | 93.48% | | 0.9178 |
| RF | Artificial surfaces | 82.93% | 89.47% | 0.8889 |
| | Broadleaf forest | 97.35% | 94.83% | 0.9395 |
| | Needleleaf forest | 97.50% | 95.12% | 0.9486 |
| | Non-irrigated arable land | 96.06% | 91.73% | 0.8775 |
| | Water bodies | 100.00% | 100.00% | 1.0000 |
| | Permanently irrigated land | 91.11% | 93.18% | 0.9277 |
| | High reeds | 94.12% | 94.12% | 0.9399 |
| | Roads | 73.91% | 94.44% | 0.9428 |
| | Shrubs | 91.79% | 95.21% | 0.9362 |
| | Overall accuracy | 93.86% | | 0.9227 |
| SVM | Artificial surfaces | 92.68% | 84.44% | 0.8358 |
| | Broadleaf forest | 97.35% | 84.62% | 0.8202 |
| | Needleleaf forest | 75.00% | 78.95% | 0.7781 |
| | Non-irrigated arable land | 94.49% | 93.02% | 0.8967 |
| | Water bodies | 98.15% | 98.15% | 0.9801 |
| | Permanently irrigated land | 86.67% | 97.50% | 0.9735 |
| | High reeds | 58.82% | 90.91% | 0.9071 |
| | Roads | 65.22% | 93.75% | 0.9356 |
| | Shrubs | 89.23% | 91.58% | 0.8878 |
| | Overall accuracy | 90.66% | | 0.8824 |
Table 10. Sentinel-2a scenario 2 classification accuracy metrics.

| Classifier | Classes | PA | UA | KC |
| --- | --- | --- | --- | --- |
| ML | Artificial surfaces | 95.12% | 92.86% | 0.9246 |
| | Broadleaf forest | 98.23% | 98.23% | 0.9793 |
| | Needleleaf forest | 95.00% | 88.37% | 0.8775 |
| | Non-irrigated arable land | 97.24% | 99.20% | 0.9881 |
| | Water bodies | 96.30% | 100.00% | 1.0000 |
| | Permanently irrigated land | 100.00% | 97.83% | 0.9769 |
| | High reeds | 94.12% | 80.00% | 0.7956 |
| | Roads | 100.00% | 92.00% | 0.9176 |
| | Shrubs | 94.87% | 96.35% | 0.9514 |
| | Overall accuracy | 96.68% | | 0.9584 |
| KNN | Artificial surfaces | 80.49% | 97.06% | 0.9690 |
| | Broadleaf forest | 94.69% | 88.43% | 0.8648 |
| | Needleleaf forest | 95.00% | 92.68% | 0.9229 |
| | Non-irrigated arable land | 96.06% | 93.13% | 0.8982 |
| | Water bodies | 100.00% | 100.00% | 1.0000 |
| | Permanently irrigated land | 88.89% | 83.33% | 0.8232 |
| | High reeds | 94.12% | 88.89% | 0.8864 |
| | Roads | 86.96% | 90.91% | 0.9063 |
| | Shrubs | 88.72% | 95.05% | 0.9341 |
| | Overall accuracy | 92.71% | | 0.9085 |
| RF | Artificial surfaces | 90.24% | 82.22% | 0.8124 |
| | Broadleaf forest | 93.81% | 90.60% | 0.8901 |
| | Needleleaf forest | 95.00% | 82.61% | 0.8167 |
| | Non-irrigated arable land | 97.24% | 90.81% | 0.8639 |
| | Water bodies | 100.00% | 100.00% | 1.0000 |
| | Permanently irrigated land | 91.11% | 83.67% | 0.8268 |
| | High reeds | 88.24% | 93.75% | 0.9361 |
| | Roads | 65.22% | 100.00% | 1.0000 |
| | Shrubs | 84.62% | 98.21% | 0.9762 |
| | Overall accuracy | 91.82% | | 0.8972 |
| SVM | Artificial surfaces | 90.24% | 84.09% | 0.8321 |
| | Broadleaf forest | 95.58% | 86.40% | 0.8410 |
| | Needleleaf forest | 92.50% | 94.87% | 0.9460 |
| | Non-irrigated arable land | 96.06% | 95.69% | 0.9361 |
| | Water bodies | 98.15% | 98.15% | 0.9801 |
| | Permanently irrigated land | 95.56% | 93.48% | 0.9308 |
| | High reeds | 88.24% | 83.33% | 0.8296 |
| | Roads | 73.91% | 80.95% | 0.8038 |
| | Shrubs | 88.21% | 95.56% | 0.9408 |
| | Overall accuracy | 92.84% | | 0.9103 |
Table 11. Overall classification accuracy per scenario.

| Sensor | Classification scenario (2018) | Bands / Indices | Seasons | Type of classifier | OA | KC |
| --- | --- | --- | --- | --- | --- | --- |
| Landsat-8 OLI | Scenario 1 | 6 bands (B2-B7) | S | ML | 91.67% | 0.8946 |
| | | | | KNN | 89.05% | 0.8598 |
| | | | | RF | 89.68% | 0.8684 |
| | | | | SVM | 88.41% | 0.8509 |
| | Scenario 2 | 6 bands (B2-B7) + EVI + NDMI | S | ML | 90.95% | 0.8857 |
| | | | | KNN | 89.13% | 0.8608 |
| | | | | RF | 90.40% | 0.8773 |
| | | | | SVM | 89.37% | 0.8632 |
| | Scenario 3 | 12 bands (B2-B7) | S+W | ML | 91.75% | 0.8962 |
| | | | | KNN | 91.90% | 0.8968 |
| | | | | RF | 89.76% | 0.8692 |
| | | | | SVM | 90.56% | 0.8793 |
| | Scenario 4 | 12 bands (B2-B7) + EVI + NDMI | S+W | ML | 92.46% | 0.9050 |
| | | | | KNN | 93.02% | 0.9109 |
| | | | | RF | 91.67% | 0.8936 |
| | | | | SVM | 92.30% | 0.9017 |
| Sentinel-2A | Scenario 1 | 10 bands (2-8, 8A, 11-12) | S | ML | 93.48% | 0.9188 |
| | | | | KNN | 93.48% | 0.9178 |
| | | | | RF | 93.86% | 0.9227 |
| | | | | SVM | 90.66% | 0.8824 |
| | Scenario 2 | 10 bands (2-8, 8A, 11-12) | S+W | ML | 96.68% | 0.9584 |
| | | | | KNN | 92.71% | 0.9085 |
| | | | | RF | 91.82% | 0.8972 |
| | | | | SVM | 92.84% | 0.9103 |
Table 12. Confusion matrix for the best scenario 2 (Sentinel-2a ML classification).

| | Artificial surfaces | Broadleaf forest | Needleleaf forest | Non-irrigated arable land | Permanent freshwater lakes | Permanently irrigated land | High reeds | Roads | Shrubs | Total | PA (%) | UA (%) | KC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Artificial surfaces | 39 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 42 | 95.12 | 92.86 | 0.9246 |
| Broadleaf forest | 0 | 111 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 113 | 98.23 | 98.23 | 0.9793 |
| Needleleaf forest | 0 | 2 | 38 | 0 | 0 | 0 | 0 | 0 | 3 | 43 | 95.00 | 88.37 | 0.8775 |
| Non-irrigated arable land | 0 | 0 | 0 | 247 | 1 | 0 | 0 | 0 | 1 | 249 | 97.24 | 99.20 | 0.9881 |
| Permanent freshwater lakes | 0 | 0 | 0 | 0 | 52 | 0 | 0 | 0 | 0 | 52 | 96.30 | 100.00 | 1.0000 |
| Permanently irrigated land | 0 | 0 | 0 | 1 | 0 | 45 | 0 | 0 | 0 | 46 | 100.00 | 97.83 | 0.9769 |
| High reeds | 0 | 0 | 0 | 0 | 0 | 0 | 16 | 0 | 4 | 20 | 94.12 | 80.00 | 0.7956 |
| Roads | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 23 | 0 | 25 | 100.00 | 92.00 | 0.9176 |
| Shrubs | 0 | 0 | 2 | 5 | 0 | 0 | 0 | 0 | 185 | 192 | 94.87 | 96.35 | 0.9514 |
| Total | 41 | 113 | 40 | 254 | 54 | 45 | 17 | 23 | 195 | 782 | 96.68 (OA) | | 0.9584 |
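The accuracy columns of Table 12 can be recomputed from the raw counts. The following is a minimal sketch, not the authors' processing chain: it assumes that rows hold the mapped (classified) classes and columns the reference classes, which is the orientation that reproduces the reported PA and UA values, and it assumes that the per-class KC is the conditional (user's) kappa. The function name `accuracy_metrics` is hypothetical.

```python
import numpy as np

# Counts copied from Table 12 (Sentinel-2a, scenario 2, ML).
# Assumed orientation: rows = mapped class, columns = reference class.
classes = [
    "Artificial surfaces", "Broadleaf forest", "Needleleaf forest",
    "Non-irrigated arable land", "Permanent freshwater lakes",
    "Permanently irrigated land", "High reeds", "Roads", "Shrubs",
]
cm = np.array([
    [39,   0,  0,   1,  1,  0,  1,  0,   0],
    [ 0, 111,  0,   0,  0,  0,  0,  0,   2],
    [ 0,   2, 38,   0,  0,  0,  0,  0,   3],
    [ 0,   0,  0, 247,  1,  0,  0,  0,   1],
    [ 0,   0,  0,   0, 52,  0,  0,  0,   0],
    [ 0,   0,  0,   1,  0, 45,  0,  0,   0],
    [ 0,   0,  0,   0,  0,  0, 16,  0,   4],
    [ 2,   0,  0,   0,  0,  0,  0, 23,   0],
    [ 0,   0,  2,   5,  0,  0,  0,  0, 185],
])


def accuracy_metrics(matrix):
    """Return PA, UA, OA, overall kappa and per-class conditional kappa."""
    m = np.asarray(matrix, dtype=float)
    n = m.sum()                      # total number of control points
    diag = np.diag(m)                # correctly classified points per class
    row_tot = m.sum(axis=1)          # points mapped to each class
    col_tot = m.sum(axis=0)          # reference points of each class

    pa = diag / col_tot              # producer's accuracy
    ua = diag / row_tot              # user's accuracy
    oa = diag.sum() / n              # overall accuracy

    # Overall Cohen's kappa: observed vs. chance agreement.
    pe = (row_tot * col_tot).sum() / n**2
    kappa = (oa - pe) / (1.0 - pe)

    # Conditional kappa per mapped class (assumed definition of KC).
    ref_share = col_tot / n
    cond_kappa = (ua - ref_share) / (1.0 - ref_share)
    return pa, ua, oa, kappa, cond_kappa


pa, ua, oa, kappa, cond_kappa = accuracy_metrics(cm)
for name, p, u, k in zip(classes, pa, ua, cond_kappa):
    print(f"{name:28s} PA={p:7.2%} UA={u:7.2%} KC={k:.4f}")
print(f"Overall accuracy={oa:.2%} kappa={kappa:.4f}")
```

With the Table 12 counts, this gives an overall accuracy of about 96.68% and a kappa of about 0.9584, in line with the totals reported in the table.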
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.