Comparison of Pixel-Based Classification Algorithms Using Landsat-8 OLI and Sentinel-2 MSI for Land Use/Land Cover Mapping in a Heterogeneous Landscape

Preprint

Article

Moschos Vogiatzis *, Ioannis Eleftheriadis

This version is not peer-reviewed

Submitted: 14 July 2023

Posted: 17 July 2023

Abstract

Satellite-based data classification performance remains a challenge for the research community in the field of land use/land cover mapping. Here we investigated the performance of supervised per-pixel classification under different scenarios, based on single-date and seasonal multispectral data combinations from two sensors (Landsat-8 OLI and Sentinel-2 MSI). For Landsat, seasonal spectral indices (EVI and NDMI) were also included. A typical Mediterranean watershed with a complex landscape comprising various forest and wetland ecosystems, crops, artificial surfaces, and lake water was selected to test our approach. All available geospatial data from national databases (Forest Map, LPIS, Natura2000 habitats, cadastral parcels, etc.) were used as ancillary data for classification training and validation. We examined and compared the performance of the maximum likelihood (ML), random forest (RF), k-nearest neighbors (KNN) and support vector machine (SVM) classifiers under these scenarios for land use/land cover mapping, according to the Copernicus Land Cover nomenclature. In total, eight land use/land cover classes were identified with Landsat-8 OLI and nine with Sentinel-2 MSI at an acceptable overall accuracy above 85%. A comparison of the overall classification accuracies shows that the best Sentinel-2 result was higher than the best Landsat-8 result (96.68% vs. 93.02%). The best-performing algorithm was ML for Sentinel-2 and KNN for Landsat-8. However, the machine-learning algorithms produced similar results regardless of the type of sensor. We conclude that the best classification performance is achieved using seasonal multispectral data. Future research should be oriented towards integrating time-series multispectral data from different sensors and geospatial ancillary data for land use/land cover mapping.

Keywords:

Subject: Environmental and Earth Sciences - Remote Sensing

1. Introduction

Land cover represents the characteristics of the earth's surface as shaped by natural agents or anthropogenic interventions. From an earth-observation perspective, the term “land cover” defines the land types (i.e. vegetation, water bodies, crops, built-up areas, etc.) that can be detected from a distance. Land cover is a critical variable for earth surface studies since it can change over time [1]. In contrast, the term “land use” refers to the way a particular piece of land is used, including the economic purpose associated with that use [2]. The two concepts are interrelated. For example, a land cover type such as a forest may support a series of land uses (e.g. timber production, recreation, rangeland, etc.), while a land use such as agroforestry may include a series of land cover types (e.g. forests, plantations, annual crops, etc.). In this research, the two terms are used in a complementary way to describe all existing land cover or land use types within the study area.

Land use/land cover information is essential for the management and monitoring of natural resources, modeling, spatial planning, land administration and sound decision-making. Satellite-based classification provides spatially explicit land use/land cover information and supports map generation at global, national or regional scales. However, medium-resolution imagery (Landsat-like, 10–30 m) is better suited to detecting most human–nature interactions [3]. The opening of the Landsat archive in 2008 [4] and the launch of Sentinel-2 in 2015 provided optical multispectral imagery at medium-high resolution. The free and open access policy for this imagery has promoted the development of new products and applications across space and time, especially in the domain of land use/land cover mapping. This data policy, combined with the increase in computing power and the concurrent reduction in costs, has facilitated large-area mapping and expanded the number of users worldwide [5].

Due to the complexity of land use/land cover characterization, several studies have focused mainly on methods for mapping a single land cover type (i.e. forests, wetlands, forest fires, agriculture, urban areas or water) using either Landsat or Sentinel imagery, for example [6,7] for forests, [8] for urban areas, [9,10] for croplands and [11,12] for wetlands. However, multiple-class characterization is required for simultaneous and spatially exhaustive mapping [13]. Thus, effective and efficient satellite image classification methods are needed to provide meaningful information on all land use/land cover types within a specific area.

A variety of classification approaches (unsupervised, supervised, parametric, non-parametric, object-oriented) have been developed and applied to derive land cover information with varying degrees of success. Per-pixel classification approaches remain the most popular in the analysis of satellite-derived imagery [14]. Here, we used supervised per-pixel classification to map multiple land use/land cover types.

In supervised approaches, reference data are required to characterize the variability of land cover across space and time and to serve as the dataset for training and validating classification models. Suitable reference data are a fundamental requirement in supervised image classification [15]. We use existing authoritative geospatial datasets of higher accuracy as a pool for training and validation. The reference datasets span forestlands, cultivated fields, discontinuous urban fabric, built-up areas, and wetland habitat types. The classification scheme of land cover classes is based on the Copernicus Land Cover (CLC) nomenclature [16]. Based on the CLC2018 land use/land cover distribution, a stratified random sampling scheme is deployed to train the classifiers and assess classification accuracy. Classification accuracy depends on the satellite imagery, the classification algorithm used, and the nature of the training data [17].

Four popular classifiers, maximum likelihood (ML), random forest (RF), k-nearest neighbors (KNN) and support vector machine (SVM), were selected, and their implementations in Erdas Imagine 2020 were used to run the experiments. Descriptions and analyses of these classifiers can be found in the literature, for example [18] for Bayesian classifiers, [19] for SVM, [20] for KNN and [21] for random forests.

The maximum likelihood method is included in our research because of its wide application and use in commercial image-processing software [22]. The three machine learning algorithms, in turn, have gained great attention for classifying land use/land cover types in the last decade.

In a recent evaluation on 26 Landsat TM image blocks of 10 km × 10 km, the classifiers tested, including SVM and KNN, performed similarly in per-pixel classification, with the exception of Naïve Bayes (a maximum likelihood variant) [23]. In a peri-urban and rural area of Vietnam with heterogeneous land cover, SVM produced the highest overall accuracy (OA) using Sentinel-2 MSI, followed by RF and KNN. However, all three classifiers showed similarly high OA (over 93.85%) when the training sample size was large enough (>750 pixels/class) [24].

In this study, we test the above-mentioned classifiers to derive land use/land cover information and explore classification performance under six different scenarios. The study investigates the performance of the classifiers using Landsat-8 OLI and Sentinel-2 MSI across a heterogeneous Mediterranean watershed, based on the same land cover reference, training and validation data per sensor. For Landsat, spectral indices (EVI and NDMI) were included, since these indices have been reported to improve land use/land cover classification accuracy [25]. We evaluate classification performance for an area with a complex landscape and investigate how single-date and seasonal optical multispectral data affect land use/land cover classification accuracy.

2. Study Area

The test site is the Mygdonia basin in Northern Greece. Its watershed covers an area of 190,285 ha. It lies east of the city of Thessaloniki (40°40′56.49″N, 23°18′21.15″E, WGS84) at a distance of 41.6 km (Figure 1). The watershed is surrounded by mountains in the north (Mount Krousia) and in the south (Mount Cholomontas), by hills in the west, and by the Rentina Gorge and Mount Kerdylia in the east. Its elevation ranges from 35 m to 1,129 m. At the center of the watershed there are two lakes (Koronia and Volvi), into which the watershed drains through seasonal and intermittent streams.

The two lakes, along with their surrounding wetlands, have been listed as a Wetland of International Importance under the Ramsar Convention since 1975 (GR005: 16,388 ha). Together with the valley of the Rentina Gorge, they were designated as Special Conservation Zones within the Natura2000 network (GR1220001 and GR1220003: 28,734.90 ha) in 2017. These protected sites constitute a unique complex of interconnected natural ecosystems of lakes, seasonal streams, channels, riparian forests, shrubs, wet meadows and fields.

At the southernmost end of the watershed, on the slopes of Mount Cholomontas, lies a portion of another protected area (GR1270001). It has an area of 15,651.14 ha and is dominated by beech, oak and pine forests.

Non-irrigated arable lands are distributed across the watershed up to the productive forests in the south. Intensively cultivated lands, mainly irrigated, surround the wetland ecosystem. According to CLC2018, 49.28% of the watershed is under agricultural use, while forestlands occupy 42.04%, water 5.46%, discontinuous urban fabric 1.31%, wetlands 1.17%, and developed areas only 0.33%. Most of the land under agricultural use is cropland (93,770 ha), while perennial crops such as fruit and olive tree plantations and vineyards account for only 0.65%. Irrigated lands cover 14.27% and non-irrigated lands 50.97% of croplands. Approximately 30.98% of forestlands are broadleaf forests, 2.73% are pine forests, 12.04% are mixed forests, 36.80% are shrubs and 10.10% are transitional woodlands (Figure 2).

The climate is considered temperate (Csa, Mediterranean mainland) with warm, dry summers and cool winters. The mean temperature is 22.6 °C in summer and about 4 °C in winter. The mean annual rainfall is 593 mm according to the records of the last 40 years.

3. Materials and Methods

3.1. Satellite Imagery

Landsat-8 Operational Land Imager (OLI) surface reflectance (C2L2) data were obtained from the United States Geological Survey website [26]. Two scenes (path/row: 183/032 and 184/032) were required to cover the entire study area. Following a search, scenes with low cloud cover (<10%) were selected for the summer and winter seasons. Acquisition dates were 01 July 2018 and 22 June 2018 for the dry season, and 28 January 2020 and 17 February 2019 for the winter season. A mosaic containing six bands (blue, green, red, near infrared (NIR), and shortwave infrared (SWIR 1 and SWIR 2)) was created at the study area limits.

Sentinel-2 (L2A) MSI imagery was downloaded from the Sentinels Scientific Data Hub [27]. Each product consists of orthorectified granules, or tiles, of 100 km × 100 km. Four granules with low cloud cover (<10%), sensed in summer 2018, were required to cover the entire study area (Table 1).

The 13 spectral bands of Sentinel-2A span the visible to SWIR spectrum at 10 m, 20 m and 60 m spatial resolutions. The bands at 60 m spatial resolution are dedicated primarily to detecting atmospheric features and were therefore excluded from the analysis [28]. A mosaic of ten bands (2–8, 8a, 11 and 12) was created at the watershed limits. Nearest neighbor interpolation was employed to resample the 20 m bands to 10 m. This process has been shown to perform very satisfactorily compared with other approaches [29]. Both the Landsat-8 OLI and Sentinel-2A image scenes are spatially registered to the Universal Transverse Mercator (UTM)/World Geodetic System 1984 (WGS84) projection.
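The nearest neighbor step can be illustrated with a minimal sketch. It assumes the 20 m and 10 m grids are aligned so that each 20 m cell maps exactly onto a 2 × 2 block of 10 m cells; the function name and the toy values are hypothetical, and this is not the routine used by the image-processing software.

```python
def upsample_nearest(band, factor=2):
    """Nearest-neighbor upsampling of a 2-D grid by an integer factor.

    Assumes aligned grids, so every source cell is simply copied into a
    factor x factor block of output cells (no new values are invented).
    """
    out = []
    for row in band:
        # Repeat each value `factor` times along the row ...
        wide = [value for value in row for _ in range(factor)]
        # ... then repeat the widened row `factor` times down the column.
        out.extend(list(wide) for _ in range(factor))
    return out


# Hypothetical 2 x 2 patch of a 20 m band (reflectance values).
band_20m = [[0.10, 0.20],
            [0.30, 0.40]]
band_10m = upsample_nearest(band_20m, factor=2)
```

Because values are copied rather than averaged, the original reflectances are preserved, which is one reason this method is often preferred before classification.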

3.2. Land Cover Reference Data

A series of land cover reference datasets was retrieved from existing national databases (Table 2).

Land use/land cover types in 2018 for the entire watershed were obtained from the European Copernicus Programme (CORINE Land Cover product, CLC2018). CORINE Land Cover (CLC) provides harmonized and comprehensive maps of land cover and land use change at European level [30]. The program was established by the European Commission in 1990 to facilitate policy making at European level. The most recent product, CLC2018, comprises 44 thematic classes at the third level, with a minimum mapping unit (MMU) of 25 ha for areal features and 5 ha for changes. It is an excellent tool for strategic analysis and planning at European level. However, CLC’s thematic content comprises a mixture of land cover and land use classes. In addition, its MMU serves the needs of the European Union well but is not suited to detailed national or local land use/land cover mapping [31].

Information on plantations and vineyards, either irrigated or non-irrigated, was retrieved from the Land Parcel Identification System (LPIS). However, these data refer only to parcels for which farmers have made individual subsidy claims and receive European Union aid [32]. Therefore, they do not represent all cultivated fields within the watershed.

Information on habitat types was acquired from the national large-scale Natura2000 database. We retrieved spatially explicit information on the habitat types and the vegetation species dominating the wetland.

Forestlands were retrieved from the Forest Map national program. The Forest Map is a very-high-resolution map at a scale of 1:5,000 depicting forests and non-forests according to the current legislative framework of Greece [33]. Furthermore, we obtained the available forest management plans from the Hellenic Forest Service to retrieve information on (co)dominant forest species at the stand level and on land use types within managed forests. However, forestlands within the available plans cover only 26,857 ha (28% of forestlands, according to the Forest Map). Forestlands outside the plans are mainly unmanaged, vary in structure and crown cover, are distributed across the watershed, and comprise degraded broadleaf forests (mixed or not), evergreen shrubs, and reforested pine forests.

3.3. Sampling Design

A stratified random sampling design was adopted. The Copernicus CLC2018 product was used as the basis for distributing sampling units across all identified classes. Based on these classes, the sample size was estimated at 2,356 for a required thematic accuracy of 85%. The samples were distributed randomly, in proportion to CLC2018 class area (Table 3).
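Proportional allocation of a fixed sample size can be sketched as follows. This is only an illustration: the strata are the aggregated CLC2018 shares quoted in the Study Area section, not the actual strata of Table 3, and the largest-remainder rounding rule and the function name are assumptions rather than the authors' procedure.

```python
import math


def allocate_proportional(shares, n_total):
    """Split n_total samples across strata in proportion to their area share.

    Uses largest-remainder rounding (an assumption) so the integer
    allocations always sum exactly to n_total.
    """
    total = sum(shares.values())
    quotas = {name: n_total * share / total for name, share in shares.items()}
    alloc = {name: math.floor(q) for name, q in quotas.items()}
    leftover = n_total - sum(alloc.values())
    # Give the remaining samples to the strata with the largest fractional parts.
    by_remainder = sorted(quotas, key=lambda name: quotas[name] - alloc[name],
                          reverse=True)
    for name in by_remainder[:leftover]:
        alloc[name] += 1
    return alloc


# Aggregated CLC2018 area shares (%) as quoted in the Study Area section.
clc_share = {
    "agricultural": 49.28,
    "forest": 42.04,
    "water": 5.46,
    "discontinuous urban fabric": 1.31,
    "wetlands": 1.17,
    "developed": 0.33,
}
allocation = allocate_proportional(clc_share, 2356)
```

With purely proportional allocation the rarest strata receive very few samples, which is consistent with the later decision to merge rare and small-sized classes.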

3.4. Training Data

Based on the above sample distribution, sampling plots were defined at the pixel spatial resolution (30x30m) of Landsat imagery. Each random point was located at the center of the respective pixel using a gridded fishnet on Landsat-8. Each plot was divided into 3x3 pixels to coincide with Sentinel-2a spatial resolution (10m).
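The relation between one 30 m plot and its nine 10 m sub-pixels can be sketched as below. The plot centre coordinates are hypothetical UTM values chosen for illustration, and the function name is an assumption.

```python
def subpixel_centres(x, y, plot_size=30.0, n=3):
    """Return the centres of the n x n sub-pixels of a square plot.

    (x, y) is the plot centre in projected coordinates (metres).
    Centres are listed row by row, from north to south and west to east.
    """
    step = plot_size / n
    offsets = [(i - (n - 1) / 2) * step for i in range(n)]  # -10, 0, +10 for 30 m / 3
    return [(x + dx, y + dy) for dy in reversed(offsets) for dx in offsets]


# Hypothetical centre of one 30 m Landsat-8 pixel (UTM easting, northing).
centres = subpixel_centres(500015.0, 4500015.0)
```

The middle element of the returned list is the plot centre itself, so the same random point serves both sensors.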

Land cover reference data were processed to generate the following seven thematic datasets with the highest possible accuracy. Based on the Forest Map, we excluded non-forest areas and created a dataset exclusively for forests. In areas where the forest dataset overlapped with forest management plans, we extracted information on forests (broadleaf, needleleaf or shrubs) at the stand level, based on the dominant species. In wetlands, we excluded forestlands based on the above forest dataset and then created a natural habitats dataset (inland marshes, shrubs, wet meadows and high reeds), excluding all other land use/land cover types, based on the unique 4-digit codes of the Natura2000 database. Discontinuous urban fabric areas (small towns and villages) were extracted from the urban zones provided by the Forest Map. A dataset of roads and built-up areas was generated by processing the cadastral database. Plantation trees (olive, fruit and forest trees) and vineyards were extracted from the LPIS database. The last dataset consists of crops, either irrigated or non-irrigated.

In addition, Google Earth high-resolution (2019) imagery was used for visual interpretation of each plot, based on physiognomic attributes (color, shape, size, pattern and texture). This orthoimagery was the closest existing one to satellite imagery acquisition dates.

Based on the above interpretation, cross-referenced against each of the thematic datasets, each plot was assigned a unique land use/land cover type. Thus, a large, consistent database was generated for the selection of training data.

3.4. Classification

Four classification methods were applied, one parametric (ML) and three machine learning classifiers, KNN, RF and SVM. All procedures in this study were implemented using the Erdas Imagine 2020 commercial software.
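To make the per-pixel idea concrete, the following is a minimal sketch of k-nearest-neighbor classification of a single pixel spectrum. It is not the Erdas Imagine implementation; the value of k, the Euclidean distance, the three-band spectra and the class labels are all illustrative assumptions.

```python
import math
from collections import Counter


def knn_classify(pixel, training, k=3):
    """Label one pixel by majority vote among its k nearest training spectra.

    `training` is a list of (spectrum, label) pairs; distance is Euclidean
    in band space.
    """
    nearest = sorted(training, key=lambda item: math.dist(pixel, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training spectra (green, red, NIR reflectance).
training = [
    ((0.05, 0.03, 0.02), "water"),
    ((0.06, 0.04, 0.03), "water"),
    ((0.04, 0.03, 0.01), "water"),
    ((0.04, 0.05, 0.45), "broadleaf forest"),
    ((0.05, 0.06, 0.50), "broadleaf forest"),
    ((0.03, 0.04, 0.40), "broadleaf forest"),
]
label = knn_classify((0.05, 0.05, 0.42), training, k=3)
```

Each pixel is labeled independently of its neighbors, which is what distinguishes per-pixel classification from object-oriented approaches.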

We tested the utility of single-date (summer 2018) and combined summer-winter spectral bands of Landsat-8 OLI and Sentinel-2A MSI as input data, developing six different scenarios. Two of them use spectral indices (EVI and NDMI) together with single-date and seasonal Landsat-8 OLI spectral bands. EVI is sensitive to intra-annual variations in vegetation, while NDMI is sensitive to moisture content. Both were used to discriminate between different types of vegetation and irrigated fields. We acknowledge that many other combinations of spectral and temporal features or approaches could be used, but we decided to limit our analysis to the aforementioned features.
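The text does not give the index formulas, so the sketch below assumes the commonly used definitions: EVI with gain 2.5 and coefficients 6, 7.5 and 1, and NDMI as the normalized difference of NIR and SWIR 1. Inputs are assumed to be surface reflectance scaled to 0–1 (for Landsat-8 OLI: blue = band 2, red = band 4, NIR = band 5, SWIR 1 = band 6); the sample values are hypothetical.

```python
def evi(nir, red, blue):
    """Enhanced Vegetation Index (standard coefficients assumed)."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


def ndmi(nir, swir1):
    """Normalized Difference Moisture Index."""
    return (nir - swir1) / (nir + swir1)


# Hypothetical reflectances for a vegetated pixel.
pixel = {"blue": 0.05, "red": 0.10, "nir": 0.50, "swir1": 0.20}
evi_value = evi(pixel["nir"], pixel["red"], pixel["blue"])
ndmi_value = ndmi(pixel["nir"], pixel["swir1"])
```

In the index scenarios, such values would be stacked as extra feature layers alongside the spectral bands of the same date.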

In the initial phase, we ran numerous classification iterations with different combinations and numbers of land use/land cover classes on both types of imagery. However, classification performance was unacceptable (<85%) with more than ten classes in both types of imagery. Land cover classification accuracy is affected by the number of classes identified: overall classification accuracy decreases as the number of classes increases [34].

Therefore, we adopted a classification scheme of 9 classes for Sentinel-2A and 8 classes for Landsat-8 OLI (Table 4). Rare and small-sized classes were either grouped to form new classes or merged into existing, sufficiently large classes. For example, vineyards, fruit trees and olive trees were merged into non-irrigated arable land. Mineral extraction sites, discontinuous urban fabric, and industrial/commercial classes form a new class named “Artificial Surfaces”. Classes such as land principally occupied by agriculture and complex cultivation patterns were removed because they are generic land use/land cover types that include several other land types.

During the process, we selected 3,535 training polygons for Landsat-8 OLI and 2,753 for Sentinel-2A MSI classification. We defined the set of training polygons by randomly sampling 70% of the selected sample points. These data were then used to train the supervised classification algorithms. The remaining 30% of the samples were used for classification validation (Table 4).
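A 70/30 random split of sample identifiers can be sketched as follows. The sample IDs, the fixed seed and the function name are hypothetical, and the sketch performs a simple (not per-class) random split.

```python
import random


def split_samples(samples, train_fraction=0.7, seed=42):
    """Randomly split samples into training and validation subsets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # local generator, reproducible
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical identifiers for 100 sample plots.
sample_ids = list(range(1, 101))
train_ids, validation_ids = split_samples(sample_ids)
```

Keeping the two subsets disjoint ensures that validation points are never used to train the classifiers.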

Training polygons were generated manually at random locations of sample plots, taking into account that cover types should be spectrally homogeneous. For this reason, in many cases we were forced to generate training polygons away from the sample locations. We avoided long and thin training polygons. Small polygons tend to be prone to edge effects. Moreover, we selected more training polygons in areas where land cover reference data were missing or in highly heterogeneous areas, in order to increase classification accuracy. Generating training data in areas where land cover reference data are missing proved to be an issue. Their selection was based on our expert knowledge of the study area in relation to the spectral data.

3.5. Accuracy Assessment

All classifiers were tested on the entire watershed based on the same training and validation data per sensor. We evaluated classification performance using the Overall Accuracy (OA), Producers’ Accuracy (PA), Users’ Accuracy (UA) and Kappa Coefficient (KC). For accuracy assessment, we selected 1,260 validation points for Landsat-8 OLI and 782 points for Sentinel-2A MSI.
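For reference, the four measures can be written in terms of a confusion matrix with $k$ classes. The notation is an assumption made here for illustration: $n_{ij}$ is the number of validation points classified as class $i$ (row) whose reference class is $j$ (column), $n_{i+}$ and $n_{+i}$ are the row and column totals, and $N$ is the total number of validation points. This row/column orientation matches Table 13.

```latex
\begin{aligned}
\mathrm{OA} &= \frac{1}{N}\sum_{i=1}^{k} n_{ii}, &
\mathrm{UA}_i &= \frac{n_{ii}}{n_{i+}}, &
\mathrm{PA}_i &= \frac{n_{ii}}{n_{+i}}, \\
\kappa &= \frac{N\sum_{i=1}^{k} n_{ii} - \sum_{i=1}^{k} n_{i+}\,n_{+i}}
               {N^{2} - \sum_{i=1}^{k} n_{i+}\,n_{+i}}
\end{aligned}
```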

3.5.1. Landsat-8 Scenarios

In scenario 1, we used single-date Landsat-8 imagery acquired in summer. The ML classifier produced a slightly higher overall accuracy (91.67%) than the machine learning classifiers (Table 5). In terms of class accuracy, the best ML results were achieved for water bodies, non-irrigated arable land and shrubs. However, needleleaf forest and high reeds had the lowest user's accuracy and kappa coefficient. Needleleaf forest is confused with broadleaf forest in mixed-forest areas. Moreover, unsuccessfully reforested pine stands are confused with tall sclerophyllous vegetation (evergreen shrubs).

Very dense high reeds in wetlands are confused with broadleaf forest on the hillsides, which shows a similar spectral signature. The low producer's accuracy of the high reeds class is also an issue for all machine learning classifiers.

In scenario 2, we used single-date Landsat-8 OLI imagery acquired in summer, combined with the EVI and NDMI indices. Again, the ML classifier produced the highest overall accuracy (90.95%), above the machine learning classifiers (Table 6). The indices did not improve the classification: the ML overall accuracy is lower than in scenario 1 (91.67%). The performance of all machine learning classifiers is similar and close to the ML overall accuracy.

In scenario 3, we used multi-date Landsat-8 OLI imagery (summer and winter). For the first time, all classifiers reached an overall accuracy slightly above 90%. KNN produced the highest overall accuracy (91.90%), followed by ML, SVM and RF (Table 7). Accuracy improved in all classes except needleleaf forests and high reeds.

In scenario 4, we used multi-date Landsat-8 OLI imagery (summer and winter) combined with the respective EVI and NDMI indices. The overall accuracy of all classifiers improved, and their results remained similar. The KNN classifier produced the highest overall accuracy (93.02%) (Table 8).

3.5.2. Sentinel-2A

In scenario 1, we used single-date Sentinel-2A imagery. The RF classifier produced a slightly higher OA (93.86%) than ML and KNN, which gave similar results (Table 9). The SVM classifier had the lowest overall accuracy. The best results were achieved for water bodies, broadleaf forests, needleleaf forests and non-irrigated arable land. The lowest class producer's accuracy was observed for roads with SVM and for high reeds with KNN. However, the accuracy of the high reeds class was low with SVM as well.

In scenario 2, we used multi-date Sentinel-2A imagery (summer and winter). The ML classifier produced the highest overall accuracy (96.68%). The machine learning classifiers produced lower but mutually similar results (Table 10). We observed that the roads class is confused with the artificial surfaces class, especially within urban areas. In addition, the shrubs class is partially confused with non-irrigated land where small fields are surrounded by shrubs.

4. Results

Table 11 shows the obtained overall accuracy per classifier and sensor in each developed scenario. In reference to Landsat-8 OLI classification, KNN was the best classifier in scenario 4, achieving the highest OA=93.02%. In this case, KC reached the highest value (0.9227). Under the same scenario, the ML classifier reached the second highest OA=92.46%, followed by SVM with OA=92.30%, and RF with OA=91.67%.

In scenario 4, all classifiers achieved their highest OA across the Landsat-8 scenarios, with minor differences between them (under 1.5 percentage points), which we attribute to sufficient training data. Thus, we concluded that the use of multi-date seasonal Landsat-8 OLI multispectral data combined with spectral indices increases the performance and overall accuracy of classification for land use/land cover mapping. However, the contribution of the spectral indices (EVI and NDMI) to classification performance was not significant (about +1%) in any scenario. The resulting classified map is presented in Figure 3.

With regard to Sentinel-2A MSI classification, the ML classifier has the highest OA (96.68%) under scenario 2, when intra-annual seasonal multispectral data are used (Figure 4). This is higher than the OA of RF (93.86%), which ranked first in scenario 1. Among the machine learning classifiers in scenario 2, SVM produced the highest accuracy (OA=92.84%), followed by KNN (OA=92.71%) and RF (OA=91.82%). We observed a marked difference between the performance of ML and that of the machine learning classifiers: the OA of ML is roughly 4 to 5 percentage points higher.

5. Conclusions

The aim of this work was to analyze the performance and accuracy of different classifiers (ML, KNN, RF and SVM) and to evaluate Landsat-8 OLI and Sentinel-2A MSI imagery for the identification and mapping of land use/land cover types in a highly heterogeneous Mediterranean site.

Based on the results, the overall classification accuracies achieved for both types of satellite imagery were acceptable (>85%), and the performance of the selected machine learning classifiers was quite similar, with differences that were not statistically significant in any scenario. The best-performing classifier is ML using seasonal bi-temporal Sentinel-2A imagery (OA 96.68%). This OA is higher than that of the best-performing Landsat-8 classifier, KNN (93.02%), which used seasonal bi-temporal Landsat-8 OLI imagery combined with the EVI and NDMI spectral indices. However, a visual assessment of the map produced by the ML classifier shows that it overestimates artificial surfaces (Figure 4): artificial surfaces are confused with high-reflectance bare soils or open areas with no vegetation cover across the watershed. This finding is not reflected in the respective confusion matrix (Table 12).

As for the performance of the machine learning methods, we believe that more training data are required for Sentinel-2A classification. Machine learning methods require enough training samples to make optimal decisions [23]. However, the high spatial variability and spatial structure of the study area (small-sized areas and sparsely distributed land cover classes) affect the selection of proper training data.

In terms of class user's accuracy, the lowest values were observed for High reeds (80%) followed by Needleleaf forests (88.37%), according to the confusion matrix for scenario 2 (Table 12). High reeds are commonly confused with shrubs in the wetlands, where they form mixed associations with shrubs (Tamarix sp.). In areas of unsuccessful reforestation with pine trees, needleleaf forests are confused with evergreen shrubs.

For Landsat-8 (scenario 4), the lowest class user's accuracy was observed for Broadleaf forests (84.73%). This can be explained by the mixed formations of low broadleaf trees and shrubs that span the watershed. The lowest producer's accuracies were observed for Needleleaf forests (80.33%) and High reeds (82.76%).

We recommend the KNN classifier with Landsat-8 imagery for land use/land cover mapping. However, we would be cautious about applying the ML classifier to Sentinel-2 imagery. The machine learning algorithms are stable and produce similar results in all scenarios.

This paper presents a methodology for testing different classifiers for land use/land cover mapping of a highly heterogeneous and complex landscape. It also includes a procedure for processing available geospatial databases and for preparing multi-source training data. Integrating intra-annual temporal-spectral data into classification produces land use/land cover maps of high accuracy. This study represents an important step toward multiple-class land use/land cover mapping using spectral-temporal Landsat-8 or Sentinel-2 features by providing a quantitative assessment of classification accuracy. Our work contributes to the evaluation of classification algorithms for updating the Copernicus Land Cover product. It documents that major classes at the 3rd level of the Copernicus nomenclature, such as urban fabric, roads, irrigated and non-irrigated lands, broadleaf or needleleaf forests, shrubs, water bodies, sufficiently large streams and wetland vegetation, can be classified with high accuracy based on seasonal multispectral data.

We believe that more research is required in the domain of land use/land cover mapping. Future research should be oriented towards the development of novel methods by integrating ancillary geospatial data or by integrating time-series spectral-temporal data into a classification model for land use/land cover mapping.

Table 13. Confusion matrix for scenario 4 (Landsat-8 KNN classification).

	Artificial surfaces	Broadleaf forest	Needleleaf forest	Non-irrigated arable land	Permanent freshwater lakes	Permanently irrigated land	High reeds	Shrubs	Total	PA (%)	UA (%)	KC
Artificial surfaces	88	0	0	6	0	0	0	0	94	95.65	93.62	0.9311
Broadleaf forest	0	172	7	0	0	2	4	18	203	91.01	84.73	0.8203
Needleleaf forest	0	2	49	0	0	0	0	0	51	80.33	96.08	0.9588
Non-irrigated arable land	3	1	0	350	0	0	0	8	362	95.11	96.69	0.9532
Permanent freshwater lakes	0	0	0	0	56	0	0	0	56	100.0	100.0	1.0000
Permanently irrigated land	0	2	0	0	0	78	0	2	82	95.12	95.12	0.9478
High reeds	0	0	0	0	0	0	24	0	24	82.76	100.0	1.0000
Shrubs	1	12	5	12	0	2	1	355	388	92.69	91.49	0.8778
Total	92	189	61	368	56	82	29	383	1260	93.02		0.9109
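As a worked check, the summary measures of Table 13 can be recomputed from its cell counts. The sketch below copies the counts from the table (rows are classified classes, columns are reference classes); the function name is hypothetical. Run as written, it reproduces the reported overall accuracy (93.02%) and overall kappa (0.9109).

```python
# Cell counts copied from Table 13 (rows: classified, columns: reference).
labels = [
    "Artificial surfaces", "Broadleaf forest", "Needleleaf forest",
    "Non-irrigated arable land", "Permanent freshwater lakes",
    "Permanently irrigated land", "High reeds", "Shrubs",
]
matrix = [
    [88, 0, 0, 6, 0, 0, 0, 0],
    [0, 172, 7, 0, 0, 2, 4, 18],
    [0, 2, 49, 0, 0, 0, 0, 0],
    [3, 1, 0, 350, 0, 0, 0, 8],
    [0, 0, 0, 0, 56, 0, 0, 0],
    [0, 2, 0, 0, 0, 78, 0, 2],
    [0, 0, 0, 0, 0, 0, 24, 0],
    [1, 12, 5, 12, 0, 2, 1, 355],
]


def accuracy_metrics(m):
    """Return overall accuracy, producer's and user's accuracies, and kappa."""
    n = sum(map(sum, m))
    row_totals = [sum(row) for row in m]        # classified totals
    col_totals = [sum(col) for col in zip(*m)]  # reference totals
    diagonal = [m[i][i] for i in range(len(m))]
    oa = sum(diagonal) / n
    ua = [d / r for d, r in zip(diagonal, row_totals)]
    pa = [d / c for d, c in zip(diagonal, col_totals)]
    # Agreement expected by chance from the marginal totals.
    chance = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    kappa = (oa - chance) / (1 - chance)
    return oa, pa, ua, kappa


oa, pa, ua, kappa = accuracy_metrics(matrix)
for name, p, u in zip(labels, pa, ua):
    print(f"{name}: PA={100 * p:.2f}%  UA={100 * u:.2f}%")
print(f"OA={100 * oa:.2f}%  kappa={kappa:.4f}")
```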

Acknowledgments

The research was carried out within the facilities of Hellenic Cadastre. We thank the Hellenic Cadastre and the Hellenic Forest Service for providing access to the national databases used in this study.

References

Di Gregorio, A., M. Henry, E. Donegan, Y. Finegold, J. Latham, I. Jonckheere and R. Cumani, 2016. Classification Concepts, Land Cover Classification System: Software version 3, FAO, Rome, 29p.
Kosmidou, V., Petrou, Z., Bunce, R.G.H., Mücher, C.A., Jongman, R.H.G., Bogers, M.M.B., Lucas, R.M., Tomaselli, V., Blonda, P., Padoa-Schioppa, E., Manakos, I., Petrou, M., 2014. Harmonization of the Land Cover Classification System (LCCS) with the General Habitat Categories (GHC) classification system. Ecol. Indic. 36, 290–300. [CrossRef]
Chen, J., Chen, J., Liao, A., Cao, X., Chen, L., Chen, X., He, C., Han, G., Peng, S., Lu, M., Zhang, W., Tong, X., & Mills, J. (2015). Global land cover mapping at 30 m resolution: A POK-based operational approach. ISPRS J. Photogram. Rem. Sens., 103, 7–27. [CrossRef]
Woodcock, C.E., Allen, R.G., Anderson, M., Belward, A., Bindschadler, R., Cohen, W., Feng, G., Goward, N.S., Helder, D., Helmer, E., Nemani, R., Oreopoulos, L., Schott, J., Thenkabail, S.P., Vermote, F.E., Vogelmann, J., Wulder, A.M. and Wynne, R. Free access to Landsat imagery, Science, 2008, 320 (5879). [CrossRef]
Wulder, M.A., Coops, N.C., Roy, D., White, J.C., and Hermosilla, T. Land Cover 2.0. Int. J. Rem. Sens., 2018. [CrossRef]
Chrysafis, I., Mallinis, G., Gitas, I., Tsakiri-Strati, M. Estimating Mediterranean forest parameters using multi seasonal Landsat 8 OLI imagery and an ensemble learning method. Rem. Sens. Environ., 2017, 199, 154–166. [CrossRef]
Pasquarella, V.J., Holden, C.E., Woodcock, C.E. Improved mapping of forest type using spectral-temporal Landsat features. Rem. Sens. Environ., 2018, 210, 193–207. [CrossRef]
Gounaridis, D., Koukoulas, S. Urban land cover thematic disaggregation, employing datasets from multiple sources and Random Forests modeling, Int. J. Appl. Earth Obs. Geoinf., 2016, 51, 1–10. [CrossRef]
Graesser, J., Ramankutty, N. Detection of cropland field parcels from Landsat imagery, Remote Sens. Environ., 2017, 201, 165–180. [CrossRef]
Belgiu, M., Csillik, O. Sentinel-2 cropland mapping using pixel-based and object-based time-weighted dynamic time warping analysis. Remote Sens. Environ., 2018, 204, 509–523. [CrossRef]
Bhatnagar, S., Gill, L., Regan, S., Naughton, O., Johnston, P., Waldren, S., Ghosh, B. Mapping Vegetation Communities Inside Wetlands Using Sentinel-2 Imagery in Ireland. Int. J. Appl. Earth Obs. Geoinf., 2020, 88, 102083. [CrossRef]
Chatziantoniou, A., Petropoulos, G.P., Psomiadis, E. Co-Orbital Sentinel 1 and 2 for LULC mapping with emphasis on wetlands in a Mediterranean setting based on machine learning, Remote Sens., 2017, 9. [CrossRef]
Gómez, C., White, J.C., Wulder, M.A. Optical remotely sensed time series data for land cover classification: A review, ISPRS J. Photogramm. Remote Sens., 2016, 116, 55–72. [CrossRef]
Maxwell, A.E., Warner, T.A., Fang, F. Implementation of machine-learning classification in remote sensing: An applied review, Int. J. Remote Sens., 2018. [CrossRef]
Foody, G. M., Mahesh, P., Rocchini, D., Garzon-Lopez, X.C., Bastin, L. The Sensitivity of Mapping Methods to Reference Data Quality: Training Supervised Image Classifications with Imperfect Reference Data, ISPRS Intern. J. Geo-Inf., 2016, 5, 11, 199. [CrossRef]
Aune-Lundberg, L., Strand, G.H. The content and accuracy of the CORINE Land Cover dataset for Norway. Int. J. Appl. Earth Obs. Geoinf., 2021, 96, 102266. [CrossRef]
Stehman, S. V., and Foody, M.G. Key issues in rigorous accuracy assessment of land cover products, Rem. Sens. Environ., 2019, 231, 111199. [CrossRef]
Jensen, R.J. Thematic information extraction: pattern recognition. In Introductory Digital Image Processing, 3rd ed.; Clarke, C.K., ed.; Pearson Prentice Hall, NJ, USA, 2005, pp. 337–379.
Mountrakis, G., Im, J., Ogole, C. Support vector machines in remote sensing: A review. ISPRS J. Photogramm. Remote Sens., 2011. [CrossRef]
Chirici, G., Mura, M., Mcinerney, D., Py, N., Tomppo, E.O., Waser, L.T., Travaglini, D., Mcroberts, R.E. A meta-analysis and review of the literature on the k-Nearest Neighbors technique for forestry applications that use remotely sensed data. Remote Sens. Environ., 2016, 176, 282–294. [CrossRef]
Belgiu, M., Dragut, L. Random forest in remote sensing: a review of applications and future directions. ISPRS J. Photogram. Remote Sens., 2016, 114, 24–31. [CrossRef]
Yu, L., Liang, L., Wang, J., Zhao, Y., Cheng, Q., Hu, L., Liu, S., Yu, Liang, Wang, X., Zhu, P., Li, Xueyan, Xu, Y., Li, C., Fu, W., Li, Xuecao, Li, W., Liu, C., Cong, N., Zhang, H., Sun, F., Bi, X., Xin, Q., Li, D., Yan, D., Zhu, Z., Goodchild, M.F., Gong, P. Meta-discoveries from a synthesis of satellite-based land-cover mapping research, Int. J. Remote Sens., 2014. [CrossRef]
Heydari, S.S., Mountrakis, G. Effect of classifier selection, reference sample size, reference class distribution and scene heterogeneity in per-pixel classification accuracy using 26 Landsat sites, Remote Sens. Environ., 2018, 204. [CrossRef]
Noi, T.P., Kappas, M. Comparison of Random Forest, k-Nearest Neighbor, and Support Vector Machine Classifiers for Land Cover Classification Using Sentinel-2 Imagery, Sensors 2017, 18. [CrossRef]
Chaves, M.E.D., Picoli, M.C.A., Sanches, I.D. Recent applications of Landsat 8/OLI and Sentinel-2/MSI for land use and land cover mapping: A systematic review, Remote Sens., 2020, 12. [CrossRef]
Landsat-8 C2L2 images courtesy of the U.S. Geological Survey, http://earthexplorer.usgs.gov.
Copernicus Sentinel data 2018 for Sentinel data, European Space Agency–ESA, produced from ESA remote sensing data, https://scihub.copernicus.eu/dhus/#/home.
Drusch, M., Del Bello, U., Carlier, S., Colin, O., Fernandez, V., Gascon, F., Hoersch, B., Isola, C., Laberinti, P., Martimort, P., Meygret, A., Spoto, F., Sy, O., Marchese, F., Bargellini, P. Sentinel-2: ESA’s Optical High-Resolution Mission for GMES Operational Services, Remote Sens. Environ., 2012, 120, 25–36. [CrossRef]
Zheng, H., Du, P., Chen, J., Xia, J., Li, E., Xu, Z., Li, X., Yokoya, N. Performance evaluation of downscaling Sentinel-2 imagery for land use and land cover classification by spectral-spatial features, Remote Sens., 2017, 9(12), 1274. [CrossRef]
Büttner, G. Corine Land Cover and Land Change Products. In Land Use and Land Cover Mapping in Europe: Practices and Trends, I. Manakos and M. Braun, eds.; Springer, NY, USA, 2014; 18, pp. 55–75.
Kuntz, S., Schmeer, E., Jochum, M., Smith, G. Towards an European land cover monitoring service and high-resolution layers. In Remote Sensing and Digital Image Processing, Freek D. van der Meer, ed.; Springer, NY, USA, 2014; 18, pp. 43–52.
Sagris, V., Devos, W. Core Conceptual Model for Land Parcel Identification System (LCM): GeoCAP Technical Specification, Version 1.1., 2009, Office for Official Publications of the European Communities, Luxembourg.
Vogiatzis, M. Cadastral Mapping of Forestlands in Greece, Photogram. Eng. Remote Sens., 2008, 74:39-46. [CrossRef]
Thinh, T.V., Duong, C.P., Nasahara, N.K., Tadono, T. How does land use/land cover map’s accuracy depend on number of classification classes?, SOLA, 2019, 15:28-31. [CrossRef]

Figure 1. Location of study area (in blue) over VHR orthoimage of 2015.

Figure 2. CLC2018 Land use/land cover types.

Figure 3. Land use/land cover map 2018 (KNN, Landsat-8 OLI).

Figure 4. Land use/land cover map 2018 (ML, Sentinel-2A).

Table 1. Landsat-8 OLI and Sentinel-2A MSI scenes.

Satellite	Date	Granule
Landsat-8 OLI	06-07.2018	LC08_L2SP_183032_20180701_20200831_02_T1
		LC08_L2SP_184032_20180622_20200831_02_T1
	28.01.2020/17.02.2019	LC08_L2SP_183032_20200128_20200823_02_T1
		LC08_L2SP_184032_20190217_20200829_02_T1
Sentinel-2A MSI	03.07.2018	L2A_T34TFK_A015820_20180703T092224
		L2A_T34TFL_A015820_20180703T092224
		L2A_T34TGK_A015820_20180703T092224
		L2A_T34TGL_A015820_20180703T092224
	05.12.2017	L2A_T34TFK_A012817_20171205T092648
		L2A_T34TFL_A012817_20171205T092648
	12.12.2017	L2A_T34TGK_A012917_20171212T091748
		L2A_T34TGL_A012917_20171212T091748

Table 2. Geospatial land cover reference data.

Dataset	Source	Date	Scale	Data provider
Agricultural fields	LPIS	2018	1:5.000	HC¹
Habitats	Natura2000	2017	1:5.000	HC
Urban zones	Forest Map	2021	1:5.000	HC
Forest/Non forest lands	Forest Map	2021	1:5.000	HC
Forest Stands	Forest Management Plans	2007-2018	1:20.000	HFS²
Built-up areas and roads	Cadastral database	2021	1:1.000	HC

¹ Hellenic Cadastre, ² Hellenic Forest Service.

Table 3. Sample distribution per CLC2018 class.

No	Code	Class	Area (ha)	# Sample points
1	112	Discontinuous urban fabric	2,501.82	37
2	121	Industrial or commercial zones	443.74	6
3	122	Road and rail networks and associated land	752.93	11
4	131	Mineral extraction sites	161.03	2
5	211	Non-irrigated arable land	47,798.64	699
6	212	Permanently irrigated arable land	13,381.79	196
7	221	Vineyards	129.21	2
8	222	Fruit trees	71.24	1
9	223	Olive trees	404.69	6
10	231	Pastures	1,092.00	16
11	242	Complex cultivation patterns	5,083.44	74
12	243	Land principally occupied by agriculture	25,811.16	378
13	311	Broadleaf forest	24,783.86	362
14	312	Coniferous forest	2,180.58	31
15	313	Mixed forest	9,630.05	141
16	321	Natural grasslands	5,878.48	85
17	323	Sclerophyllous vegetation	29,441.81	431
18	324	Transitional woodland/shrub	8,082.38	117
19	331	Beaches, dunes, sand	572.60	8
20	411	Inland marshes	1,660.55	24
21	512	Water bodies	10,397.98	152
		Total	190,285.09	2,356

Table 4. Description of training data and control points.

Satellite imagery	Classes	Training data	No of pixels	% of total pixels	Control points
Landsat-8 OLI	Artificial surfaces	170	3,771	0.18%	92
	Non-irrigated arable land	650	21,726	1.03%	368
	Permanently irrigated land	177	7,433	0.35%	82
	Broadleaf forest	490	16,169	0.76%	189
	Needleleaf forest	98	1,486	0.07%	61
	Shrubs	473	16,664	0.79%	383
	Reeds	130	497	0.02%	56
	Permanent freshwater lakes	87	980	0.05%	29
	Total	2,275	68,726	3.25%	1,260
	Study Area		2,116,028
Sentinel-2 MSI	Artificial surfaces	83	6,088	0.03%	41
	Non-irrigated arable land	594	112,992	0.59%	254
	Permanently irrigated land	139	12,449	0.07%	45
	Broadleaf forest	459	104,017	0.55%	113
	Needleleaf forest	89	2,637	0.01%	40
	Shrubs	385	29,562	0.16%	195
	Reeds	62	6,739	0.04%	17
	Roads	30	1,017	0.01%	23
	Permanent freshwater lakes	130	116,113	0.61%	54
	Total	1,971	391,614	2.06%	782
	Study Area		19,050,300

Table 5. Landsat-8 scenario 1 classification accuracy metrics.

	Classes	PA	UA	KC
ML	Artificial surfaces	92.39%	83.33%	0.8202
	Broadleaf forest	84.66%	93.02%	0.9179
	Needleleaf forest	93.44%	76.00%	0.7478
	Non-irrigated arable land	93.21%	96.08%	0.9446
	Permanent freshwater lakes	96.43%	100.00%	1.0000
	Permanently irrigated land	93.90%	96.25%	0.9599
	High reeds	100.00%	64.44%	0.6361
	Shrubs	91.38%	93.33%	0.9042
	Overall accuracy	91.67%		0.8946
KNN	Artificial surfaces	82.61%	90.48%	0.8973
	Broadleaf forest	89.95%	79.07%	0.7538
	Needleleaf forest	81.97%	86.21%	0.8551
	Non-irrigated arable land	91.58%	93.09%	0.9024
	Permanent freshwater lakes	100.00%	100.00%	1.0000
	Permanently irrigated land	86.59%	92.21%	0.9167
	High reeds	34.48%	100.00%	1.0000
	Shrubs	91.91%	88.44%	0.8339
	Overall accuracy	89.05%		0.8598
RF	Artificial surfaces	81.52%	88.24%	0.8731
	Broadleaf forest	89.42%	87.11%	0.8484
	Needleleaf forest	85.25%	89.66%	0.8913
	Non-irrigated arable land	92.12%	91.37%	0.8782
	Permanent freshwater lakes	100.00%	100.00%	1.0000
	Permanently irrigated land	89.02%	86.90%	0.8599
	High reeds	75.86%	84.62%	0.8425
	Shrubs	89.82%	89.12%	0.8437
	Overall accuracy	89.68%		0.8684
SVM	Artificial surfaces	85.87%	96.34%	0.9605
	Broadleaf forest	91.53%	77.58%	0.7362
	Needleleaf forest	63.93%	90.70%	0.9022
	Non-irrigated arable land	97.01%	91.54%	0.8805
	Permanent freshwater lakes	100.00%	100.00%	1.0000
	Permanently irrigated land	89.02%	93.59%	0.9314
	High reeds	0.00%	0.00%	-0.0236
	Shrubs	87.99%	87.08%	0.8144
	Overall accuracy	88.41%		0.8509

Table 6. Landsat-8 scenario 2 classification accuracy metrics.

	Classes	PA	UA	KC
ML	Artificial surfaces	94.57%	82.08%	0.8066
	Broadleaf forest	84.13%	91.91%	0.9048
	Needleleaf forest	95.08%	73.42%	0.7207
	Non-irrigated arable land	90.22%	96.51%	0.9507
	Permanent freshwater lakes	98.21%	100.00%	1.0000
	Permanently irrigated land	95.12%	92.86%	0.9236
	Reeds	96.55%	73.68%	0.7306
	Shrubs	91.12%	91.60%	0.8793
	Overall accuracy	90.95%		0.8857
KNN	Artificial surfaces	82.61%	91.57%	0.9090
	Broadleaf forest	89.42%	80.09%	0.7658
	Needleleaf forest	83.61%	83.61%	0.8277
	Non-irrigated arable land	91.03%	93.31%	0.9056
	Permanent freshwater lakes	100.00%	100.00%	1.0000
	Permanently irrigated land	87.80%	92.31%	0.9177
	Reeds	31.03%	100.00%	1.0000
	Shrubs	92.69%	88.09%	0.8289
	Overall accuracy	89.13%		0.8608
RF	Artificial surfaces	82.61%	92.68%	0.9211
	Broadleaf forest	88.89%	87.05%	0.8476
	Needleleaf forest	85.25%	89.66%	0.8913
	Non-irrigated arable land	93.21%	91.47%	0.8795
	Permanent freshwater lakes	100.00%	100.00%	1.0000
	Permanently irrigated land	89.02%	86.90%	0.8599
	Reeds	72.41%	87.50%	0.8721
	Shrubs	91.38%	90.21%	0.8593
	Overall accuracy	90.40%		0.8773
SVM	Artificial surfaces	85.87%	95.18%	0.9480
	Broadleaf forest	92.06%	84.47%	0.8172
	Needleleaf forest	73.77%	88.24%	0.8764
	Non-irrigated arable land	96.74%	91.75%	0.8835
	Permanent freshwater lakes	100.00%	100.00%	1.0000
	Permanently irrigated land	90.24%	93.67%	0.9323
	Reeds	0.00%	0.00%	-0.0236
	Shrubs	89.30%	86.36%	0.8041
	Overall accuracy	89.37%		0.8632

Table 7. Landsat-8 scenario 3 classification accuracy metrics.

	Classes	PA	UA	KC
ML	Artificial surfaces	96.74%	80.18%	0.7862
	Broadleaf forest	83.07%	92.90%	0.9165
	Needleleaf forest	98.36%	66.67%	0.6497
	Non-irrigated arable land	93.75%	98.57%	0.9798
	Permanent freshwater lakes	98.21%	100.00%	1.0000
	Permanently irrigated land	98.78%	94.19%	0.9378
	High reeds	100.00%	74.36%	0.7375
	Shrubs	88.77%	94.44%	0.9202
	Overall accuracy	91.75%		0.8962
KNN	Artificial surfaces	93.48%	92.47%	0.9188
	Broadleaf forest	91.53%	80.84%	0.7746
	Needleleaf forest	78.69%	96.00%	0.9580
	Non-irrigated arable land	94.57%	96.67%	0.9529
	Permanent freshwater lakes	100.00%	100.00%	1.0000
	Permanently irrigated land	95.12%	96.30%	0.9604
	High reeds	68.97%	86.96%	0.8665
	Shrubs	91.12%	91.12%	0.8725
	Overall accuracy	91.90%		0.8968
RF	Artificial surfaces	78.26%	90.00%	0.8921
	Broadleaf forest	87.30%	89.19%	0.8728
	Needleleaf forest	85.25%	92.86%	0.9249
	Non-irrigated arable land	95.38%	87.97%	0.8301
	Permanent freshwater lakes	100.00%	98.25%	0.9816
	Permanently irrigated land	96.34%	87.78%	0.8693
	High reeds	65.52%	79.17%	0.7868
	Shrubs	87.99%	91.33%	0.8754
	Overall accuracy	89.76%		0.8692
SVM	Artificial surfaces	82.61%	93.83%	0.9334
	Broadleaf forest	84.13%	86.41%	0.8402
	Needleleaf forest	78.69%	92.31%	0.9192
	Non-irrigated arable land	96.20%	92.91%	0.8999
	Permanent freshwater lakes	100.00%	100.00%	1.0000
	Permanently irrigated land	96.34%	84.04%	0.8293
	High reeds	79.31%	88.46%	0.8819
	Shrubs	90.34%	89.64%	0.8511
	Overall accuracy	90.56%		0.8793

Table 8. Landsat-8 scenario 4 classification accuracy metrics.

	Classes	PA	UA	KC
ML	Artificial surfaces	100.00%	80.70%	0.7918
	Broadleaf forest	83.60%	94.61%	0.9366
	Needleleaf forest	96.72%	72.84%	0.7146
	Non-irrigated arable land	92.66%	98.27%	0.9756
	Permanent freshwater lakes	98.21%	100.00%	1.0000
	Permanently irrigated land	98.78%	92.05%	0.9149
	High reeds	100.00%	78.38%	0.7787
	Shrubs	91.38%	94.34%	0.9187
	Overall accuracy	92.46%		0.9050
KNN	Artificial surfaces	95.65%	93.62%	0.9311
	Broadleaf forest	91.01%	84.73%	0.8203
	Needleleaf forest	80.33%	96.08%	0.9588
	Non-irrigated arable land	95.11%	96.69%	0.9532
	Permanent freshwater lakes	100.00%	100.00%	1.0000
	Permanently irrigated land	95.12%	95.12%	0.9478
	High reeds	82.76%	100.00%	1.0000
	Shrubs	92.69%	91.49%	0.8778
	Overall accuracy	93.02%		0.9109
RF	Artificial surfaces	78.26%	93.51%	0.9300
	Broadleaf forest	87.83%	90.71%	0.8907
	Needleleaf forest	85.25%	91.23%	0.9078
	Non-irrigated arable land	96.20%	90.77%	0.8696
	Permanent freshwater lakes	100.00%	100.00%	1.0000
	Permanently irrigated land	97.56%	89.89%	0.8918
	High reeds	93.10%	90.00%	0.8976
	Shrubs	90.86%	92.06%	0.8860
	Overall accuracy	91.67%		0.8936
SVM	Artificial surfaces	88.04%	96.43%	0.9615
	Broadleaf forest	85.71%	90.50%	0.8883
	Needleleaf forest	81.97%	92.59%	0.9222
	Non-irrigated arable land	96.20%	94.15%	0.9174
	Permanent freshwater lakes	100.00%	100.00%	1.0000
	Permanently irrigated land	97.56%	90.91%	0.9028
	High reeds	96.55%	87.50%	0.8721
	Shrubs	91.91%	90.03%	0.8567
	Overall accuracy	92.30%		0.9017

Table 9. Sentinel-2A scenario 1 classification accuracy metrics.

	Classes	PA	UA	KC
ML	Artificial surfaces	92.68%	88.37%	0.8773
	Broadleaf forest	94.69%	99.07%	0.9892
	Needleleaf forest	97.50%	79.59%	0.7849
	Non-irrigated arable land	89.76%	97.85%	0.9682
	Water bodies	96.30%	100.00%	1.0000
	Permanently irrigated land	100.00%	93.75%	0.9337
	High reeds	94.12%	94.12%	0.9399
	Roads	100.00%	79.31%	0.7868
	Shrubs	93.85%	90.15%	0.8687
	Overall accuracy	93.48%		0.9188
KNN	Artificial surfaces	75.61%	100.00%	1.0000
	Broadleaf forest	96.46%	95.61%	0.9487
	Needleleaf forest	95.00%	84.44%	0.8361
	Non-irrigated arable land	96.06%	92.78%	0.8930
	Water bodies	100.00%	100.00%	1.0000
	Permanently irrigated land	97.78%	93.62%	0.9323
	High reeds	52.94%	81.82%	0.8141
	Roads	86.96%	86.96%	0.8656
	Shrubs	93.33%	93.81%	0.9176
	Overall accuracy	93.48%		0.9178
RF	Artificial surfaces	82.93%	89.47%	0.8889
	Broadleaf forest	97.35%	94.83%	0.9395
	Needleleaf forest	97.50%	95.12%	0.9486
	Non-irrigated arable land	96.06%	91.73%	0.8775
	Water bodies	100.00%	100.00%	1.0000
	Permanently irrigated land	91.11%	93.18%	0.9277
	High reeds	94.12%	94.12%	0.9399
	Roads	73.91%	94.44%	0.9428
	Shrubs	91.79%	95.21%	0.9362
	Overall accuracy	93.86%		0.9227
SVM	Artificial surfaces	92.68%	84.44%	0.8358
	Broadleaf forest	97.35%	84.62%	0.8202
	Needleleaf forest	75.00%	78.95%	0.7781
	Non-irrigated arable land	94.49%	93.02%	0.8967
	Water bodies	98.15%	98.15%	0.9801
	Permanently irrigated land	86.67%	97.50%	0.9735
	High reeds	58.82%	90.91%	0.9071
	Roads	65.22%	93.75%	0.9356
	Shrubs	89.23%	91.58%	0.8878
	Overall accuracy	90.66%		0.8824

Table 10. Sentinel-2A scenario 2 classification accuracy metrics.

	Classes	PA	UA	KC
ML	Artificial surfaces	95.12%	92.86%	0.9246
	Broadleaf forest	98.23%	98.23%	0.9793
	Needleleaf forest	95.00%	88.37%	0.8775
	Non-irrigated arable land	97.24%	99.20%	0.9881
	Water bodies	96.30%	100.00%	1.0000
	Permanently irrigated land	100.00%	97.83%	0.9769
	High reeds	94.12%	80.00%	0.7956
	Roads	100.00%	92.00%	0.9176
	Shrubs	94.87%	96.35%	0.9514
	Overall accuracy	96.68%		0.9584
KNN	Artificial surfaces	80.49%	97.06%	0.9690
	Broadleaf forest	94.69%	88.43%	0.8648
	Needleleaf forest	95.00%	92.68%	0.9229
	Non-irrigated arable land	96.06%	93.13%	0.8982
	Water bodies	100.00%	100.00%	1.0000
	Permanently irrigated land	88.89%	83.33%	0.8232
	High reeds	94.12%	88.89%	0.8864
	Roads	86.96%	90.91%	0.9063
	Shrubs	88.72%	95.05%	0.9341
	Overall accuracy	92.71%		0.9085
RF	Artificial surfaces	90.24%	82.22%	0.8124
	Broadleaf forest	93.81%	90.60%	0.8901
	Needleleaf forest	95.00%	82.61%	0.8167
	Non-irrigated arable land	97.24%	90.81%	0.8639
	Water bodies	100.00%	100.00%	1.0000
	Permanently irrigated land	91.11%	83.67%	0.8268
	High reeds	88.24%	93.75%	0.9361
	Roads	65.22%	100.00%	1.0000
	Shrubs	84.62%	98.21%	0.9762
	Overall accuracy	91.82%		0.8972
SVM	Artificial surfaces	90.24%	84.09%	0.8321
	Broadleaf forest	95.58%	86.40%	0.8410
	Needleleaf forest	92.50%	94.87%	0.9460
	Non-irrigated arable land	96.06%	95.69%	0.9361
	Water bodies	98.15%	98.15%	0.9801
	Permanently irrigated land	95.56%	93.48%	0.9308
	High reeds	88.24%	83.33%	0.8296
	Roads	73.91%	80.95%	0.8038
	Shrubs	88.21%	95.56%	0.9408
	Overall accuracy	92.84%		0.9103

Table 11. Overall classification accuracy per scenario.

Classification scenarios (2018)		Bands / Indices	Seasons	Type of classifier	OA	KC
Landsat-8 OLI	Scenario 1	6 bands (B2-B7)	S	ML	91.67%	0.8946
				KNN	89.05%	0.8598
				RF	89.68%	0.8684
				SVM	88.41%	0.8509
	Scenario 2	6 bands (B2-B7) + EVI + NDMI	S	ML	90.95%	0.8857
				KNN	89.13%	0.8608
				RF	90.40%	0.8773
				SVM	89.37%	0.8632
	Scenario 3	12 bands (B2-B7)	S+W	ML	91.75%	0.8962
				KNN	91.90%	0.8968
				RF	89.76%	0.8692
				SVM	90.56%	0.8793
	Scenario 4	12 bands (B2-B7) + EVI + NDMI	S+W	ML	92.46%	0.9050
				KNN	93.02%	0.9109
				RF	91.67%	0.8936
				SVM	92.30%	0.9017
Sentinel-2A	Scenario 1	10 bands (2-8, 8A, 11-12)	S	ML	93.48%	0.9188
				KNN	93.48%	0.9178
				RF	93.86%	0.9227
				SVM	90.66%	0.8824
	Scenario 2	10 bands (2-8, 8A, 11-12)	S+W	ML	96.68%	0.9584
				KNN	92.71%	0.9085
				RF	91.82%	0.8972
				SVM	92.84%	0.9103
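
Landsat-8 scenarios 2 and 4 in Table 11 append EVI and NDMI to the spectral bands. The following is a minimal sketch of how such features could be derived per pixel, assuming the standard formulations EVI = 2.5 (NIR - Red) / (NIR + 6 Red - 7.5 Blue + 1) and NDMI = (NIR - SWIR1) / (NIR + SWIR1), surface reflectance scaled to 0-1, and the Landsat-8 OLI band mapping B2 = blue, B4 = red, B5 = NIR, B6 = SWIR1. The function names and the sample reflectances are hypothetical and do not come from the study.

```python
def evi(blue, red, nir):
    """Enhanced Vegetation Index from surface reflectance in the 0-1 range."""
    denom = nir + 6.0 * red - 7.5 * blue + 1.0
    if denom == 0:
        raise ValueError("EVI denominator is zero")
    return 2.5 * (nir - red) / denom


def ndmi(nir, swir1):
    """Normalized Difference Moisture Index from surface reflectance."""
    denom = nir + swir1
    if denom == 0:
        raise ValueError("NDMI denominator is zero")
    return (nir - swir1) / denom


def add_indices(pixel):
    """Return a copy of a per-pixel band dictionary with EVI and NDMI added.

    Assumes Landsat-8 OLI naming: B2 blue, B4 red, B5 NIR, B6 SWIR1.
    """
    out = dict(pixel)
    out["EVI"] = evi(pixel["B2"], pixel["B4"], pixel["B5"])
    out["NDMI"] = ndmi(pixel["B5"], pixel["B6"])
    return out


# Hypothetical reflectances for one vegetated pixel (illustrative values only).
sample = {"B2": 0.03, "B3": 0.06, "B4": 0.04, "B5": 0.40, "B6": 0.20, "B7": 0.10}
features = add_indices(sample)
print(sorted(features))
```

With six bands per date, the two indices bring a single-date feature vector to eight values, which matches the "6 bands (B2-B7) + EVI + NDMI" description of scenario 2.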

Table 12. Confusion matrix for the best-performing scenario (Sentinel-2A scenario 2, ML classification). Rows are classified data; columns are reference data.

	Artificial surfaces	Broadleaf forest	Needleleaf forest	Non-irrigated arable land	Permanent freshwater lakes	Permanently irrigated land	High reeds	Roads	Shrubs	Total	PA (%)	UA (%)	KC
Artificial surfaces	39	0	0	1	1	0	1	0	0	42	95.12	92.86	0.9246
Broadleaf forest	0	111	0	0	0	0	0	0	2	113	98.23	98.23	0.9793
Needleleaf forest	0	2	38	0	0	0	0	0	3	43	95.00	88.37	0.8775
Non-irrigated arable land	0	0	0	247	1	0	0	0	1	249	97.24	99.20	0.9881
Permanent freshwater lakes	0	0	0	0	52	0	0	0	0	52	96.30	100.0	1.0000
Permanently irrigated land	0	0	0	1	0	45	0	0	0	46	100.0	97.83	0.9769
High reeds	0	0	0	0	0	0	16	0	4	20	94.12	80.00	0.7956
Roads	2	0	0	0	0	0	0	23	0	25	100.0	92.00	0.9176
Shrubs	0	0	2	5	0	0	0	0	185	192	94.87	96.35	0.9514
Total	41	113	40	254	54	45	17	23	195	782	96.68		0.9584
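
The PA, UA and KC columns of Table 12 can be re-derived from the matrix counts. The sketch below assumes that rows hold classified points and columns hold reference points, which is what the row and column totals imply. It also assumes that the per-class KC is the conditional kappa for user's accuracy, (UA - p_ref) / (1 - p_ref), where p_ref is the class share of the reference points; this formula is inferred from the reported values rather than stated in the table. Function and variable names are illustrative.

```python
CLASSES = [
    "Artificial surfaces", "Broadleaf forest", "Needleleaf forest",
    "Non-irrigated arable land", "Permanent freshwater lakes",
    "Permanently irrigated land", "High reeds", "Roads", "Shrubs",
]

# Counts copied from Table 12 (rows: classified, columns: reference).
MATRIX = [
    [39, 0, 0, 1, 1, 0, 1, 0, 0],
    [0, 111, 0, 0, 0, 0, 0, 0, 2],
    [0, 2, 38, 0, 0, 0, 0, 0, 3],
    [0, 0, 0, 247, 1, 0, 0, 0, 1],
    [0, 0, 0, 0, 52, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 45, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 16, 0, 4],
    [2, 0, 0, 0, 0, 0, 0, 23, 0],
    [0, 0, 2, 5, 0, 0, 0, 0, 185],
]


def accuracy_metrics(matrix):
    """Overall accuracy, kappa, and per-class PA, UA and conditional kappa."""
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    row_tot = [sum(row) for row in matrix]                            # classified totals
    col_tot = [sum(matrix[i][j] for i in range(k)) for j in range(k)]  # reference totals
    diag = [matrix[i][i] for i in range(k)]

    po = sum(diag) / n                                           # observed agreement (OA)
    pe = sum(r * c for r, c in zip(row_tot, col_tot)) / n ** 2   # chance agreement
    kappa = (po - pe) / (1 - pe)

    pa = [d / c for d, c in zip(diag, col_tot)]  # producer's accuracy
    ua = [d / r for d, r in zip(diag, row_tot)]  # user's accuracy
    cond_kappa = [(u - c / n) / (1 - c / n) for u, c in zip(ua, col_tot)]
    return {"oa": po, "kappa": kappa, "pa": pa, "ua": ua, "cond_kappa": cond_kappa}


metrics = accuracy_metrics(MATRIX)
for name, pa, ua, kc in zip(CLASSES, metrics["pa"], metrics["ua"], metrics["cond_kappa"]):
    print(f"{name:28s} PA={pa:7.2%}  UA={ua:7.2%}  KC={kc:.4f}")
print(f"OA={metrics['oa']:.2%}  kappa={metrics['kappa']:.4f}")
```

Under these assumptions the computed overall accuracy and kappa agree with the 96.68% and 0.9584 reported in the Total row.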

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
