1. Introduction
Alterations in land use and land cover (LULC) arising from anthropic activities are one of the principal environmental problems studied by scientists [
1,
2,
3,
4,
5,
6]. Adverse changes in conservation areas cause environmental damage and harm quality of life, for example by reducing the water available for human consumption or the soil nutrients available for food production [
4]. Society is highly dependent on a functional and stable land system for food production and access to natural resources, including water, timber, fiber, ore, and fuel, among other ecosystem services [
5,
7]. In this instance, the functionality and stability of land systems depend on the interaction of soil, water, and plants within the ecological composition and the quality of native vegetation [
8,
9]. Thus, mapping the composition of vegetation not only provides information on the region’s quantitative and qualitative state but is also a crucial initial step in analyzing and monitoring its management, including the state of the natural vegetation and the impact of anthropic activities on affected ecosystems [
10].
The Brazilian Cerrado biome is a global biodiversity hotspot. It encompasses three of South America’s largest river basins (Amazon, São Francisco, and Prata), contributing nearly half of Brazil’s surface water and recharging the Bambuí, Urucuia, and Guarani aquifers, which depend on its integrity [
11]. Its vital significance for the conservation of species and the provision of ecosystem services notwithstanding, only 53.1% of its original area has been preserved [
12], as it lost 279,000 km² between 1985 and 2021 from LULC changes arising from such anthropic endeavors as crops, pastures, forestry, mining, industrialization, and human expansion [
12].
A similar trend can be observed in the Atlantic Forest, where 70% of Brazilians live, as only 24.3% remains as a forest formation [
13]. The forest largely corresponds to the legal reserve and permanent protection areas (PPAs), whose role is to preserve the quality of water, soil, and biodiversity [
4,
8,
9]. The quality of the vegetation cover has also been adversely impacted, as there was a 23% loss of mature forest from 1985 to 2021 [
5]. In 37 years, 98,000 km² of primary vegetation were suppressed, while 88,000 km² regenerated into secondary vegetation [
13], which comprises 26% of forest cover in the Atlantic Forest. Thus, mapping the spatial composition of forest vegetation is fundamental for the evaluation of the spatial pattern of ecosystem services in an environmental study [
10].
Without adequate LULC management focused on sustainability, regional river basins suffer persistent alterations. The Lobo Reservoir Hydrographic Basin (LRHB), hereafter “the basin”, is formed by streams and rivers that contribute to the reservoir [
2,
3,
4,
14]. It lies in the center-west of the state of São Paulo, Brazil, a transitional region between the Brazilian Cerrado and the Atlantic Forest, over 200 km from the city of São Paulo, one of the principal financial, corporate, and commercial centers of South America [
4,
14]. The state encompasses just 16.6% and 28.4% of the two biomes, respectively [
12,
13].
Machine learning algorithms can efficiently classify remote sensing images and are capable of handling high-dimensional data and mapping thematic classes with complex features [
15]. As remote sensing can provide cost-efficient data, it is suitable for LULC evaluation at higher spatial resolution. The free availability of medium-high spatial resolution imagery, such as Sentinel-1 radar data and Sentinel-2 optical data, can facilitate mapping the state of native vegetation in regions where ecosystem services require finer-scale assessments [
16]. Several studies have examined the applicability of active synthetic aperture radar (SAR) to map LULC in terms of specific objectives [
10,
16,
17,
18,
19,
20,
21,
22,
23]. One study delineated cocoa agroforests in Cameroon from textural features extracted from SAR imaging [
22], and another evaluated the value of short-time baseline coherence of SAR images over a complex agricultural area in the state of São Paulo [
21]. With all-day and all-weather operational capability, these sensors provide diverse, complementary physical data that enhance spectral data when combined with optical imaging [
10,
16,
19,
23]. In addition, the fusion of Sentinel-1 and Sentinel-2 increased the accuracy of LULC mapping, using complementary information, such as spectral [
18,
20], backscattering, polarimetry, and interferometry [
17].
However, studies have generally mapped vegetated and non-vegetated areas without distinguishing the vegetation types characteristic of the region under study. Furthermore, mapping studies published in high-impact journals have not yet explored an expanded set of LULC thematic classes intended to differentiate transitional vegetation types between biomes or reforestation for commercial and research purposes. In Brazil, research has focused on mapping the vegetation characteristic of the Amazon Forest, leaving other biomes with tropical characteristics without a scientific reference. Therefore, this study seeks to fill these gaps in the field of remote sensing using multi-sensor and spatial data typology.
Regarding remote sensing LULC mapping, advanced artificial neural networks, especially deep learning models, have gained increased attention due to their end-to-end nature [
16]. An obstacle affecting the performance of deep learning networks is the dearth of training samples [
24]. Accordingly, a deep learning framework for classifying native vegetation in transitional regions should be developed, one that can integrate SAR and optical data and handle scarce training samples effectively.
This study develops such an architecture for mapping native vegetation from integrated spatial information from the Sentinel-1 and Sentinel-2 satellites. To this end, the Lobo Reservoir Hydrographic Basin, which lies in an environmental conservation unit protected by Brazilian law for the purpose of sustainable use, was adopted as the study area, as it is a region with significant agricultural, mining, and industrial development. The basin’s native vegetation comprises permanent protection areas, consisting of fragments of the Atlantic Forest along rivers and streams and close to springs, and the Ecological Station of Itirapina, which is composed of the savannah and grassland vegetation typical of the Brazilian Cerrado biome. The study further contributes to national and international research, as the basin is part of the long-term ecological research program of the National Council for Scientific and Technological Development, which promotes advances in environmental and ecological research [
2,
3,
4,
8,
14,
25,
26].
4. Discussion
Evaluation of the methodological approach used in this study verifies the integrated use of data from the Sentinel-1 and Sentinel-2 satellites to classify the spatial composition of native vegetation in transitional areas between the Brazilian Cerrado and the Atlantic Forest. The composition of the basin’s native vegetation (Figure 6) is formed by PPAs along the courses of rivers and streams, such as fragments of the Atlantic Forest, PPAs around springs, and the region comprising the Itirapina Ecological Station, which is composed of vegetation characteristic of the savannah and Brazilian Cerrado biome. Mapping such areas comprising the basin’s native vegetation types is essential for evaluative studies of water ecosystem services [16], as they play a vital role in filtering sediment that would otherwise be deposited in waterways and reservoirs [
9,
41,
42,
43]. In addition, they represent conservation areas that seek to preserve the basin’s natural resources [
4,
26].
The study’s analysis of the importance of attributes for LULC classification, depicted in Figure 4, indicated that in C1 the most important attributes, in descending order, were the backscatter coefficients (σ⁰VH and σ⁰VV), the H-α polarimetric decomposition, and finally the interferometric coherence. In C2, the order was B11, B12, B05, B8A, B06, B07, and B08. In C3, the order was that of C2 followed by that of C1.
The C1 confusion matrix demonstrated the difficulties of LULC classification using solely radar data, as it attained the worst values of the three methods examined in the study, achieving only 55.73% OA. The RF algorithm had great difficulty with the wetlands and water classes, confusing the two, which resulted in 100% OE and IE. As Figure 3 indicates, the only SAR variable able to differentiate them was the backscatter coefficient σ⁰VH. The basin’s water class is primarily composed of the Lobo Reservoir, fed by the rivers and streams that constitute the basin. However, the waterways’ limited width makes them difficult to distinguish from wetlands of native vegetation in RGB band composition using only radar data (Figure 5). The authors of [44] noted that wetlands, in addition to being spatially complex, are temporally variable, and suggested that temporal information from Sentinel-1 is more effective at discriminating forested wetlands, non-forested wetlands, and non-wetlands. While this study did not use time series, it attained sufficient precision for the wetlands class in C3, with 77.78% PA and 94.92% UA, as the optical data differentiated the two classes quite well and improved on the SAR classification.
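The accuracy figures quoted in this discussion (OA, PA, UA, OE, and IE) can all be derived from a confusion matrix. The following Python sketch illustrates the arithmetic under stated assumptions: the abbreviations are read as overall accuracy, producer's accuracy, user's accuracy, omission error, and inclusion (commission) error; rows hold reference classes and columns hold mapped classes; and the three-class matrix and the function name `accuracy_metrics` are hypothetical, not the study's data or code.

```python
def accuracy_metrics(matrix, labels):
    """Return (OA, per-class metrics) for a square confusion matrix.

    Assumed layout: rows are reference (ground-truth) classes and
    columns are mapped (predicted) classes. All values are fractions.
    """
    n = len(labels)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    overall = correct / total if total else 0.0

    per_class = {}
    for i, label in enumerate(labels):
        ref_total = sum(matrix[i])                        # reference samples of class i
        map_total = sum(matrix[r][i] for r in range(n))   # samples mapped as class i
        # Assumption: an empty row or column is reported as 0 accuracy.
        pa = matrix[i][i] / ref_total if ref_total else 0.0
        ua = matrix[i][i] / map_total if map_total else 0.0
        per_class[label] = {"PA": pa, "UA": ua, "OE": 1.0 - pa, "IE": 1.0 - ua}
    return overall, per_class


# Hypothetical three-class example (not the study's confusion matrix).
labels = ["water", "wetlands", "forest"]
matrix = [
    [8, 2, 0],
    [1, 7, 2],
    [0, 1, 9],
]
overall, per_class = accuracy_metrics(matrix, labels)
print(f"OA = {overall:.2%}")
for name, m in per_class.items():
    print(name, {k: f"{v:.2%}" for k, v in m.items()})
```

With this layout, OE = 1 - PA and IE = 1 - UA, so a class that is never mapped correctly, as reported for wetlands and water in C1, shows 100% for both errors.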
In addition, the capacity of σ⁰VH to distinguish wooded and non-wooded wetlands is related to its sensitivity to volume scattering, which largely depends on the vegetative structures of the tree canopy [45]. While this variability is evident in the boxplots depicting the variables by class (Figure 3), it was not possible to observe the same distinction during the collection of training samples, likely due to the small transitional gradient among the basin’s natural characteristics.
Among the difficulties in LULC classification using only SAR data, the RF algorithm did not distinguish pasture well, confusing it with agriculture and Brazilian Cerrado, and attaining 84.73% OE and 45.36% IE. Agriculture was confused with experimental reforestation and exposed soil, attaining 81.18% OE and 90.75% IE. By comparison, the authors of [46] used C-band backscatter coefficient values (RISAT-1) in HH and VH polarizations to map rice, vegetation, buildings, and water bodies, obtaining 88.57% OA, while the authors of [31] mapped the agricultural land, dense vegetation, sparse vegetation, building, fallow, water bodies, and sand classes and obtained 80.27% OA with only backscattering and 87.7% when adding textural variables.
The basin’s LULC classes are divided into ten categories (Figure 7) to optimize the mapping of the structure of native vegetation (Atlantic Forest and Brazilian Cerrado) and conservation areas (experimental reforestation and wetlands). However, C1 attained only 55.73% OA, with the backscatter coefficients as its most significant variables (Figure 4). By integrating SAR and optical data, C3 increased OA to 96.54%. The results indicate that while SAR data are widely used because they can differentiate some LULC classes, they encounter difficulties when used in isolation, for example, in visual interpretation during training. The more classes there are and the greater the distinction required among them, the lower overall accuracy is likely to be, although further research is needed to understand this relationship.
The study maps the basin’s native vegetation, but C1 encountered difficulties: the Brazilian Cerrado class registered OE and IE rates of 62.42% and 67.03%, respectively, and was confused with experimental reforestation. In addition, the native vegetation class, which obtained 49.25% OE and 20.80% IE, was confused with productive reforestation. This type of class confusion is reported by other studies as well. The authors of [17] observed that the polarimetric decomposition attributes (H-α) were crucial in the stratified identification of different classes in the Brazilian Amazon Forest; however, certain similarities were noted in the distribution of forest and field classes.
In Figure 3, the values of entropy and alpha (H-α) for the Brazilian Cerrado were, on average, slightly higher (0.76-0.88) than those for native vegetation (Atlantic Forest) (0.70-0.80) and resembled those of the pasture, wetlands, and experimental reforestation classes. According to [47], this may result from the presence of trees or shrubs in pasture areas, which increase surface roughness, generating a SAR signal like that of forest areas and savannah and grassland forests (Brazilian Cerrado).
The authors of [32] noted an increase in mean entropy values (0.6-0.7) in some areas of dry tropical deciduous forests in India, while the authors of [17] obtained mean entropy values of 0.65-0.75 for Amazon Forest classes. Thus, the SAR variable best able to distinguish Brazilian Cerrado and native vegetation (Atlantic Forest), as well as other classes, was the backscatter coefficient σ⁰VH, corroborating the results presented in Figure 4. When optical data were integrated into the classification, the RF algorithm attained PA and UA values of 100% and 97.76%, respectively, for the Brazilian Cerrado and 94.21% and 91.97%, respectively, for native vegetation (Atlantic Forest).
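For reference, the entropy term of the H-α decomposition discussed above can be illustrated with a short Python sketch. It assumes the commonly used eigenvalue formulation for dual-polarization data, in which entropy is computed from the two eigenvalues of a 2×2 covariance matrix with a base-2 logarithm, so that H ranges from 0 (a single dominant scattering mechanism) to 1 (random scattering). The source does not state the formula or software used, and the function name and input values below are hypothetical.

```python
import math


def dual_pol_entropy(c11, c22, c12):
    """Entropy H of the Hermitian matrix [[c11, c12], [conj(c12), c22]].

    c11, c22: real, non-negative averaged powers (for example VV and VH).
    c12: averaged complex cross-product between the two channels.
    Assumption: H = -sum(p_i * log2(p_i)), with p_i the normalized eigenvalues.
    """
    # Closed-form eigenvalues of a 2x2 Hermitian matrix.
    mean = (c11 + c22) / 2.0
    radius = math.sqrt(((c11 - c22) / 2.0) ** 2 + abs(c12) ** 2)
    eigenvalues = [mean + radius, max(mean - radius, 0.0)]

    span = sum(eigenvalues)
    if span <= 0.0:
        return 0.0  # no signal: entropy is reported as 0 by convention here

    entropy = 0.0
    for lam in eigenvalues:
        p = lam / span
        if p > 0.0:  # the term for p = 0 is taken as 0
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical inputs illustrating the two extremes and an intermediate case.
h_random = dual_pol_entropy(1.0, 1.0, 0j)   # equal eigenvalues -> 1.0
h_single = dual_pol_entropy(1.0, 0.0, 0j)   # one non-zero eigenvalue -> 0.0
h_mixed = dual_pol_entropy(3.0, 1.0, 0j)    # between 0 and 1
print(h_random, h_single, round(h_mixed, 3))
```

Under this formulation, higher entropy corresponds to more evenly distributed scattering power, which is in line with the discussion above of vegetated classes showing similar, relatively high entropy values.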
The overall accuracy of 96.54% attained by C3 is consistent with results in other studies integrating SAR and optical data [
10,
16,
21,
22,
23,
37,
38,
39]. The authors of [10] evaluated the integrated use of Sentinel-1 and Sentinel-2 data for LULC mapping in a Mediterranean region and attained 90.33% OA. The authors of [23] assessed the same data integration for mapping woody plants in savannas and attained 93% OA. Accordingly, the study’s objective was achieved, with an overall accuracy slightly better than in some studies [
10,
16,
21,
22,
23].
Author Contributions
Allita Rezende dos Santos: Conceptualization, Methodology, Formal analysis, Investigation, Writing—original draft, Writing—review & editing, Funding acquisition. Mariana A. G. A. Barbosa: Conceptualization, Methodology, Formal analysis, Investigation, Writing—review & editing, Funding acquisition. Phelipe da Silva Anjinho: Conceptualization, Methodology, Formal analysis, Investigation, Writing—review & editing, Funding acquisition. Denise Parizotto: Conceptualization, Methodology, Formal analysis, Investigation, Writing—review & editing, Funding acquisition. Frederico Fábio Mauad: Resources, Project administration, Supervision.