1. Introduction
Land Surface Temperature (LST) is crucial for modeling the Earth's energy balance. LST provides valuable insights for studying urban heat islands [1], soil moisture [2,3], droughts [4], and vegetation [5]. Such applications require detailed LST information at various spatial and temporal scales. Ground weather stations are commonly used to gather detailed LST information, but their sparse distribution limits their effectiveness in mapping LST over large areas [6].
Remote sensing techniques, as an alternative means of estimating LST, allow users to indirectly deduce the surface temperature of large areas from space. However, remote sensing techniques for retrieving LST are constrained by the trade-off between spatial and temporal resolution: space-based LST products with high spatial resolution generally have low temporal resolution, and vice versa. Irrespective of the selected LST product, the user has to deal with uncertainties in either the spatial or the temporal domain. Coarse spatial resolution LST products (e.g., 1000 m) mix the temperatures of different land covers within a pixel and prevent detailed assessment. Spatial Downscaling (SD) methods can help resolve this conflict. Spatial downscaling can be defined as the process of translating spatial information from coarse to fine spatial resolution. LST products with high spatio-temporal resolution are particularly necessary for urban studies, as built-up areas are highly heterogeneous in terms of land surface information [6]. Furthermore, the availability of high spatio-temporal resolution LST data makes it possible to observe and monitor temperature changes over distinct land covers such as fields, roads, and buildings.
Spatial downscaling methods can be divided into physical models [7], spatio-temporal fusion models [8,9], and regression-based models [10,11,12]. Of these, regression-based modelling is the most commonly adopted downscaling method because it is easier to implement in practice than the other methods [11]. Regression-based methods aim to model the relationship between input predictors (e.g., spectral indices, terrain factors, land use and land cover information) and the target variable (LST). The fitted model can then be used to estimate fine-scale LST when fine-scale predictors are used as input.
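The regression-based workflow described above (fit at the coarse scale, predict at the fine scale) can be sketched as follows. This is a minimal illustration on synthetic data, not the implementation used in this paper: ordinary least squares stands in for the regressor, and the helper names (`block_mean`, `fit_linear`, `predict_linear`, `downscale`) and the aggregation factor are assumptions made for the example.

```python
import numpy as np

def block_mean(img, factor):
    """Aggregate a 2-D array to a coarser grid by averaging factor x factor blocks."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def fit_linear(X, y):
    """Ordinary least squares with an intercept (stand-in for any regressor)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Apply the fitted linear model to a (samples, features) matrix."""
    A = np.column_stack([X, np.ones(len(X))])
    return A @ coef

def downscale(coarse_lst, fine_predictors, factor):
    """Fit predictors -> LST at the coarse scale, then predict at the fine scale.

    fine_predictors has shape (bands, H, W); coarse_lst has shape
    (H // factor, W // factor).
    """
    bands, h, w = fine_predictors.shape
    # Aggregate every predictor band to the resolution of the coarse LST.
    coarse_pred = np.stack([block_mean(b, factor) for b in fine_predictors])
    coef = fit_linear(coarse_pred.reshape(bands, -1).T, coarse_lst.ravel())
    # Apply the coarse-scale model to the fine-scale predictors.
    return predict_linear(coef, fine_predictors.reshape(bands, -1).T).reshape(h, w)

# Synthetic demonstration: two predictor bands and a linear "true" LST field.
rng = np.random.default_rng(0)
fine_predictors = rng.normal(size=(2, 40, 40))
true_fine_lst = 290.0 + 2.0 * fine_predictors[0] - 1.5 * fine_predictors[1]
coarse_lst = block_mean(true_fine_lst, 10)
fine_lst = downscale(coarse_lst, fine_predictors, 10)
```

In this toy case the relationship is exactly linear, so the fine-scale field is recovered; with real data the predictor-LST relationship is nonlinear and scale-dependent, which is one reason nonlinear regressors such as random forests are commonly preferred.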
Several studies have investigated the LST downscaling problem using regression models. These studies utilized and developed models such as DisTrad [13], TsHARP [10], NL-DisTrad [14], Geographically Weighted Regression (GWR) [15], Co-Kriging [16], Random Forest (RF) [11,12], and Support Vector Machines (SVM) [12]. Generally, irrespective of the applied method, predictors such as surface reflectance and spectral indices were derived from optical data. Such predictors rely on clear weather conditions, so the lack of cloud-free images can hinder the construction and use of downscaling models. Additionally, these predictors are ineffective at night-time due to the limitations of optical imagery. In this study, to overcome these limitations, Synthetic Aperture Radar (SAR) data are used to derive the predictors for downscaling LST images. Owing to the active imaging mode and the use of microwaves, radar signals penetrate clouds, making SAR systems capable of acquiring information from the Earth's surface during both daytime and night-time, independent of weather conditions [17]. The sensitivity of radar backscatter to surface parameters such as surface roughness, soil moisture, surface dielectric constant, and vegetation indices has long been established in several studies [18,19,20,21,22,23]. The anticipated relationship between these radar-derived surface parameters and temperature motivates exploring the use of SAR images to downscale LST products. In this paper, we investigate the main question: can SAR images be used as predictors to downscale the available coarse-resolution Thermal Infrared (TIR)-based LST products? We conduct several experimental analyses to answer this question. Random Forest (RF) regression and a Convolutional Neural Network (CNN) are considered for the downscaling task.
The rest of the paper is organized as follows: Section 2 introduces the materials and the methods, while the experimental results are given in Section 3. Discussion of the achieved results is presented in Section 4. Finally, Section 5 is dedicated to conclusions.
4. Discussion
The primary objective of this research was to evaluate the efficacy of radar-derived predictors in downscaling Land Surface Temperature (LST). The quantitative evaluation metrics presented in Table 2 demonstrate the effectiveness of using the VV and VH bands of Sentinel-1 GRD data as predictors for LST downscaling. The downscaled LST maps generated using the radar-based random forest downscaling models exhibit favorable agreement with the validation data, as depicted in Figure 4. Moreover, the incorporation of feature engineering techniques, such as including neighboring values and land cover proportion, enhances the quantitative performance of the radar-based random forest downscaling models and yields improved qualitative results. To better illustrate the effect of feature inclusion in the SAR-based downscaling framework, results are reported for two small regions in Zuid-Holland province, namely Lansingerland (the red star in Figure 7a) and Wassenaar (the red star in Figure 8a). In particular, Figure 7 shows the effect of including neighborhood values in the downscaling method with and without residual correction. As can be seen, the LST images generated with the inclusion of neighboring values (Figure 7c and Figure 7e) exhibit smoother and more gradual variations than those generated without it (Figure 7b and Figure 7d). Further, the inclusion of neighboring values also mitigates numerous false high LST estimates that may arise from elevated values of the backscattering coefficient. High backscatter values do not necessarily correspond to high LST values, as the raw radar backscatter is influenced by various variables beyond LST alone. Forested areas, for instance, are wrongly mapped to high LST values due to their high backscatter values. Furthermore, as can be seen in Figure 8, the incorporation of land cover features resolves the issue related to the forested areas. Additionally, the inclusion of land cover features assists in better boundary delineation between different land cover categories, addressing a limitation observed in the downscaled results obtained solely using predictors derived from Sentinel-1 GRD data (VV and VH). In general, comparison of the radar-based downscaled LST with the original Landsat LST image confirms the applicability of the proposed framework. Table 7 provides both non-residual-corrected and residual-corrected RMSE values for all the radar-based random forest downscaling experiments. Although the quantitative performance gains achieved through the inclusion of these features in the models are not substantial, qualitatively, these engineered features effectively address several challenges.
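The two engineered features discussed above, neighboring values and land cover proportion, can be illustrated with a short NumPy sketch. The function names, the edge-padding choice, and the toy arrays are assumptions for illustration only; the grid on which the window is applied and the exact feature layout used in the experiments are not specified in this section. The class codes 10 and 50 are the ESA WorldCover codes for tree cover and built-up.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def neighborhood_features(band, size=5):
    """Return an (H*W, size*size) matrix holding the size x size window of each pixel.

    Edges are padded by repeating the border values (an assumption).
    """
    pad = size // 2
    padded = np.pad(band, pad, mode="edge")
    windows = sliding_window_view(padded, (size, size))  # shape (H, W, size, size)
    return windows.reshape(band.shape[0] * band.shape[1], size * size)

def land_cover_proportions(classes, factor, class_ids):
    """Fraction of each land cover class inside every factor x factor block."""
    h, w = classes.shape
    blocks = classes.reshape(h // factor, factor, w // factor, factor)
    return np.stack([(blocks == c).mean(axis=(1, 3)) for c in class_ids], axis=-1)

# Toy backscatter band: each pixel gets 25 features from its 5x5 neighborhood.
vv = np.arange(36, dtype=float).reshape(6, 6)
feats = neighborhood_features(vv, size=5)

# Toy land cover map aggregated by a factor of 2 into class proportions.
classes = np.array([
    [10, 10, 50, 50],
    [10, 50, 50, 50],
    [10, 10, 10, 10],
    [50, 50, 10, 10],
])
props = land_cover_proportions(classes, factor=2, class_ids=(10, 50))
```

Column 12 of `feats` is the centre of each window, i.e., the pixel itself, so the neighborhood features extend rather than replace the pixel-to-pixel mapping. In `props`, the top-left block contains three tree cover pixels and one built-up pixel, giving proportions of 0.75 and 0.25.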
The residual correction process, as described by Equations (1) and (2), effectively addresses certain challenges mentioned above. However, it is important to acknowledge that applying this process can introduce boxy patterns in regions where the model's predictive capability is limited. This occurs because the residuals are obtained at the coarser scale, and all fine-scale pixels within a coarse pixel are adjusted by the same constant residual value. The significance of this issue varies depending on the practical use of the downscaled products. Consequently, it is generally preferable to enhance predictive power through the incorporation of additional features rather than relying heavily on the residual correction method.
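A minimal sketch of residual correction as described in this paragraph is given below. Equations (1) and (2) are not reproduced in this section, so the code follows the verbal description only: the residual is computed at the coarse scale and the same value is added to every fine pixel inside the corresponding coarse pixel. The function names and the toy data are assumptions.

```python
import numpy as np

def block_mean(img, factor):
    """Aggregate a 2-D array to a coarser grid by averaging factor x factor blocks."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def residual_correction(fine_pred, coarse_obs, factor):
    """Add the coarse-scale residual to every fine pixel of the matching coarse pixel."""
    # Residual between the observed coarse LST and the aggregated fine prediction.
    residual = coarse_obs - block_mean(fine_pred, factor)
    # Nearest-neighbour upsampling: one constant value per coarse pixel.
    residual_fine = np.kron(residual, np.ones((factor, factor)))
    return fine_pred + residual_fine

rng = np.random.default_rng(1)
fine_pred = 290.0 + rng.normal(size=(4, 4))   # toy model prediction on the fine grid
coarse_obs = np.array([[291.0, 289.5],
                       [290.2, 292.0]])       # toy observed LST on the coarse grid
corrected = residual_correction(fine_pred, coarse_obs, factor=2)
offset = corrected - fine_pred
```

After correction, the fine-scale map aggregates back exactly to the coarse observation, while `offset` is constant within each coarse pixel and jumps at coarse pixel borders. Where the residuals are large, these jumps become the boxy patterns noted above.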
For further evaluation, the radar-based random forest downscaling model is compared with the optical-based random forest downscaling model (Table 2 and Table 3, Figure 4 and Figure 5). In terms of quantitative metrics, the optical-based random forest downscaling model exhibits slightly superior performance compared to its radar-based counterpart. Several factors may contribute to the optical dataset outperforming the radar dataset. Firstly, as discussed earlier, radar images typically display more pixel value variations over local regions, whereas optical bands exhibit smoother variations in pixel values. Although efforts were made to address this issue by incorporating neighboring values of radar bands as features, datasets with inherently smooth patterns generally yield better results than those without such patterns. Additionally, the boundaries visible in the pixel values of optical images between different features in the study area, such as urban and green areas, align with the spatial patterns of the observed differences in LST values. The same cannot be observed for radar images. To mitigate this issue, information from the land cover image was integrated into the model, which successfully addressed the problem. However, integrating land cover information with optical data may result in even better optical-based downscaling models, as emphasized in other studies [11,12]. Another factor that may contribute to the slightly better performance of the optical dataset is the difference in sensor characteristics between radar and optical data, particularly the disparity in viewing geometry. Sentinel-1 SAR satellites collect images using a side-looking geometry, whereas both Landsat-8 LST and Sentinel-2 optical data are acquired using a nadir-looking geometry, which may introduce some spatial pattern mismatches.
From examining Table 2 and Table 3, the downscaling models for the dates of 10th April 2020 and 28th May 2020 exhibit poorer RMSE values than the model for 25th March 2020. This discrepancy could be attributed to the greater temporal difference between the acquisition of LST and radar images for the dates of 10th April 2020 and 28th May 2020 (1 day) compared to that of 25th March 2020 (same day).
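For reference, the metrics compared above can be computed from a validation raster and a downscaled raster as sketched below. This is an illustrative helper rather than the evaluation code used in this study. In particular, R2 is taken here as the square of the correlation coefficient, which appears consistent with the tabulated values but is an assumption, and the function names are hypothetical.

```python
import numpy as np

def evaluation_metrics(validation, downscaled):
    """RMSE, Pearson correlation r, and R2 (assumed here to be r squared)."""
    v = np.asarray(validation, dtype=float).ravel()
    d = np.asarray(downscaled, dtype=float).ravel()
    rmse = float(np.sqrt(np.mean((d - v) ** 2)))
    r = float(np.corrcoef(v, d)[0, 1])
    return {"rmse": rmse, "r": r, "r2": r ** 2}

def rmse_by_class(validation, downscaled, classes):
    """RMSE computed separately over the pixels of each land cover class."""
    v = np.asarray(validation, dtype=float).ravel()
    d = np.asarray(downscaled, dtype=float).ravel()
    c = np.asarray(classes).ravel()
    return {int(k): float(np.sqrt(np.mean((d[c == k] - v[c == k]) ** 2)))
            for k in np.unique(c)}

# Toy example with four pixels and two land cover classes.
validation = np.array([290.0, 292.0, 294.0, 296.0])
downscaled = np.array([291.0, 291.0, 295.0, 295.0])
classes = np.array([10, 10, 50, 50])

metrics = evaluation_metrics(validation, downscaled)
per_class = rmse_by_class(validation, downscaled, classes)
```

In this toy case every pixel is off by exactly one unit, so the overall and per-class RMSE are both 1.0, and r squared evaluates to 0.8.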
In addition to the conventional random forest downscaling model, this study also introduced a CNN-based regression architecture for LST downscaling. The primary objective was to establish an end-to-end mapping between coarse-scale target values and fine-scale predictor values. As indicated in Table 6, the CNN-based architecture exhibits slightly better quantitative performance than the random forest model when utilizing the same optical input. However, the difference in performance is not substantial. From a qualitative standpoint, the CNN-based downscaling model preserves the structural characteristics of features within the study area, in contrast to the RF-based model. Consequently, there are minimal drastic variations observed in the CNN-based downscaled LST values across local regions (Figure 6e and Figure 6h). When using VV, VH, and land cover as predictors without residual correction, the evaluation metrics for the CNN-based downscaling model are inferior to those of the RF-based downscaling model (Table 8). For optical data, the evaluation metrics for the RF and CNN models are the same before residual correction. The discrepancy observed for the radar predictors can be attributed, in part, to the fact that the feature inputs of the radar-based RF downscaling model, i.e., the neighbors and the land cover proportion, do not correspond to the features learned by the convolution layers. Consequently, it may be erroneous to directly compare these models. However, this discrepancy could also be attributed to the fact that, given the nature of the developed architecture, any input image with noisy spatial patterns, such as radar, would tend to produce poor estimates. Specifically, the proposed CNN architecture attempts to map the intrinsic fine structure within a coarse-resolution pixel to the value of the corresponding coarse-resolution pixel. Consequently, inputs lacking a smoothly varying pattern can make it difficult for the proposed CNN downscaling architecture to identify appropriate features. If this hypothesis holds true, it highlights a limitation in the developed architecture and suggests the need for modifications. Potential modifications could involve integrating a fully connected network that functions as a traditional downscaling regression algorithm, where the inputs consist of the coarse-resolution predictors and target images. However, definitive conclusions cannot be drawn without conducting further experimentation.
5. Conclusion
This study introduced a new way to estimate land surface temperature using synthetic aperture radar data. Two different machine learning techniques, i.e., Random Forest and Convolutional Neural Network, were tested for downscaling the coarse-resolution LST image from 1000 m to 100 m. The information from Sentinel-1 SAR images served as predictors and was fed into the models to obtain high-resolution LST images. The achieved results were evaluated against the original LST information from Landsat-8 images at 100 m spatial resolution. Additionally, the performance of the proposed radar-based downscaling method was compared with the optical-based downscaling method. Remarkably, despite the inherent limitations of radar data such as geometric distortion, the performance of the downscaling models built with radar predictors was comparable to that of the models constructed with optical predictors. A notable advantage of radar data over optical data is, however, its weather independence, which makes the downscaling models similarly unaffected by weather conditions. Additionally, including information from fully polarimetric SAR images, such as the second-order covariance matrix, may help improve the performance of the proposed radar-based downscaling method. Finally, one of the main challenges in generating high-resolution LST data from coarse-scale predictors is the temporal aspect. While radar images have shown promising results for spatial downscaling, they are not suitable for accurate temporal estimation. Temporal variations in LST values may not align with changes in predictors derived from Sentinel-1 GRD intensity images. To address this issue, incorporating phase information from the Sentinel-1 Single Look Complex (SLC) product may help capture the temporal disparities in LST values over time. This presents an avenue for future research to explore and overcome this limitation.
Figure 1.
Map showing the study area: Zuid-Holland province (left) and various municipalities in Zuid-Holland province (right).
Figure 2.
(a) Representation of the neighboring values integration procedure: connection 'A' represents the traditional pixel-to-pixel mapping, whereas connection 'B' represents the neighbourhood-to-pixel mapping between the predictor image Pc and the target image Tc. (b) Representation of the land cover proportion integration procedure: Po refers to the predictor land cover image at the original 10 m resolution, whereas Tc refers to the target values at 1000 m.
Figure 3.
Developed model architecture for CNN-based downscaling experiments.
Figure 4.
Results of the radar-based RF downscaling experiment (SAR image acquisition date: 25th March 2020). (a) the original Landsat-8 validation LST (100 m), (b) downscaled LST (100 m) without neighbors, (c) downscaled LST (100 m) with 5x5 neighbors, (d) downscaled LST (100 m) with 5x5 neighbors and land cover. The second row shows the scatter plots between the validation Landsat-8 LST and the downscaled LST (e) without neighbors, (f) with 5x5 neighbors, (g) with 5x5 neighbors and land cover.
Figure 5.
Results of the optical based RF downscaling experiments (optical image acquisition date: 26th March 2020). (a) downscaled LST (100 m) using six spectral channels, (b) scatter plot of downscaled image versus the validation Landsat-8 LST image (100 m).
Figure 6.
Results from the CNN-based downscaling experiments. First row displays Landsat-8 validation LST (100 m) for different dates. Second row shows generated downscaled LST for different dates using CNN. The input predictors in (e, f, & g) are VV, VH, and land cover, while in (h) is optical data. The third row displays the scatterplot with regression line between the downscaled (second row) and validation LST (first row) at 100 m.
Figure 7.
Effect of neighborhood feature inclusion on the downscaled LSTs. (a) area of interest: Lansingerland. (b and d) show the downscaled LSTs without inclusion of neighbors and (c and e) with inclusion of 5x5 neighbors. Here, (b and c) are the downscaled LSTs before residual correction and (d and e) after residual correction.
Figure 8.
Effect of land cover feature inclusion on the downscaled LSTs. (a) area of interest: Wassenaar, (b) shows the mask image of the tree cover class in green, (c and f) show the downscaled LSTs with inclusion of 5x5 neighbors, and (d and g) with inclusion of 5x5 neighbors and land cover. (e and h) show the corresponding Landsat-8 validation LST. Here, (c and d) are the downscaled LSTs before residual correction and (f and g) after residual correction.
Table 1.
Overview of the datasets used for this study.
| Dataset [spatial resolution], [Google Earth Engine Tag] | Acquisition Date [Time] | Use case |
| Landsat-8 LST [60 m], [“LANDSAT/LC08/C02/T1_L2”] | 25-03-2020 [10:33], 10-04-2020 [10:33], 28-05-2020 [10:33] | Target variable (Aggregated 1000 m LST), Validation data (Original 100 m LST) |
| Sentinel-1 SAR [10 m], [“COPERNICUS/S1_GRD”] | 25-03-2020 [17:25], 11-04-2020 [17:33], 29-05-2020 [17:33] | Predictor variable |
| ESA WorldCover v100 [10 m], [“ESA/WorldCover/v100”] | One image for the entire year of 2020 | Predictor variable |
| Sentinel-2 MSI [10 m], [“COPERNICUS/S2_SR_HARMONIZED”] | 26-03-2020 [10:46] | Predictor variable |
Table 2.
Evaluation metrics for the radar- and optical-based downscaling experiments.
| | Radar | Radar | Radar | Radar | Radar | Radar | Radar | Radar | Radar | Optical |
| | 25/03/2020 | 25/03/2020 | 25/03/2020 | 10/04/2020 | 10/04/2020 | 10/04/2020 | 28/05/2020 | 28/05/2020 | 28/05/2020 | 25/03/2020 |
| | Without neighbors | With 5x5 neighbors | 5x5 neighbors & land cover | Without neighbors | With 5x5 neighbors | 5x5 neighbors & land cover | Without neighbors | With 5x5 neighbors | 5x5 neighbors & land cover | With six bands |
| RMSE | 1.44 | 1.25 | 1.21 | 2.10 | 1.93 | 1.70 | 2.76 | 2.61 | 2.55 | 1.12 |
| Correlation Coefficient (r) | 0.89 | 0.92 | 0.93 | 0.94 | 0.95 | 0.96 | 0.89 | 0.91 | 0.91 | 0.93 |
| Coefficient of Determination (R2) | 0.80 | 0.84 | 0.86 | 0.88 | 0.90 | 0.92 | 0.80 | 0.82 | 0.83 | 0.87 |
Table 3.
RMSE of all the downscaling experiments by each land cover class in the study area.
| | Radar predictors | Radar predictors | Radar predictors | Radar predictors | Radar predictors | Radar predictors | Radar predictors | Radar predictors | Radar predictors | Optical predictors |
| | 25/03/2020 | 25/03/2020 | 25/03/2020 | 10/04/2020 | 10/04/2020 | 10/04/2020 | 28/05/2020 | 28/05/2020 | 28/05/2020 | 25/03/2020 |
| | Without neighbors | With 5x5 neighbors | 5x5 neighbors & land cover | Without neighbors | With 5x5 neighbors | 5x5 neighbors & land cover | Without neighbors | With 5x5 neighbors | 5x5 neighbors & land cover | With 6 optical bands |
| Tree Cover | 1.73 | 1.51 | 1.20 | 2.30 | 2.19 | 1.67 | 3.19 | 3.14 | 3.31 | 1.22 |
| Shrubland | 3.11 | 3.03 | 2.29 | 4.41 | 4.48 | 2.48 | 6.52 | 6.60 | 3.24 | 2.10 |
| Grassland | 1.14 | 1.00 | 0.96 | 1.67 | 1.56 | 1.41 | 2.35 | 2.19 | 2.07 | 0.95 |
| Cropland | 1.61 | 1.42 | 1.05 | 2.55 | 2.18 | 1.70 | 3.34 | 3.32 | 3.41 | 0.96 |
| Built-Up | 1.83 | 1.62 | 1.98 | 2.25 | 2.22 | 2.29 | 2.81 | 2.76 | 3.21 | 1.68 |
| Bare/sparse vegetation | 2.81 | 2.58 | 2.10 | 3.22 | 3.14 | 2.81 | 4.85 | 4.47 | 4.30 | 1.81 |
| Permanent water bodies | 1.37 | 1.09 | 1.03 | 2.31 | 2.01 | 1.73 | 2.77 | 2.45 | 1.79 | 0.92 |
| Herbaceous wetland | 1.93 | 1.65 | 1.37 | 2.88 | 2.68 | 1.91 | 3.39 | 3.07 | 2.36 | 1.68 |
Table 4.
Evaluation metrics of the CNN-based downscaling experiments.
| | With VV, VH, and land cover | With VV, VH, and land cover | With VV, VH, and land cover | With six optical bands |
| | 25/03/2020 | 10/04/2020 | 28/05/2020 | 25/03/2020 |
| RMSE | 1.04 | 1.44 | 2.23 | 1.09 |
| Correlation Coefficient (r) | 0.94 | 0.97 | 0.93 | 0.94 |
| Coefficient of Determination (R2) | 0.89 | 0.94 | 0.87 | 0.88 |
Table 5.
RMSE of the CNN-based downscaling experiments for each land cover.
| | With VV, VH, and land cover | With VV, VH, and land cover | With VV, VH, and land cover | With six optical bands |
| | 25/03/2020 | 10/04/2020 | 28/05/2020 | 25/03/2020 |
| Tree Cover | 1.09 | 1.50 | 3.12 | 1.18 |
| Shrubland | 2.18 | 2.60 | 3.45 | 2.36 |
| Grassland | 0.87 | 1.18 | 1.97 | 0.92 |
| Cropland | 0.99 | 1.53 | 2.18 | 1.04 |
| Built-Up | 1.55 | 1.83 | 2.52 | 1.58 |
| Bare/sparse vegetation | 2.16 | 2.60 | 3.71 | 2.04 |
| Permanent water bodies | 0.85 | 1.42 | 1.86 | 0.89 |
| Herbaceous wetland | 1.50 | 3.21 | 3.66 | 1.57 |
Table 6.
Evaluation metrics of the optical-based RF and CNN downscaling models.
| | With six optical bands | With six optical bands |
| | RF | CNN |
| RMSE | 1.12 | 1.09 |
| Correlation Coefficient (r) | 0.93 | 0.94 |
| Coefficient of Determination (R2) | 0.87 | 0.88 |
Table 7.
RMSE before and after residual correction of radar-based RF downscaling experiments.
| | Without neighbors | Without neighbors | With 5x5 neighbors | With 5x5 neighbors | With 5x5 neighbors and land cover | With 5x5 neighbors and land cover |
| | Before residual correction | After residual correction | Before residual correction | After residual correction | Before residual correction | After residual correction |
| 25/03/2020 | 1.80 | 1.44 | 1.63 | 1.25 | 1.41 | 1.21 |
| 10/04/2020 | 2.73 | 2.10 | 2.56 | 1.93 | 2.20 | 1.70 |
| 28/05/2020 | 3.51 | 2.76 | 3.38 | 2.61 | 3.19 | 2.55 |
Table 8.
RMSE values for RF and CNN downscaling models before and after residual correction.
| | RF | RF | CNN | CNN |
| | Before residual correction | After residual correction | Before residual correction | After residual correction |
| Predictors derived from VV, VH and land cover | 1.41 | 1.21 | 2.09 | 1.04 |
| Predictors derived from six bands of optical imagery | 1.39 | 1.12 | 1.39 | 1.09 |