2.1. Existing Lead Classification Method
In this section, we briefly explain the original algorithm discussed in [28] and then describe in detail the improved lead detection algorithm presented in this study. The previously published method for the detection of leads is based on texture analysis of Sentinel-1 SAR images taken in the Extra Wide (EW) swath mode with a pixel size of 40 m. The scenes downloaded from the Copernicus Open Access Hub (https://scihub.copernicus.eu) are calibrated to σ0 backscatter values with the calibration and noise look-up tables provided in the metadata of the SAR datasets. Since sea ice backscatter depends on the incidence angle, an incidence angle correction with the coefficient 0.213 is applied to the HH channel. Then the polarization ratio HH/HV is calculated. Throughout the next steps of the classification, the two images, (i) the HH channel and (ii) the polarization ratio, are considered separately. At the end, the final classification result is calculated as the combination of the two branches.
Figure 1 shows an example of the individual lead classification steps and the final binary lead classification result.
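As a minimal illustration of these preprocessing steps (not the exact implementation), the sketch below applies the incidence angle correction and computes the polarization ratio in dB. It assumes a linear correction of 0.213 dB per degree relative to a reference angle, and `sigma0_hh`, `sigma0_hv`, `inc_angle`, and `ref_angle` are hypothetical inputs, since the exact formulation is not given here:

```python
import numpy as np

def preprocess_channels(sigma0_hh, sigma0_hv, inc_angle, ref_angle=30.0, coef=0.213):
    """Sketch: incidence angle correction of the HH channel and HH/HV ratio.

    sigma0_hh, sigma0_hv : calibrated backscatter in linear units
    inc_angle            : per-pixel incidence angle in degrees
    ref_angle            : assumed reference angle for the correction
    """
    eps = 1e-10                               # avoid log of zero over very dark pixels
    hh_db = 10.0 * np.log10(sigma0_hh + eps)  # convert to dB
    hv_db = 10.0 * np.log10(sigma0_hv + eps)

    # Assumed linear incidence angle correction: 0.213 dB per degree (HH channel only).
    hh_db_corr = hh_db + coef * (inc_angle - ref_angle)

    # Polarization ratio HH/HV; in dB this is a difference of the two channels.
    pol_ratio_db = hh_db_corr - hv_db
    return hh_db_corr, pol_ratio_db
```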
In short, the existing lead classification from [28] consists of the following steps: A bilateral filter is applied to the images to reduce speckle noise [31]. After that, texture descriptors based on the gray-level co-occurrence matrix (GLCM) are calculated for 9 × 9 pixel windows surrounding every second pixel of the scene in each direction, with a co-occurrence distance of 1 pixel [32]. As a result, the resolution of the texture descriptors is two times coarser than that of the input Sentinel-1 scene. The window size is set to 9 pixels in order to collect enough texture statistics on the one hand, and to be able to resolve leads of several pixels width on the other hand. Increasing the window size would increase the minimal lead size that can be detected, whereas decreasing it would increase ambiguities in the texture descriptors. In the next step, the texture descriptors are passed as input features to a Random Forest classifier [33,34]. The classifier calculates, for each pixel, the probability of representing part of a lead based on the texture of the pixel's surroundings. The resolution (pixel size) of the result is 80 m. In the Murashkin et al. article [28], describing the first version of the lead detection algorithm, we suggested using a threshold, e.g., 50%, on the probabilities to produce a binary lead map.
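As an illustration of this processing chain (not the implementation used in [28]), the sketch below filters an 8-bit image with a bilateral filter, computes GLCM descriptors for 9 × 9 windows on an every-second-pixel grid with a 1-pixel co-occurrence distance, and classifies them with a random forest; the filter parameters and the descriptor set are assumptions:

```python
import numpy as np
import cv2
from skimage.feature import graycomatrix, graycoprops
from sklearn.ensemble import RandomForestClassifier

WIN = 9     # 9 x 9 pixel texture window (see text)
STEP = 2    # descriptors for every second pixel -> result twice as coarse (80 m)
PROPS = ("contrast", "homogeneity", "energy", "correlation")  # assumed descriptor set

def glcm_feature_map(image_8bit):
    """GLCM texture descriptors on a coarse grid (sketch, expects a uint8 image)."""
    smoothed = cv2.bilateralFilter(image_8bit, d=5, sigmaColor=25, sigmaSpace=5)
    half = WIN // 2
    rows = np.arange(half, smoothed.shape[0] - half, STEP)
    cols = np.arange(half, smoothed.shape[1] - half, STEP)
    feats = np.zeros((rows.size, cols.size, len(PROPS)), dtype=np.float32)
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            win = smoothed[r - half:r + half + 1, c - half:c + half + 1]
            # Co-occurrence matrix at 1-pixel distance, four directions.
            glcm = graycomatrix(win, distances=[1],
                                angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                                levels=256, symmetric=True, normed=True)
            feats[i, j] = [graycoprops(glcm, p).mean() for p in PROPS]
    return feats

# Training and prediction with a random forest (X, y are labeled feature vectors):
# clf = RandomForestClassifier(n_estimators=100).fit(X, y)
# lead_prob = clf.predict_proba(feats.reshape(-1, len(PROPS)))[:, 1]
# binary_leads = lead_prob > 0.5   # e.g., 50% threshold as suggested in [28]
```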
Based on this previous work, we here propose an improved Sentinel-1 lead detection algorithm. The first update to the algorithm is an improved Sentinel-1 EW preprocessing procedure that balances the backscatter between sub-swaths, i.e., the brightness of the SAR image (Section 2.2). The second update is the use of a U-Net convolutional neural network instead of the GLCM feature calculation and random forest classification (Section 2.3).
2.2. Improved Sentinel-1 Preprocessing
The previously suggested preprocessing algorithm used a scalloping noise correction (provided within the Sentinel-1 metadata since 13 March 2018). However, the backscatter corrected with the thermal noise vectors provided with the default Sentinel-1 auxiliary data shows discrepancies at the sub-swath boundaries, which are especially pronounced on the cross-polarization channel; this issue has been addressed by [35,36,37]. To correct for the discrepancies, we apply a sub-swath balancing technique that brings the Normalized Radar Cross Section (NRCS) values within the different sub-swaths to a uniform scale. The thermal noise data provided in the Sentinel-1 auxiliary data show discrepancies at the sub-swath borders. We assume that the range dependence of the thermal noise within a sub-swath has little discrepancy, whereas its absolute value has some error and can be corrected with a scale factor $k_i$:

$$ n_i^{\mathrm{cor}} = k_i \, n_i^{\mathrm{raw}}, $$

where $i$ denotes the $i$-th sub-swath and the superscripts "cor" and "raw" denote the corrected and the raw value, respectively; $n_i^{\mathrm{cor}}$ is the corrected noise value in sub-swath $i$, and $n_i^{\mathrm{raw}}$ is the raw noise value in sub-swath $i$, provided as look-up tables in the auxiliary data. To ensure that land does not introduce a variation significantly larger than that of sea ice, the maximal NRCS values are limited to 700 for the HH channel and 300 for the HV channel for the sub-swath balancing coefficient calculation. We assume Sentinel-1 scenes to be homogeneous at the sub-swath borders; therefore, the average NRCS values should not change drastically from one sub-swath to the next. Thus, the SAR NRCS values averaged in the along-track direction should be continuous in the range direction:

$$ \left\langle \mathrm{DN}_i^2 \right\rangle - k_i \, n_i^{\mathrm{raw}} = \left\langle \mathrm{DN}_{i+1}^2 \right\rangle - k_{i+1} \, n_{i+1}^{\mathrm{raw}}, $$
where $\left\langle \mathrm{DN}_i^2 \right\rangle$ is the squared Digital Number (the value provided in the Sentinel-1 Level-1 product before calibration) averaged in the along-track direction at the transition from sub-swath $i$ to sub-swath $i+1$. The 5th sub-swath is taken as reference and its scale factor is 1; the 1st to 4th sub-swaths are brought to the NRCS values of the 5th sub-swath, see Figure 2e,f. Then, the noise scale factor can be calculated with

$$ k_i = \frac{\left\langle \mathrm{DN}_i^2 \right\rangle - \left\langle \mathrm{DN}_{i+1}^2 \right\rangle + k_{i+1} \, n_{i+1}^{\mathrm{raw}}}{n_i^{\mathrm{raw}}}, \qquad k_5 = 1. $$

Scaling the thermal noise of the adjacent sub-swaths by $k_i$ allows us to produce a more uniform Sentinel-1 image, which improves the robustness of the image classification. An illustration of the preprocessed Sentinel-1 scenes is shown in
Figure 2.
Figure 2a,b show two Sentinel-1 EW HV channel images preprocessed with the procedure suggested by ESA, i.e., using the noise tables provided in the Sentinel-1 metadata. The different sub-swaths are still noticeable, especially sub-swath 1 (left part of the images). The same images with the sub-swath balancing applied are shown in Figure 2c,d.
Figure 2e,f show the NRCS values averaged along the azimuth direction. The abrupt changes in average brightness between sub-swaths with the basic noise correction applied (blue line) are eliminated when the sub-swath balancing is applied (orange line).
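A minimal sketch of the balancing coefficient calculation under these assumptions is given below; the sub-swath border indices, the interpolated noise array, and the point at which bright (land) pixels are capped are simplifications rather than the published implementation:

```python
import numpy as np

def subswath_scale_factors(dn, noise, borders, max_val=700.0):
    """Sketch of the sub-swath balancing coefficients k_i (HH shown; use 300 for HV).

    dn      : 2D array of Sentinel-1 digital numbers (azimuth x range)
    noise   : 2D thermal noise interpolated from the annotation look-up tables
    borders : range indices of the four EW sub-swath transitions (hypothetical input)
    Returns one scale factor per sub-swath, with the 5th sub-swath as reference.
    """
    # Cap bright (land) pixels; the text limits NRCS to 700 (HH) / 300 (HV),
    # applied here to DN^2 as a simplification.
    dn2 = np.minimum(dn.astype(np.float64) ** 2, max_val)

    k = np.ones(5)                    # k_5 = 1: the last sub-swath is the reference
    for i in range(3, -1, -1):        # sub-swaths 4 -> 1 (0-based indices 3 -> 0)
        b = borders[i]                # range index of the i / i+1 transition
        # Along-track averages just inside each neighboring sub-swath.
        dn2_left, dn2_right = dn2[:, b - 1].mean(), dn2[:, b].mean()
        n_left, n_right = noise[:, b - 1].mean(), noise[:, b].mean()
        # Continuity condition <DN^2>_i - k_i n_i = <DN^2>_{i+1} - k_{i+1} n_{i+1},
        # solved for k_i.
        k[i] = (dn2_left - dn2_right + k[i + 1] * n_right) / n_left
    return k

# The scaled noise k_i * n_i is then used in the calibration, yielding the
# balanced NRCS profiles illustrated in Figure 2e,f.
```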
2.3. Improved Lead Detection
For lead detection, we use the U-Net convolutional neural network introduced in [29]. It is a multi-level encoder-decoder architecture in which encoder and decoder are connected at every level. We increased the depth of the encoder and the decoder to six levels, compared to the four levels suggested in the original study; the algorithm is schematically shown in Figure 3. This is done to increase the "field of view" of the stack of convolution layers (i.e., its effective receptive field). A further increase would require a larger input tile size and would increase the computational complexity of the algorithm. A preprocessed Sentinel-1 image is split into 512-pixel square patches, and the input for the model consists of a stack of these patches. Each patch has two channels, the HH channel and the HV channel, and therefore the input has the dimension 512 × 512 × 2 pixels. Both channels are normalized to [−1, 1] using fixed thresholds: … dB and 4 dB for the HH channel, and … dB and … dB for the HV channel. Values below the minimum and above the maximum are cut off. This range covers the backscatter of most sea ice types, except perhaps some ridges, which may have higher backscatter values; however, this should not affect the lead detection. The same thresholds are used for all scenes to ensure the consistency of the normalized images. Every block of the diagram in Figure 3 consists of a 50% dropout layer [38] followed by two convolutional layers with a 3 × 3 convolution kernel and a ReLU activation function [39]. To retain the original image size, the convolutional layers are applied with "same" padding. Red arrows represent max pooling layers, which decrease the image size by a factor of two by keeping only the maximal value in every block of 2 × 2 pixels. Green arrows show upscaling convolution layers, where the image size is doubled. The output layer includes kernel regularization [40] and a softmax activation function and provides the output class probabilities.
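A sketch of such a six-level U-Net, written here with Keras, is given below. The clipping-and-scaling helper, the filter counts, and the L2 kernel regularization are assumptions used for illustration, since these values (and the thresholds left out above) are not fully specified in this excerpt:

```python
import numpy as np
from tensorflow.keras import layers, models, regularizers

def normalize(channel_db, vmin_db, vmax_db):
    """Clip a dB channel to [vmin_db, vmax_db] and scale it to [-1, 1]."""
    clipped = np.clip(channel_db, vmin_db, vmax_db)
    return 2.0 * (clipped - vmin_db) / (vmax_db - vmin_db) - 1.0

def conv_block(x, filters):
    """One block of Figure 3: 50% dropout followed by two 3x3 convolutions with ReLU."""
    x = layers.Dropout(0.5)(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

def build_unet(depth=6, base_filters=16, n_classes=3):
    """Six-level encoder-decoder with skip connections (filter counts are assumptions)."""
    inputs = layers.Input(shape=(512, 512, 2))        # HH and HV patches
    skips, x = [], inputs
    for level in range(depth):                        # encoder: conv block + 2x2 max pooling
        x = conv_block(x, base_filters * 2 ** level)
        skips.append(x)                               # kept for the skip connection
        x = layers.MaxPooling2D(2)(x)
    x = conv_block(x, base_filters * 2 ** depth)      # bottleneck
    for level in reversed(range(depth)):              # decoder: upscale + concat + conv block
        x = layers.Conv2DTranspose(base_filters * 2 ** level, 2, strides=2, padding="same")(x)
        x = layers.concatenate([x, skips[level]])
        x = conv_block(x, base_filters * 2 ** level)
    outputs = layers.Conv2D(n_classes, 1, activation="softmax",
                            kernel_regularizer=regularizers.l2(1e-4))(x)
    return models.Model(inputs, outputs)
```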
The training data annotations are converted to one-hot labels, so that every pixel has a value for every label type representing the weight of that label for the pixel. In order to balance the labels, the one-hot label values are weighted by the number of samples in the corresponding class. As a result, the contribution of each class to the loss function is the same, even if the number of samples per class differs. In addition to the three classes "dark leads", "bright leads", and "sea ice", we also introduce an extra, fourth class for unlabeled data. The one-hot label value for the unlabeled data class is 0; therefore, it has no influence on the loss value and does not affect the training process. At the same time, it allows us to label objects of irregular shape and provides a certain flexibility in the labeling process: on the one hand, not every pixel has to be labeled; on the other hand, small leads can be labeled even if they are smaller than the typical box size in the case of box labeling.
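The sketch below illustrates this label preparation; the class indices and the exact form of the inverse-frequency weighting are assumptions:

```python
import numpy as np

def weighted_one_hot(labels, n_classes=3):
    """Weighted one-hot labels as described above (sketch).

    labels : integer map, assumed as 0 = sea ice, 1 = dark lead, 2 = bright lead,
             3 = unlabeled. Labeled classes receive inverse-frequency weights so
             that each class contributes equally to the loss; unlabeled pixels
             keep an all-zero row and therefore do not affect training.
    """
    one_hot = np.zeros(labels.shape + (n_classes,), dtype=np.float32)
    counts = np.array([(labels == c).sum() for c in range(n_classes)], dtype=np.float64)
    counts[counts == 0] = 1                      # avoid division by zero
    weights = counts.sum() / counts              # more samples -> smaller weight
    for c in range(n_classes):
        one_hot[labels == c, c] = weights[c]
    return one_hot
```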
Splitting images into tiles leads to the appearance of edge effects at the patch edges in the corresponding classified image. This happens due to a lack of context around pixels at, or close to, the tile edges. To reduce this effect, we apply the classification four times to every image. Each time, the preprocessed Sentinel-1 image is split into tiles with an offset as shown in Figure 4. The frames in black, red, green, and blue correspond to 0% (no offset), 25%, 50%, and 75% relative offset. Every pixel of a classified tile is weighted linearly between zero and one by its distance from the edge (weight 0) to the middle (weight 1) of the tile. This way, the classification result for a pixel in the middle of a tile has a higher weight than that for a pixel at the edge of the tile. The four weighted probabilities (one per offset) are summed up to form the final classification result. For instance, the pixel shown in purple in the input scene in Figure 4 appears in the middle of the red tile (split into tiles with 25% offset), closer to an edge of the green and black tiles (split into tiles with 50% and 0% offset), and at the edge of the blue tile (75% offset). Therefore, the lead probability produced by the model applied to the red tile has the highest contribution to the final result, while the lead probability produced by the model applied to the blue tile has the lowest contribution. The influence of near-border pixels on the per-pixel classification is thus decreased.
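The overlapping-tile inference can be sketched as below; `model` is assumed to be the trained network from Section 2.3, and normalizing the summed probabilities by the summed weights is added here so the blended result remains a probability:

```python
import numpy as np

TILE = 512

def edge_weight(tile_size=TILE):
    """Per-pixel weight: 0 at the tile edge, 1 in the tile center, linear in between."""
    ramp = np.minimum(np.arange(tile_size), np.arange(tile_size)[::-1]).astype(np.float64)
    ramp /= ramp.max()
    return np.minimum.outer(ramp, ramp)

def classify_with_offsets(scene, model, n_classes=3):
    """Blend four classification passes with 0/25/50/75 % tile offsets (sketch).

    scene : preprocessed image of shape (H, W, 2), H and W multiples of TILE
    model : maps a (1, TILE, TILE, 2) patch to (1, TILE, TILE, n_classes) probabilities
    """
    h, w, _ = scene.shape
    w2d = edge_weight()[..., None]                        # (TILE, TILE, 1) weight map
    prob = np.zeros((h, w, n_classes))
    norm = np.zeros((h, w, 1))
    for off in (0, TILE // 4, TILE // 2, 3 * TILE // 4):  # the four relative offsets
        for r in range(off, h - TILE + 1, TILE):
            for c in range(off, w - TILE + 1, TILE):
                patch = scene[r:r + TILE, c:c + TILE]
                p = model.predict(patch[None])[0]         # per-pixel class probabilities
                prob[r:r + TILE, c:c + TILE] += w2d * p   # weight by distance to tile edge
                norm[r:r + TILE, c:c + TILE] += w2d
    return prob / np.maximum(norm, 1e-6)                  # weighted average of the passes
```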
The output of the model consists of three channels with probabilities for the three surface classes: sea ice, dark lead, and bright lead. Dark leads are leads appearing dark on the HH channel due to their smooth surface; they represent leads with calm open water or refreezing leads with a thin, smooth layer of sea ice. Bright leads appear bright on the HH channel, but typically dark on the HV channel; they correspond to a wind-roughened water surface or thin sea ice with a rough surface. The training data consist of 21 manually labeled Sentinel-1 images, collected over Fram Strait, the Beaufort Sea, the Barents Sea, and the Kara Sea, with dark and bright leads marked with different labels. The data set is split such that 80% is used as the training subset and 20% as the test subset.