1. Introduction
Located in northwest China, the Loess Plateau is a typical traditional agricultural area with scarce precipitation, where agricultural development is dominated by dry farming [1]. The Loess Plateau of eastern Gansu Province, a typical rain-fed agricultural region, lies in an ecologically fragile zone of China, where precipitation is unevenly distributed in time and space, with large interannual variability and an unbalanced distribution within the year. Corresponding to the gradual increase in precipitation from northwest to southeast, the crop distribution in this region shows an obvious gradient from southeast to northwest [2]. For example, apples, which need plenty of sunshine and moisture, are mainly grown in the southeast, while buckwheat, which is cold- and drought-tolerant and has a short growing period, is mainly grown in the northwest. Corn and wheat are the main crop types in the Loess Plateau of eastern Gansu Province, accounting for more than 50% of the total planted area; they are distributed across the plateau, more densely in the southeast and less so in the northwest. Fine classification of the crop planting structure is of great significance for agricultural decision-making, sustainable agricultural development and food security [3].
Boasting a short revisit period, low cost, wide coverage and many other advantages, remote sensing has become one of the most popular methods for mapping crop spatial distribution at the regional scale [4,5,6]. At present, scholars at home and abroad have conducted in-depth and extensive studies on crop planting structure recognition, mainly around key issues such as the number of remote sensing images, image object features [7,8,9,10], and classifier selection [11,12].
Among classification methods based on the number of remote sensing images, crop recognition methods fall roughly into two categories, based on single-temporal and time-series images, respectively. The former identifies the key growing season of a crop [13] and then obtains its spatial distribution information [14,15,16]. At present, medium-spatial-resolution images represented by Landsat and Sentinel are among the most widely used remote sensing data [17]. With suitable spatial resolution and spectral bands, they contribute to increased accuracy of regional crop identification. However, because the crop spectrum is affected by many factors, such as crop type, soil background, farming activities, and cloudy and rainy weather, and the information obtained from a single-temporal image is limited, misclassification and omission frequently appear in crop identification results, which cannot meet the needs of high-precision crop mapping. Compared with single-temporal classification, multi-temporal remote sensing images can map crops more accurately by making full use of crop phenological characteristics. China has implemented a major project for a high-resolution Earth observation system and successfully launched a series of domestic high-resolution satellites (GF-1, GF-2 and GF-6) [18,19]. Collaboration among these satellites can effectively improve crop monitoring capability and offers great potential for fine identification of the fragmented crops in complex areas of the Loess Plateau. For example, Zhou et al. [20] found that multi-temporal GF-1 images can effectively extract crop information in rainy areas; Xia et al. [21] used multi-temporal GF-6 images for intensive observation of crops in cloudy and rainy areas, derived the seasonal change patterns of typical crops, and thus achieved high-precision mapping of crop planting structure.
Crop classification methods based on object features mainly include pixel-level classification and object-based image classification. Pixel-level classification focuses on local information and ignores the correlation between objects. Salt-and-pepper noise caused by the spectral difference of within-class pixels, as well as "mixed pixels" resulting from the proximity effect of between-class pixels, have compromised crop classification accuracy to a large extent, making it difficult to meet practical application needs [22,23]. With the segmented object as its basic unit, object-based image classification is more targeted and has the advantages of reducing the amount of data involved in computation, smoothing image noise, and further enriching the crop information used for classification by introducing object shape, texture and other features. For example, Su et al. [24] made full use of 58 spatial feature factors of crops and completed high-precision regional crop classification using the object-based method. Karimi et al. [25] achieved high-precision crop classification of multi-temporal Landsat images with the object-oriented classification method. Common crop remote sensing classifiers include, but are not limited to, Maximum Likelihood, Support Vector Machine (SVM) and Random Forest [26,27,28]. In recent years, deep learning has gradually become the mainstream algorithm in image pattern recognition by virtue of its hierarchical feature representation, efficient computation and end-to-end automatic learning [29,30]. Convolutional Neural Networks (CNNs), among the fastest-developing deep learning algorithms, have been widely used in crop classification tasks [31,32,33]. Both kinds of models demonstrate good applicability in interpreting crop planting structures: their classification results largely correspond to the actual distribution, exhibit good separability with fewer instances of misclassification and omission, and allow better extraction of crop information in areas with high heterogeneity.
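To make the CNN approach concrete, the sketch below shows a minimal patch-based CNN classifier of the general kind applied in such studies. This is an illustrative assumption, not the network used in the works cited above: the patch size (32 × 32), band count (R, G, B, NIR) and layer widths are placeholders.

```python
# Minimal patch-based CNN crop classifier (illustrative sketch in PyTorch).
import torch
import torch.nn as nn

class CropCNN(nn.Module):
    def __init__(self, in_bands: int = 4, n_classes: int = 5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_bands, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),   # 32x32 -> 16x16
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),   # 16x16 -> 8x8
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 8 * 8, 128),
            nn.ReLU(),
            nn.Linear(128, n_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

# Each sample is a multi-band image patch centered on the location to classify.
model = CropCNN()
patches = torch.randn(8, 4, 32, 32)   # dummy batch of 8 four-band patches
logits = model(patches)               # shape (8, n_classes)
```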
As one of the prerequisites of object-based image classification, the optimal segmentation scale of crops has a direct impact on the size of the generated objects and the precision of crop extraction [34]. ESP2 (Estimation of Scale Parameter 2) is the most commonly used tool for selecting the optimal segmentation scale of different crops [35]. However, given the small spectral difference between crop types and the strong influence of noise on the segmentation results, it is difficult to separate crops correctly using multiresolution segmentation alone. Canny edge detection can obtain more accurate edge information and filter noise in less time [36,37]. In addition, while combining multiple features, object-based image classification also increases the dimension of the feature space and reduces data-processing efficiency, and secondary features may introduce noise and even lower the classification accuracy. For this reason, efficient construction of the feature space has become a key factor in object-based image classification. For example, F. Low et al. [38] used Random Forest (RF) to obtain the best features for crop classification, improving computational efficiency and classification accuracy. Chen Zhulin et al. [39] compared three feature dimensionality reduction methods for crop features, i.e., Random Forest (RF), Mutual Information (MI) and L1 regularization, and found that L1 regularization performed best in crop classification.
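As a hedged illustration of the RF-based feature screening cited above, the following sketch ranks object features by Random Forest importance and keeps the top-ranked ones; the feature matrix, labels and the cutoff `top_k` are placeholders rather than the settings used in [38].

```python
# Feature screening by Random Forest importance (illustrative sketch).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def select_features_rf(X: np.ndarray, y: np.ndarray, names: list, top_k: int = 10):
    """Rank object features by RF importance and keep the top_k names."""
    rf = RandomForestClassifier(n_estimators=500, random_state=0)
    rf.fit(X, y)
    order = np.argsort(rf.feature_importances_)[::-1][:top_k]
    return [names[i] for i in order]
```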
In conclusion, object-based classification and multi-temporal feature classification of high-resolution images are the mainstream directions of current crop remote sensing recognition research, and how to integrate the two is a key scientific problem and technical difficulty. Based on time-series GF-1, GF-2 and GF-6 remote sensing data, this paper conducted a case study of four representative test areas from the southeast to the northwest of the Loess Plateau of eastern Gansu Province. It constructed the optimal segmentation scale set of different crops using the ESP2 tool, the RMAS model and Canny edge detection, performed feature selection with an L1-regularized logistic regression model, carried out fine classification of crops in the test areas using the object-based random forest method, and used a convolutional neural network to cross-verify the results, with a view to providing new ideas and methods for regional classification of rain-fed crops in the Loess Plateau of eastern Gansu Province.
3. Results and Analysis
3.1. Segmentation Results Combined with Canny Edge Detection
When object information is complex, multiresolution segmentation alone often performs poorly; for example, the segmentation results for crops contain more noise and fuzzy outlines. To address this, the object edge information detected by the Canny operator is used in our study as a feature factor participating in multiresolution segmentation, and the segmentation results are compared with those of multiresolution segmentation alone, without Canny detection.
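A minimal sketch of producing such an edge layer with the Canny operator is given below (using OpenCV; the choice of input band, the Gaussian kernel and the hysteresis thresholds 50/150 are illustrative assumptions, not the exact parameters used in this study):

```python
# Derive an edge layer with the Canny operator so it can participate in
# multiresolution segmentation as an extra feature band (illustrative sketch).
import cv2
import numpy as np

def canny_edge_layer(nir_band: np.ndarray) -> np.ndarray:
    """Return an 8-bit edge image from a single band (e.g., NIR)."""
    band8 = cv2.normalize(nir_band, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    smoothed = cv2.GaussianBlur(band8, (5, 5), 0)   # suppress speckle noise first
    return cv2.Canny(smoothed, 50, 150)             # low/high hysteresis thresholds

# The edge layer can then be stacked with the spectral bands (all weights 1,
# as described in Section 3.1) before segmentation.
```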
This method first determines the candidate range of optimal segmentation scales using the ESP2 tool before crop segmentation, and then calculates the optimal segmentation scale of each crop with the RMAS model. The ESP2 tool uses the local variance (LV) of object homogeneity and its rate of change (ROC) as the quantitative screening indicators, with the shape factor and compactness factor set to 0.1 and 0.5, respectively. To balance the relative contribution of each image band and of the edge detection layer, the weights of the R, G, B and NIR bands and the Canny edge detection layer were all set to 1. More reasonable segmentation results are obtained where the ROC of local variance reaches a maximum; therefore, this paper selected the segmentation scales corresponding to the peaks of the ROC curve as candidate optimal segmentation scales, while giving priority to the LV indicator (Figure 5). The candidate optimal segmentation scales from this preliminary screening were 35, 45, 70, 95, 105, 125, 135, 170 and 220. Repeated tests showed that the crop segmentation scale lies between 35 and 95; within this range, we calculated the RMAS value of each crop at increments of 5 and took the scale with the maximum RMAS value as its optimal segmentation scale, as shown in Figure 6. The final optimal segmentation scales are 35 for buckwheat, 65 for wheat and apple, and 55 for corn.
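For clarity, the rate of change of local variance evaluated by ESP2 is commonly defined as follows (a standard formulation from the ESP literature, restated here as a reference rather than taken from this paper):

```latex
\mathrm{ROC}_{\ell} = \frac{LV_{\ell} - LV_{\ell-1}}{LV_{\ell-1}} \times 100\%
```

where \(LV_{\ell}\) is the mean local variance of the image objects at segmentation scale level \(\ell\); candidate optimal scales are those where \(\mathrm{ROC}_{\ell}\) peaks.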
Multiresolution segmentation results before and after integrating the edge information detected by the Canny operator were compared at the optimal segmentation scale (Figure 7). The comparison shows that, at the same segmentation scale, image objects obtained by multiresolution segmentation alone cannot effectively distinguish adjacent land-cover features. When the object edges participate in segmentation, the objects are more complete, with clearer outlines, stronger separability and better segmentation quality, and salt-and-pepper noise is reduced.
3.2. Feature Factor Optimization
This study computed the NDVI values of different crops and the NDVI ratios of adjacent months to characterize crop growth status and its change, and took the NDVI trend and the month with the largest change rate as the basis for selecting the reference image for crop classification. According to the statistical results (Figure 8), the reference images for wheat, corn, apple and buckwheat were selected in May, July, August and September, respectively. On this basis, NDVI ratios in different phenological periods were used as feature factors to facilitate crop classification and interpretation and to improve classification accuracy.
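A minimal sketch of these NDVI statistics is shown below; it assumes per-band reflectance arrays and a per-crop monthly mean NDVI series have already been extracted (all array names are placeholders):

```python
# NDVI and adjacent-month NDVI ratios (illustrative sketch).
import numpy as np

def ndvi(nir: np.ndarray, red: np.ndarray) -> np.ndarray:
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red + 1e-6)   # small epsilon avoids division by zero

def adjacent_month_ratios(monthly_mean_ndvi: np.ndarray) -> np.ndarray:
    """Ratio of each month's mean NDVI to the previous month's; the peak
    ratio marks the month chosen for the reference image."""
    return monthly_mean_ndvi[1:] / monthly_mean_ndvi[:-1]
```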
Selecting features of higher dimension introduces redundancy among them. To reduce redundant computation, this paper uses an L1-regularized logistic regression model [63] to quantitatively analyze the 39 features. After many tests, better results were obtained by setting the regularization parameter C to 0.9 and the number of iterations to 1000. Since the relevant feature factors vary between crops during classification, we optimized the feature factors of each crop separately and computed the contribution rate of the feature factors for each crop category.
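A minimal sketch of this screening step, assuming the 39 object features have been assembled into a matrix (the solver choice and standardization are our assumptions; C = 0.9 and 1000 iterations follow the settings above):

```python
# L1-regularized logistic regression feature screening (illustrative sketch).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

def l1_feature_weights(X: np.ndarray, y: np.ndarray, names: list):
    Xs = StandardScaler().fit_transform(X)        # put features on one scale
    clf = LogisticRegression(penalty="l1", solver="liblinear",
                             C=0.9, max_iter=1000)
    clf.fit(Xs, y)
    # Features with nonzero coefficients are the retained feature factors.
    keep = np.flatnonzero(np.any(clf.coef_ != 0, axis=0))
    return [(names[i], clf.coef_[:, i]) for i in keep]
```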
Figure 9 shows the feature factors of different crops screened using the L1 regularized logistic regression model.
The statistics show that the number of feature factors after optimization is greatly reduced for all four crops, with texture and spectral features accounting for a large proportion, followed by geometric and index features. The numbers of feature factors selected for corn, wheat, apple and buckwheat are 9, 12, 10 and 16, respectively. As can be seen from the figure, corn has 9 auxiliary feature factors, dominated by the texture and spectral factors GLDV_Ang_2 and Mean_NIR, with weights of 3.89 and 1.24, respectively. Since corn ripens mainly in July and August, it appears deep red in false-color images during this period with obvious texture features; GLCM Ang. 2nd Moment reflects the uniformity of image texture and, together with the reflection characteristics of the near-infrared band, serves as a cofactor for identifying corn. Wheat has 13 auxiliary classification feature factors, dominated by spectral and texture factors. Its main spectral features are Mean_NIR and Max_diff, with weights of -3.95 and 1.69, respectively. During its growing season, mainly March to May, wheat appears light red with even texture, making it easier to distinguish than other crops; besides the high near-infrared reflectance, the maximum band difference effectively assists wheat classification, and the dominant texture feature is GLDV_Mean_, with a weight of -1.18. The features of apple are dominated by GLCM_StdDe and Standard_NIR, with weights of -1.61 and -1.97, respectively. During the apple growing season, mainly August to September, apple texture is more distinctive than that of other crops, and the variation in the band standard deviation reflects apple information well. The dominant factors of buckwheat are RVI, GLDV_Mean_, Layer_NIR and GLCM_Dissi, with weights of 3.23, 2.54, -2.14 and 2.82, respectively. Buckwheat ripens mainly in September and October, which is also the ripening season of apple and corn; however, buckwheat has a more delicate texture and appears pink in false-color images, while the other crops are darker. RVI, which reflects crop growth status, has the highest weight, and among the spectral features Mean_NIR also performs well in classification, because crop reflectance is usually higher in the near-infrared band.
3.3. Object-Based Crop Classification Results
The classification results for the four typical test areas (Figure 10) indicate that both the RF and CNN models categorize crops into regular blocks, which aligns with the orderly distribution of crop planting structures in the test plots. To further assess the classification accuracy of the two models, field sample data were used for validation. The confusion matrix evaluation method was employed, with overall accuracy and the Kappa coefficient as evaluation parameters for the Random Forest (RF) algorithm and the Convolutional Neural Network (CNN). The accuracy validation results for the four test areas are shown in Table 3.
According to Table 3, both the RF and CNN models achieve high accuracy in the classification of crops in test area Ⅰ. Overall, in all four test areas, both the Kappa coefficient and the overall classification accuracy of the RF model are higher than those of the CNN model. In test area Ⅰ, the per-class Kappa coefficients of the RF model exceed those of the CNN model; the overall Kappa coefficients of the RF and CNN models are 0.89 and 0.87, respectively, and the overall accuracy of both models exceeds 90%. Test area Ⅱ has the same crop categories as test area Ⅰ but a smaller buckwheat planting area. The Kappa coefficients of both models for buckwheat are smaller than those of the other two crops, because buckwheat is planted over a small area that does not provide enough training samples; with too few samples, the models overfit, resulting in lower accuracy [64]. Nevertheless, the RF and CNN models achieve high overall classification accuracy of 94.92% and 93.43%, respectively. Test areas Ⅲ and Ⅳ, dominated by wheat, corn and apple, also show high overall classification accuracy: in test area Ⅲ, the overall accuracy of the RF and CNN models reached 89.37% and 88.94%, respectively, while in test area Ⅳ it was 90.68% and 90.18%, respectively.
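These accuracy measures can be reproduced from validation samples with standard tooling; a hedged sketch follows, with placeholder labels standing in for the field samples:

```python
# Confusion matrix, overall accuracy and Cohen's kappa from validation samples.
from sklearn.metrics import accuracy_score, cohen_kappa_score, confusion_matrix

y_true = [0, 0, 1, 1, 2, 2, 1, 0]   # placeholder field-sample labels (0=wheat, 1=corn, 2=buckwheat)
y_pred = [0, 0, 1, 2, 2, 2, 1, 0]   # placeholder model predictions at the same locations
print(confusion_matrix(y_true, y_pred))
print("Overall accuracy:", accuracy_score(y_true, y_pred))
print("Kappa:", cohen_kappa_score(y_true, y_pred))
```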
This paper compares the accuracy of the results of the two methods and compiles statistics on the crop areas and proportions in the four test areas (Figures 11-14). In test area Ⅰ, among wheat, corn and buckwheat, corn has the largest planting area in the classification results: 0.56 km2 (22.74%) for the RF model and 0.46 km2 (18.54%) for the CNN model. Test area Ⅱ has the same crop planting structure as test area Ⅰ, and corn again has the largest area: 0.86 km2 (34.84%) by the RF model and 0.92 km2 (37.19%) by the CNN model. Buckwheat has the smallest area: 0.08 km2 (3.12%) by RF and 0.20 km2 (7.94%) by CNN. In test area Ⅲ, corn has the largest area of any crop, 0.73 km2 by RF and 0.77 km2 by CNN, and apple also occupies a large area, 0.43 km2 and 0.33 km2, respectively. Comparing the two models, the corn area classified by the RF model is smaller than that of the CNN model, while the apple area classified by the RF model is larger. According to field investigation statistics, corn and apple dominate the planting structure in this area. The main reason is that both corn and apple mature in August and September and appear similar in false-color remote sensing images during this period; compared with the RF model, the CNN model can better adapt to such complex data patterns. In addition to the crops studied, other vegetation also exists in the test area, and its interference with the classification process prevents the RF model from fully classifying the crops. In test area Ⅳ, apple has the largest area: 0.80 km2 by RF and 1.02 km2 by CNN. As the classification result map shows, the independent apple blocks classified by the CNN model are larger. Field investigation and interpretation of the satellite images indicate that the planting structures in this test area are regularly distributed in blocks; the CNN model merged adjacent crop parcels into unified wholes, which may be the main reason why its apple area is larger than that of the RF model. Overall, the classification results of the four test areas differ because the two models use different algorithms, and the difference is particularly significant in the test areas planted with corn and apple. Nevertheless, in terms of the spatial distribution and quantity of crops, the classification results of the CNN and RF models are highly consistent and spatially coherent.
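The area and proportion statistics of Figures 11-14 amount to counting classified pixels and converting to km2 with the pixel size; a minimal sketch follows (the 2 m pixel size matches the finest GF imagery in Table 1, and the class map is a placeholder):

```python
# Per-class area (km2) and proportion from a classified raster (sketch).
import numpy as np

def class_areas_km2(class_map: np.ndarray, pixel_size_m: float = 2.0):
    """class_map: 2-D array of class labels; returns per-class areas and shares."""
    labels, counts = np.unique(class_map, return_counts=True)
    areas = counts * (pixel_size_m ** 2) / 1e6    # m^2 -> km^2
    return dict(zip(labels.tolist(), areas.tolist())), areas / areas.sum()
```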
Figure 1. Phenological periods of different crops (Ⅰ, Ⅱ, Ⅲ, Ⅳ represent the test areas where staple crops are located).
Figure 2. Study area: (a) geographical distribution of the Loess Plateau and Gansu Province in China; (b) digital elevation model of the Loess Plateau of eastern Gansu Province and distribution of the representative test areas. Ⅰ, Ⅱ, Ⅲ and Ⅳ represent locations in Huanxian County, Zhenyuan County, Hot Spring Town of Xifeng and Zaosheng Town of Ningxian County, respectively.
Figure 3. Hierarchical network diagram of image objects.
Figure 5. Evaluation of optimal segmentation scale using the ESP2 tool.
Figure 6. RMAS values of different crops.
Figure 7. Segmentation map before and after integrating edge information.
Figure 8. NDVI variation trend of different crops and its variation rate in adjacent months.
Figure 9. Preferred feature factors of different crops.
Figure 10. Classification results of RF and CNN models.
Figure 11. Area and proportion of classification results by the two models in test area Ⅰ (Huanxian County).
Figure 12. Area and proportion of classification results by the two models in test area Ⅱ (Zhenyuan County).
Figure 13. Area and proportion of classification results by the two models in test area Ⅲ (Xifeng).
Figure 14. Area and proportion of classification results by the two models in test area Ⅳ (Ningxian County).
Figure 15. False-color images of different crops.
Figure 16. Segmentation results combined with Canny edge detection.
Figure 17. Statistics of texture features of corn and apple.
Table 1. Image data of high-resolution satellites (Ⅰ~Ⅳ represent the four test areas).

| Satellite | Sensor | Spatial resolution (m) | Images (Ⅰ) | Images (Ⅱ) | Images (Ⅲ) | Images (Ⅳ) |
|---|---|---|---|---|---|---|
| GF-1 | PMS1/PMS2 | 2 | / | 2 | 1 | / |
| GF-2 | PMS1/PMS2 | 1 | 4 | 1 | 1 | 4 |
| GF-6 | PMS1/PMS2 | <2 | / | 1 | 2 | / |
Table 2. Spatial feature factor information.

| Feature category | Feature variables | Number |
|---|---|---|
| Spectral features | Mean_R, Mean_G, Mean_B, Mean_NIR, Max_diff, Brightness, and Standard Deviation (four bands) | 10 |
| Texture features | GLCM Mean, GLCM Ent, GLCM Homo, GLCM Std, GLCM Dissim, GLCM Contrast, GLCM Ang. 2nd Moment, GLCM Corr, GLDV Mean, GLDV Ent, GLDV Contrast, GLDV Ang. 2nd Moment | 12 |
| Geometric features | Area, Length/Width, Length, Width, Border Length, Shape Index, Density, Asymmetry, Roundness, Boundary Index, Compactness, Ellipse Fitting, Rectangle Fitting | 13 |
| Index features | EVI, NDVI, R/G, RVI | 4 |
Table 3. Accuracy validation of the classification results through random forest and deep learning.

| Test area | Crop | Per-crop Kappa (RF) | Per-crop Kappa (CNN) | Overall Kappa (RF) | Overall Kappa (CNN) | Overall accuracy (RF) | Overall accuracy (CNN) |
|---|---|---|---|---|---|---|---|
| Ⅰ | Wheat | 0.92 | 0.90 | 0.89 | 0.87 | 0.92 | 0.91 |
| Ⅰ | Corn | 0.85 | 0.81 | | | | |
| Ⅰ | Buckwheat | 0.96 | 0.93 | | | | |
| Ⅱ | Wheat | 0.93 | 0.89 | 0.91 | 0.88 | 0.95 | 0.93 |
| Ⅱ | Corn | 0.91 | 0.87 | | | | |
| Ⅱ | Buckwheat | 0.86 | 0.88 | | | | |
| Ⅲ | Wheat | 0.87 | 0.89 | 0.85 | 0.84 | 0.89 | 0.89 |
| Ⅲ | Corn | 0.84 | 0.81 | | | | |
| Ⅲ | Apple | 0.85 | 0.80 | | | | |
| Ⅳ | Wheat | 0.86 | 0.79 | 0.86 | 0.85 | 0.91 | 0.90 |
| Ⅳ | Corn | 0.78 | 0.86 | | | | |
| Ⅳ | Apple | 0.93 | 0.89 | | | | |