Land use/cover change (LUCC) is closely associated with human production and livelihoods, social and economic development, and ecological carrying capacity [1,2,3]. With the continuous release of remote sensing imagery and advances in image processing techniques, many LUCC products have been developed in recent decades at regional, national and global scales [4,5], such as the Global Land Cover map (DISCover) for 1992 [6], the Global Land Cover 2000 (GLC2000) [7], the MODIS series products [8], the 30-m global land cover data (Globeland30) [9], the 10-m Finer Resolution Observation and Monitoring of Global Land Cover (FROM-GLC) products for 2017 [10], the 30-m fine classification system global land cover product (GLC_FCS30) [11], the 300-m European Space Agency Climate Change Initiative (ESA-CCI) land cover product for 1992-2020 [12], and the ESRI annual map of Earth’s land surface for 2017-2023 [13]. To meet the demand for data with higher temporal and spatial accuracy, many LUCC products have also been produced specifically for China, such as China’s Land-Use/cover Dataset (CLUD) at 30-m resolution for the 1980s, 1995, 2000, 2005, 2010, 2015 and 2020 [14,15], the annual China Land use/Land cover datasets (CLUD-A) [16] and the China Land Cover Dataset (CLCD) [17]. Although these datasets have been validated with high accuracy, intercomparisons indicate large discrepancies among them, and none of their spatiotemporal patterns match China’s statistical or inventory data well at either regional or national scales [18]. Most datasets show a slight increase or even a decrease in forest area from the 1980s to the present, and none matches the temporal trends in the national statistical data. For example, Qin et al. [19] compared several LUCC products and found that the forest area in five datasets ranged from 174 × 10⁴ km² to 227 × 10⁴ km² in 2010. Yang and Huang [17] reported that forest area in China increased by only 4.34% during 1980-2019, far lower than the 77% increase reported by the national forest inventory (NFI) between 1984-1988 (12.98% forest coverage) and 2014-2018 (22.96%). By comparing more than 10 existing cropland datasets, Yu et al. [20] showed that most cropland data in existing LUCC products are inconsistent with the statistical data. Similarly, the wetland area in CLUD, MODIS, CLCD and CLUD-A changed by less than ±5% during 1980-2020, whereas reports indicate that China’s wetland area has declined by 33% [21,22]. In addition, most of these existing datasets target only a single land use/cover (LUC) type, and few studies have comprehensively addressed the spatiotemporal patterns of all LUC types. Therefore, it is necessary to produce a more accurate and comprehensive long-term LUCC dataset for China.
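As a quick check on the NFI figure quoted above, the 77% refers to the relative increase in forest coverage, not to a difference in percentage points. A minimal sketch of that arithmetic follows; the function name is illustrative only and is not taken from any cited work.

```python
def relative_change(start, end):
    """Return the relative change from start to end, in percent."""
    return (end - start) / start * 100.0


# NFI forest coverage (%) for 1984-1988 and 2014-2018, as quoted in the text
nfi_increase = relative_change(12.98, 22.96)
print(f"NFI relative increase in forest coverage: {nfi_increase:.1f}%")
```

The result rounds to 77%, against the 4.34% increase reported for the same period by the remote-sensing product.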
Several attempts have been made to reconcile mapped areas with statistical and field survey data. For instance, Xia et al. [23] reconstructed a new forest cover dataset (CFCD) for 1980 to 2015 by combining several existing LUCC datasets with the NFI; however, this approach matched only the temporal change patterns and sacrificed spatial accuracy. To match the statistical cropland area and its change trends, Yu et al. [20] developed a subpixel-level cropland share dataset; however, this dataset targeted only cropland and did not consider other LUC types. There are two major reasons for the misrepresentation of LUCC at spatiotemporal scales in China. The first reason is that most LUCC products were developed using pixel-based classification methods [20,24]. In the pixel-based approach, each pixel is assigned a binary value (Boolean 0 or 1), i.e., each grid cell is assumed to be completely occupied by one land cover type [20]. This approach is more suitable for high-resolution images [25]. Cover types that occupy only a small percentage of a pixel may be ignored under this approach, resulting in an underestimation of change. Taking forest as an example, forest in China is defined as land with tree coverage greater than 10% over a minimum area of 0.5 ha. Pixels with tree coverage anywhere between 10% and 100% are therefore all classified as forest, so changes in pixel-level tree coverage are not reflected in the LUCC products, and the increase in forest area is underestimated when tree coverage rises from 10% to 100%. To develop a more accurate temporal change pattern of LUCC, it is necessary to produce a subpixel-level LUCC dataset that reflects the fractional share of each LUC type within each pixel [20]. The second reason is that most LUCC products did not simultaneously match the spatiotemporal patterns of all LUC types with statistical data. For example, Xia et al. [18], Yu et al. [20], Gong et al. [26] and Niu et al. [21] matched only the forest, cropland, urban and wetland areas, respectively, with inventory data. None of the current long-term LUCC products for China comprehensively matches all LUC types with statistical data. It is a challenge to harmonize the area and its temporal changes for all LUC types within each pixel. Recently, several long-term ancillary geospatial datasets, such as normalized difference vegetation index (NDVI) and leaf area index (LAI) datasets, have been developed [27,28,29]. Based on the existing vegetation indices and LUCC products, it is possible to retrieve the actual changes in LUC shares within pixels.
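The underestimation caused by pixel-based classification can be illustrated with a small numerical sketch. The pixel values below are hypothetical, the function names are illustrative, and the sketch makes two simplifying assumptions that are not part of any cited method: the tree-cover fraction of a pixel is treated as its forest share, and the 0.5 ha minimum-area criterion is ignored.

```python
FOREST_THRESHOLD = 0.10  # tree-cover threshold from the forest definition quoted above
PIXEL_AREA_HA = 0.09     # one 30 m x 30 m pixel = 900 m^2 = 0.09 ha


def binary_forest_area(tree_cover, pixel_area):
    """Pixel-based area: every pixel above the threshold counts as fully forest."""
    return sum(pixel_area for f in tree_cover if f > FOREST_THRESHOLD)


def fractional_forest_area(tree_cover, pixel_area):
    """Subpixel area: each pixel contributes only its tree-cover share (assumption)."""
    return sum(f * pixel_area for f in tree_cover)


# Hypothetical tree-cover fractions for the same four pixels at two dates
cover_early = [0.15, 0.20, 0.12, 0.05]
cover_late = [0.90, 0.85, 0.95, 0.05]

binary_change = (binary_forest_area(cover_late, PIXEL_AREA_HA)
                 - binary_forest_area(cover_early, PIXEL_AREA_HA))
fractional_change = (fractional_forest_area(cover_late, PIXEL_AREA_HA)
                     - fractional_forest_area(cover_early, PIXEL_AREA_HA))

print(f"Binary (pixel-based) change: {binary_change:.4f} ha")
print(f"Fractional (subpixel) change: {fractional_change:.4f} ha")
```

In this toy case three pixels are classified as forest at both dates, so the pixel-based area does not change, while the fractional representation registers the densification as a gain of about 0.2 ha.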
Northeastern China (NEC) covers about 15.3% of China’s territory. It is the main base for crop and wood production in China and has the largest wetland area of any region in the country. With the rapid transition of its socioeconomic environment, this region has experienced dramatic and complex changes in various LUC categories in recent decades, making it an ideal case for developing approaches to produce LUCC products. Through comparisons with existing LUCC products, we found that no LUCC product comprehensively captures the actual changes in LUC area in NEC during 1980-2020; it is therefore necessary to reconstruct a long-term, high-precision proportional LUCC dataset that accurately reflects the spatiotemporal patterns of the major LUC types in NEC. The objectives of this study are to: (1) construct an approach for tracking changes in the fractional shares of various LUC types by integrating multiple LUCC products and other geospatial datasets with statistical data using machine-learning and regular decision-tree methods; (2) evaluate the performance of this approach using NEC as a case study area; and (3) analyze the spatiotemporal patterns of LUCC in NEC.