1. Introduction
China is urbanizing rapidly. However, urban development faces increasingly prominent problems, such as disordered spatial distribution, a lack of synergy in regional development, and significant pressure on natural resources and the environment [1]. Urban agglomerations in China serve as the primary form of new-type urbanization and as the spatial carrier for adapting to the new economic normal [2]. They accommodate over 80 percent of the population and contribute nearly 90 percent of GDP, and city clusters will continue to play a crucial role in high-quality development in this new era. Scientifically identifying built-up area boundaries is therefore essential for understanding the development patterns of urban agglomerations, diagnosing problems in the urban system, and achieving an efficient layout of regional functional space through scientific development planning.
An urban built-up area is an area within an urban administrative region that has been developed and constructed and is served by basic municipal public facilities. In practical terms, it is a densely constructed surface space in which both municipal public facilities and supporting public facilities are present [3]. When delineating urban built-up areas, factors such as natural topography, landforms, and the management boundaries of grass-roots administrative units should be considered, while maintaining consistency with geographical population statistics where possible [4]. Feature extraction methods based on a single dimension may have limitations when identifying built-up areas on city outskirts.
Scholars in China and abroad have used different kinds of geospatial big data to study the boundary identification of built-up areas. The main approaches are recognition methods based on socio-economic statistical data, recognition methods based on remote sensing image interpretation, and fusion methods that combine POI (point of interest) data with remote sensing images [5,6]. However, analyses that combine socio-economic conditions, natural conditions, and traffic networks are still lacking. In addition, most existing studies focus on either a single city or the global scale, and mesoscale studies covering multiple urban agglomerations are scarce.
Traditional research on the built-up area boundaries of urban agglomerations focuses on differentiating urban levels. It selects indicators related to urban development and establishes an urban indicator system through qualitative evaluation, drawing on basic geographical theories such as the “pole-axis” theory, the “center-periphery” theory, and urban hinterland theory [7,8,9,10]. As research deepened, other scholars determined boundaries from different angles by combining quantitative and qualitative methods, using POI data [11], population data [12], and land prices [13]. The advantage of these methods is that the indicators and data are easy to obtain; the disadvantage is that they are highly subjective.
With the rapid development of remote sensing technology and the updating of research methods, the traditional top-down approach, which used only socio-economic indicators to identify small metropolitan areas, has given way to more advanced methods that define large-scale urban agglomerations and interconnected metropolitan areas from remote sensing images [15]. Compared with traditional methods, remote sensing images better capture the spatial characteristics of the urban landscape and infrastructure [16], which helps to characterize the scope of human activities and the physical distribution pattern of cities and towns. Recent studies have focused on extracting impervious surfaces from satellite images to represent actual urban areas, because this approach offers higher resolution and lower threshold dependence [17]. To obtain a more objective understanding of urban built-up area boundaries, scholars have proposed morphological analysis methods based on remote sensing interpretation, which identify the actual extent of urban space from information such as nighttime light intensity [18], vegetation coverage, or building coverage [19]. Indicators such as the density of economic activities, the intensity of economic ties with the central city, land use, and building density are then constructed as the reference basis for delineating the boundaries of urban built-up areas.
In general, existing research has shifted from identifying the spatial scope of urban agglomerations to detecting the spatial pattern characteristics within them, and from single-dimensional feature analysis to the construction of multi-dimensional feature indices. However, the difficulties of multi-dimensional analysis, such as data fusion algorithms and the unification of multi-source data, have not been thoroughly addressed [25]. In addition, research on developed urban agglomerations, such as those in the eastern coastal areas of China, is relatively abundant, whereas research on developing urban agglomerations in the central and western regions is still largely lacking. In this study, the Yangtze River Delta (YRD) and Chengdu-Chongqing (CC) city clusters are selected for comparative analysis. The main reason for this selection is that the YRD is situated on the east coast of China, with flat terrain and strong economic vitality, whereas the CC is located in the central and western inland of China and is characterized by mountains and hills and relatively weaker economic strength [26,27,28,29]. Comparing these representative urban agglomerations can enrich existing regional economic theory and provide valuable insights for promoting the high-quality development of urban agglomerations in China. The main purpose of our research is to explore a strategy for identifying built-up areas with a shorter update cycle while ensuring a given level of accuracy. Our main research topics are:
(1) Identification of urban agglomeration built-up areas. Built-up areas are identified from three dimensions: social economy, natural coverage, and traffic accessibility.
(2) Evaluation of the delineation results. We assess the reliability of the built-up area delineation results qualitatively and quantitatively in terms of consistency and integrity.
(3) Summary of the built-up area characteristics of urban agglomerations in different regions. Using spatial analysis techniques, we conduct a comparative analysis of representative urban agglomerations in eastern and western China.
The main innovations of this study are as follows:
(1) A quantitative, comparative analysis of the development status of the Yangtze River Delta and Chengdu-Chongqing urban agglomerations broadens the scope of existing research and helps remedy the lack of scientific data support in qualitative research on urban agglomeration development.
(2) Three different technical routes are adopted to determine the built-up area boundaries of the Yangtze River Delta and Chengdu-Chongqing urban agglomerations from multiple perspectives, which mitigates the problem that a single index cannot accurately reflect the internal heterogeneity of the urban edge.
(3) A multi-source data fusion method is used to improve the accuracy of built-up area identification for urban agglomerations.
5. Conclusions
This study set out to develop a method for extracting the built-up areas of urban agglomerations from multi-source data and to construct a rule that integrates these data reasonably while taking the characteristics of built-up areas into account. First, Density Graph analysis of POI data was used to extract built-up areas at the socio-economic level, yielding an overall accuracy (OA) of 91.27% in the Chengdu-Chongqing region and 80.90% in the Yangtze River Delta region. Second, by combining NPP-VIIRS nighttime light remote sensing data with land surface temperature (LST) and the normalized difference vegetation index (NDVI), we constructed the LVTD coefficient and determined its thresholds with reference to urban statistical yearbooks. Taking population distribution, land cover, and human activities into consideration, this approach achieved an OA of 95.50% in the Chengdu-Chongqing region and 84.74% in the Yangtze River Delta region. Third, using network data from OpenStreetMap (OSM) and railway station data for China, we calculated minimum cumulative time costs with raster analysis algorithms and extracted built-up areas according to accessibility values, thereby accounting for accessibility and connectivity. This approach yielded an OA of 88.51% in the Chengdu-Chongqing region and 73.42% in the Yangtze River Delta region. Finally, through repeated experiments and adjustments, we explored methods for fusing the multi-source data. Validated on high-precision datasets, the fused method achieved an OA of 91.35% with a kappa coefficient of 0.75. These values are higher than the overall accuracy of 85.34% and the kappa coefficient of 0.7394 obtained for the built-up areas of the three major urban agglomerations studied by Wang et al. [37].
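The overall accuracy and kappa coefficient reported above are standard agreement measures that can be computed from a pixel-wise comparison between an extracted built-up mask and a reference mask. The following Python sketch is illustrative only: the function name, the 0/1 label encoding (1 = built-up), and the toy data are assumptions made for this example and are not taken from the original workflow.

```python
def overall_accuracy_and_kappa(reference, predicted):
    """Return (OA, Cohen's kappa) for two equal-length binary label sequences.

    Hypothetical helper for illustration: labels are assumed to be 0/1,
    with 1 meaning "built-up" and 0 meaning "non-built-up".
    """
    if len(reference) != len(predicted) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Build the 2x2 confusion matrix.
    tp = fp = fn = tn = 0
    for ref, pred in zip(reference, predicted):
        if ref and pred:
            tp += 1      # built-up in both masks
        elif pred:
            fp += 1      # extracted as built-up, reference says not
        elif ref:
            fn += 1      # missed built-up pixel
        else:
            tn += 1      # non-built-up in both masks

    n = tp + fp + fn + tn
    # Observed agreement: share of pixels labelled identically.
    oa = (tp + tn) / n
    # Agreement expected by chance, from the row and column marginals.
    pe = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    # Kappa rescales OA so that chance-level agreement maps to 0.
    kappa = (oa - pe) / (1 - pe) if pe != 1 else 1.0
    return oa, kappa


# Toy example with 10 pixels (made-up data, not results from the study).
reference = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
oa, kappa = overall_accuracy_and_kappa(reference, predicted)
print(f"OA = {oa:.2%}, kappa = {kappa:.4f}")  # OA = 80.00%, kappa = 0.5833
```

Because kappa discounts the agreement expected by chance, it is lower than OA in the toy example; this is why the two measures are usually reported together when built-up and non-built-up pixels are unevenly distributed.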
The method proposed in this paper partially addresses the limitation of using a single data source for built-up area extraction. It is particularly suitable for urban agglomeration scenarios that demand high extraction accuracy and scientific rigor. To ensure the timeliness of built-up area extraction, the method relies heavily on remote sensing imagery, requires access to recent nighttime light remote sensing data, depends on frequently updated POI data, and needs openly available traffic network data. The fusion extraction rule established in this paper offers a more comprehensive approach to extracting built-up areas and gives a more holistic picture of regional construction, development, and public facilities. However, because the method identifies built-up areas from economic and population distribution, land cover characteristics, and traffic network features together, the gain in accuracy comes at the cost of lower extraction efficiency. In addition, the index analysis for urban agglomerations at different stages of development may be insufficient, which could affect the applicability of the method in highly developed urban agglomerations. Future research should focus on refining the integration rules and conducting application studies in urban agglomerations at different scales to further enrich the theory and methodology of urban built-up area extraction.