Logistics centers play a crucial role in the urban logistics structure. With the transformation of modern business models, logistics centers have evolved from traditional urban freight distribution centers to urban area freight hubs, becoming a key link in supporting last-mile logistics. This shift has enhanced the service capability and social image of enterprises. For last-mile community logistics centers, the freight dispatch support from advanced logistics centers is of paramount importance. These centers not only support the freight dispatching of multiple community centers but also handle the goods dispatching among large centers. The collaborative work of both types of center nodes is crucial for enhancing the urban logistics service capability of enterprises, making the site selection decision for logistics centers of significant strategic importance.
1.1. Logistics Center Site Selection
Researchers both domestically and internationally have studied the location of logistics centers in depth and proposed a range of optimization models and algorithms of theoretical and practical value. The site selection of urban logistics centers not only affects the operational efficiency of logistics activities but is also a key determinant of logistics costs. Because logistics costs weigh directly on enterprise profits, researchers have paid special attention to transportation costs in logistics optimization. Emre and other researchers [1] combined Geographic Information Systems (GIS) with the Binary Particle Swarm Optimization (BPSO) algorithm, proposing a comprehensive solution for the site selection of urban logistics centers. GIS is used to generate the spatial information required for the p-median model, while BPSO is used to determine the optimal result considering logistics costs. However, in the urban logistics industry, it is often necessary to consider the locations of multiple logistics centers. To address this issue, Ismail and Fahrettin [2] adopted a spatial multi-criteria decision-making method that combines complex problem structures, expert opinions, geographical features, and mathematical modeling methods, aimed at analyzing the locations of multiple logistics centers and minimizing logistics costs. Additionally, Jun and other researchers [3] introduced three socio-economic indicators (economic development, traffic congestion levels, and total logistics demand) and constructed a two-stage model that improved clustering algorithms and the centroid method to deal with multi-facility issues in real cases. Maryam and Hyunsoo [4] aimed to minimize transportation costs between nodes and applied an integrated meta-heuristic algorithm combining Particle Swarm Optimization (PSO) and the Genetic Algorithm (GA) to construct a model for solving the optimal site selection problem of logistics centers. Yingyi and others [5] improved the existing one-dimensional target constraint location model, proposing a multi-factor constrained p-median model that considers operating costs, and used the Particle Swarm Algorithm and the Immune Genetic Algorithm to determine the optimal location. However, with rapid economic development and constantly changing customer demands, the initial location of a logistics center may cease to be optimal. To address this challenge, Liying and others [6] introduced transportation costs between adjacent stages, establishing a multi-stage dynamic location model. Meanwhile, Juan and others [7] introduced a balanced learning strategy to improve the Cuckoo Search Algorithm, and Jeng-Shyang and others [8] proposed the Rafflesia Optimization Algorithm, an intelligent evolutionary algorithm based on the living habits of the Rafflesia, to solve the site selection problem of logistics distribution centers.
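For orientation, the p-median model referred to above can be written in its standard textbook form. The sketch below is not the exact formulation used in [1] or [5]; the symbols are assumptions introduced here: $I$ is the set of demand points, $J$ the set of candidate sites, $w_i$ the demand at point $i$, $d_{ij}$ the distance or transportation cost from $i$ to $j$, $p$ the number of centers to open, $y_j$ indicates whether site $j$ is opened, and $x_{ij}$ indicates whether demand point $i$ is served by site $j$.

```latex
\begin{aligned}
\min \quad & \sum_{i \in I} \sum_{j \in J} w_i \, d_{ij} \, x_{ij} \\
\text{s.t.} \quad & \sum_{j \in J} x_{ij} = 1, && \forall i \in I, \\
& x_{ij} \le y_j, && \forall i \in I,\ j \in J, \\
& \sum_{j \in J} y_j = p, \\
& x_{ij},\, y_j \in \{0, 1\}, && \forall i \in I,\ j \in J.
\end{aligned}
```

The first constraint assigns every demand point to exactly one center, the second allows assignment only to opened sites, and the third fixes the number of opened centers at $p$.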
In the modern urban logistics system, logistics costs are no longer the sole major factor for consideration. With the promotion of the coordinated development of the natural environment, technology, economy, and society, sustainability has become one of the important goals in the site selection of logistics centers. Therefore, traditional methods of constructing logistics centers are insufficient for the development of urban community logistics centers. Rémy and others [9] employed a two-level multi-commodity network flow model to solve the urban center parcel distribution problem and assessed its impact on sustainable development by focusing on carbon emissions. Congjun and others [10], considering sustainability, optimized the site selection of logistics centers using a decision method based on the 2-tuple linguistic representation model. Moreover, with greenhouse gas emissions as a key factor, Hongzan and others [11] used truck trajectory data combined with DBSCAN clustering and an improved p-median model to determine the best location for urban logistics centers to reduce emissions. For cold chain logistics, an integral part of urban logistics, Siying and others [12] built a model using a two-level programming approach, and Xinguang and Kang [13] employed a multi-objective location model to solve the site selection problem of cold chain logistics distribution centers considering carbon emissions.
In scenarios like natural disasters, the critical time window makes transportation efficiency a core consideration. Xenofon and Christos [14] combined classic heuristic algorithms and forecasting models, as well as deep neural networks, to select distribution center locations to quickly distribute aid materials to disaster areas. Kuo-Hao and others [15] proposed a two-stage stochastic programming model to ensure that logistics operations of relief materials could function most effectively at critical moments. Meanwhile, Zengxi and others [16] combined Multi-Criteria Decision Making (MCDM) with Geographic Information Systems (GIS) to address the site selection problem for Emergency Logistics Centers (ELC), accelerating the delivery of relief materials.
Although current research primarily focuses on reducing costs, minimizing carbon emissions, and improving efficiency through reasonable site selection, the rapid development of urban logistics and the ongoing changes in urban distribution indicators mean that finding optimal solutions among multiple objectives may not be sufficient to meet new challenges. Therefore, this paper introduces the concept of multi-data fusion, which draws on diverse data sources, to construct a new logistics planning and design system.
1.2. Multi-Data Fusion
Multi-data fusion technology involves integrating datasets from different distributions, sources, and types into a unified global space to form a more consistent expression. This technology occupies an important position in modern information processing and has been widely applied in multiple fields. Compared to the independent processing of a single data source, the advantages of multi-data fusion are significant: it not only improves the detectability and credibility of targets but also broadens the spatio-temporal perception range, reduces the ambiguity of inference, and enhances detection accuracy. Additionally, it increases the dimensional complexity of target features, improves spatial resolution, and enhances the system's fault tolerance.
In practical applications, multi-sensor systems can obtain comprehensive information about experimental subjects from various information sources for real-time monitoring purposes. For example, Bai and others [17] used a multi-data fusion method combining near-infrared spectroscopy and machine vision to analyze and assess the fermentation level of black tea. Addressing the issue of estimating the concentration of chlorophyll-a in eutrophic lakes, Cheng and others [18] proposed a multi-source data fusion method based on Bayesian Inference (BIF), effectively combining the advantages of in-situ observations and remote sensing data. Similarly, Yongyun and others [19], for real-time monitoring of dissolved oxygen changes in microbial fuel cell biosensors, constructed a low-cost, high-accuracy real-time dissolved oxygen biosensor based on iMFC and enhanced its performance through a multi-source data fusion strategy predictive model using multiple environmental indicators. Furthermore, Hui and others [20] implemented online monitoring of the simultaneous saccharification and fermentation process of ethanol by merging a convolutional neural network (CNN) with a recurrent neural network (RNN) in a novel cross-perception multi-source data deep fusion model. Zhang Yi and others [21] proposed an interactive platform architecture for provincial power grid voltage dips based on multi-source data fusion, addressing issues such as excessive monitoring, limited application, and lack of interaction in voltage dip-related systems. Lastly, Qilin and others [22] proposed a multi-data fusion calibration method for all parameters of the Orbital Multi-View Dynamic Photogrammetry System (OMDPS), providing a more accurate spatial reference for spacecraft attitude measurement.
Due to environmental complexity, noise interference, instability of recognition systems, and the use of different identification algorithms, the feature information extracted during experiments often lacks precision, completeness, and reliability. To address these challenges, Yan and others [23] proposed a collaborative strategy combining deep learning with machine learning theory for tracking quality differences in rice, aiming to improve the detection performance of the fusion system. Junyi and others [24] developed a real-time target detection system for intelligent vehicles with improved real-time performance and accuracy, using a multi-source fusion method based on the ROS Melodic software development environment and the NVIDIA Xavier hardware platform. Huice and others [25], aiming to improve the prediction accuracy and efficiency of coal mine gas generation patterns, proposed a prediction method based on multi-source data fusion. Additionally, Yajie and others [26] combined GNSS-IR technology with optical remote sensing and used a multi-data fusion method based on the Genetic Algorithm-Back Propagation Neural Network (GAP-NN) to improve the accuracy of soil moisture measurements.
Apart from real-time monitoring of target objects, multi-data fusion has also been applied to prediction and estimation tasks. Xueying and others [27] proposed a multi-source data feature fusion method based on deep learning, aimed at solving the multi-feature contribution differential analysis problem in soil carbon content prediction using VNIR and HIS technologies. Given the complexity of vehicle driving conditions, Jihao and others [28] constructed a slope estimation algorithm based on multi-model and multi-data fusion to enhance the vehicle's ability to track actual road slope changes in real time. In the context of maritime activities, Ye and others [29] proposed an Adaptive Data Fusion (ADF) model based on multi-source AIS data for predicting ship trajectories in maritime traffic. In complex industrial processes such as sintering, Yuxuan and others [30] proposed a sintering quality prediction model based on the fusion of industrial camera video data and process parameters. Additionally, for predicting water quality in urban sewer networks, Yiqi and others [31] established a deep learning method based on multi-data fusion, considering environmental, social, and water quantity indicators together with monitorable water quality standard indicators.
Multi-data fusion technology continues to evolve. For instance, Bo and others [32] proposed a Digital Twin Model (DTM) based on Transfer Learning and Multi-Source Data Fusion (DTM-TL-MSDF). This method effectively integrates experimental and simulation data, aiming to construct an accurate digital twin model for real-time monitoring of structural strength. Sizhe and others [33] developed a novel GeoAI research method that performs deep machine learning from multi-source geospatial data to effectively detect natural features. Moreover, Nan and others [34] proposed a new architecture for a Trusted Execution Environment with integrated blockchain capabilities, aimed at improving the efficiency of multi-source data fusion processing under business scenario constraints.
Recent research on multi-data fusion has focused mainly on monitoring or predicting data under dynamic change, and relatively few studies address the location of logistics centers. Yet with the rapid development of the urban logistics industry, a single factor is no longer sufficient to determine an optimal logistics center location. Making a reasonable location decision in this complex setting requires evaluating several decision-making factors, especially geographic information data. Geographic information technology is a means of acquiring, processing, and analyzing spatial data, so it can be used to collect, analyze, and apply data on the location of logistics centers. This paper therefore adopts a multidimensional data fusion approach from the perspective of geographic information fusion, focusing on geographically relevant indicators around the logistics nodes, such as logistics and distribution coverage, equilibrium, and urban congestion, in order to construct a more accurate logistics location model.
Moreover, when constructing logistics nodes, this paper gives special consideration to transportation fluency. To this end, five selection objectives are defined: a low operation rate, a low rate of change of traffic congestion, a high coverage rate of nodes within 3 kilometers, a short distance between logistics parks and the nearest city-level nodes, and high efficiency of cargo transportation. Meanwhile, from the perspective of the overall logistics system, and with a focus on the balance and rationality of system operation, the decentralization of primary nodes and the aggregation of secondary nodes are also established as selection objectives. This study then establishes a multi-objective node selection model and applies cluster analysis to partition the entire region into small regions of similar size. Within each small region, a clustering center is identified, which serves as the initial primary node. The clustering centers are then adjusted dynamically by the K-means algorithm: the distance (that is, the similarity) between each of the other nodes in a small region and the initially determined first-level node is calculated, and the clustering results are optimized through repeated iterations to minimize the sum of squared distances from all nodes to their respective cluster centers. This determines the final first-level nodes and their jurisdictional areas within 3 km. Compared with other models, this model not only obtains the optimal solution faster but also performs better in balancing the optimal distribution and coverage of logistics centers, which leads to a more reasonable site selection scheme.
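The clustering step described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes planar node coordinates already expressed in kilometers, takes the initial primary nodes as given, and ignores the other selection objectives. The function names `assign`, `kmeans`, and `coverage_rate`, as well as the toy coordinates, are hypothetical.

```python
import math


def assign(points, centers):
    """Return, for each point, the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]


def kmeans(points, init_centers, max_iter=100):
    """Plain K-means (Lloyd's iteration) starting from the given centers.

    Returns the final centers, the cluster label of each point, and the
    sum of squared distances from each point to its cluster center.
    """
    centers = [tuple(c) for c in init_centers]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # Move the center to the mean of its assigned points.
                new_centers.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                # Keep the previous center if a cluster becomes empty.
                new_centers.append(old)
        if new_centers == centers:  # centers stopped moving: converged
            break
        centers = new_centers
    labels = assign(points, centers)
    sse = sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))
    return centers, labels, sse


def coverage_rate(points, centers, radius_km=3.0):
    """Share of points lying within radius_km of their nearest center."""
    covered = sum(1 for p in points
                  if min(math.dist(p, c) for c in centers) <= radius_km)
    return covered / len(points)


# Hypothetical toy data: six nodes (km) and two initial primary nodes.
nodes = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
         (10.0, 10.0), (11.0, 10.0), (10.0, 11.0)]
initial_primary = [(0.0, 0.0), (10.0, 10.0)]

centers, labels, sse = kmeans(nodes, initial_primary)
coverage = coverage_rate(nodes, centers)
```

In this sketch each center is a centroid and need not coincide with an actual node. If the primary node must be an existing site, one option would be to snap each centroid to the nearest candidate node afterwards; that step is an assumption and is not shown here.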