3.2. Analysis of The Leading Countries
The national collaboration network reveals the importance of individual countries and the interconnections among countries in a specific research field. The national collaboration network (Figure 2), based on WoS Core Collection data, shows that 48 countries have published research on groundwater DNAPL contamination modeling. The countries with 10 or more publications are listed in Table 1.
In terms of publication volume, the United States leads the field of groundwater DNAPL contamination modeling with 307 publications, followed by Canada with 112. This is related to the fact that most DNAPL-contaminated sites are located in the United States and Canada [13]. China ranks third with 82 publications but started later than the United States and Canada, with relevant research beginning in 2003. Among the top ten countries by publication volume, the United States, Canada, Germany, and the Netherlands started research early, while most of the other countries began their work after 2000.
Centrality is an indicator that reflects the importance of a country within a collaboration network. A centrality greater than 0.1 indicates that the country plays a crucial role in advancing the research field. The United States has a centrality of 0.95 in the national collaboration network, significantly higher than any other country. England follows with a centrality of 0.23. Other countries with a centrality greater than 0.1 include China (0.18), France (0.16), Turkey (0.16), Greece (0.15), and Scotland (0.11). These high-centrality countries are marked with a purple outer ring in Figure 2. Although Canada has a high publication volume, its influence in the network is relatively weak, with a centrality of 0.09.
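To make the centrality indicator concrete, the following minimal sketch computes normalized betweenness centrality for a small undirected network. It assumes that the centrality reported above is the betweenness centrality commonly used by bibliometric mapping tools; the five-node network and its labels are hypothetical and are not taken from Figure 2.

```python
from collections import deque


def betweenness_centrality(adj):
    """Normalized betweenness centrality (Brandes' algorithm) for an
    undirected, unweighted graph given as {node: set_of_neighbors}."""
    nodes = list(adj)
    score = {v: 0.0 for v in nodes}
    for s in nodes:
        # Breadth-first search from s: count shortest paths (sigma)
        # and record the predecessors on those paths.
        order = []
        preds = {v: [] for v in nodes}
        sigma = {v: 0 for v in nodes}
        dist = {v: -1 for v in nodes}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest nodes inward.
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    n = len(nodes)
    # Every undirected pair was counted twice, so dividing by
    # (n - 1)(n - 2) normalizes by the number of node pairs.
    scale = 1.0 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {v: score[v] * scale for v in nodes}


# Hypothetical collaboration network: A is a hub, D bridges to E.
toy_network = {
    "A": {"B", "C", "D"},
    "B": {"A"},
    "C": {"A"},
    "D": {"A", "E"},
    "E": {"D"},
}
centrality = betweenness_centrality(toy_network)
key_nodes = sorted(v for v, c in centrality.items() if c > 0.1)
```

In this toy case the hub A lies on five of the six shortest paths between other node pairs, so it exceeds the 0.1 threshold by a wide margin, while the peripheral nodes B, C, and E score zero regardless of how many links (publications) they might represent.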
Overall, the United States leads in DNAPL contamination modeling. This can be attributed to its advanced chemical industry, which has produced more contaminated sites and drawn significant scholarly attention, and to its strong economy, which provides substantial support for related research. The work of American scholars can therefore offer valuable guidance for mastering and applying DNAPL models.
3.3. Analysis of The Leading Institutions
A total of 365 institutions have published research related to DNAPL models, according to the analysis of selected publications. The top 25 institutions with more than 10 publications were selected to create a co-citation network map (Figure 3) for visual analysis.
In terms of timeline and publication volume, the United States Department of Energy (DOE), Queen’s University, and the University of Waterloo are the top three institutions, with earlier research and higher publication numbers (38, 29, and 27 papers, respectively). They have also continued to produce relevant research in recent years. Institutions in China and France started later but have recently focused more on DNAPL model research, achieving notable results. In DNAPL contamination modeling, China is represented by Nanjing University and Jilin University, with 24 and 22 publications, respectively.
Spatially, most research institutions with higher output are located in the United States, followed by Canada, which aligns with the earlier analysis of publications by country. In contrast, only Nanjing University and Jilin University stand out in DNAPL model research in China, indicating that DNAPL models are not yet widely adopted. Additionally, there is significant collaboration among U.S. institutions, but international collaboration remains noticeably limited. Therefore, future research should focus on two key areas: (1) increasing global awareness of groundwater DNAPL contamination and promoting the application of DNAPL models; and (2) enhancing international collaboration to leverage research experience and drive innovation in DNAPL modeling technologies.
3.5. Analysis of Research Hotspots and Trends Based on Keyword Clustering
3.5.1 Research Hotspots in Modeling DNAPL Contamination
The DNAPL model framework can be broadly divided into three parts: the multiphase flow model, the mass transfer model, and the dissolved phase transport model. The model can be integrated with experimental findings to clarify DNAPL migration in groundwater and predict contamination distribution, aiding site remediation efforts. To better understand the research status in DNAPL contamination modeling, a keyword co-occurrence analysis was conducted, followed by clustering using the log-likelihood ratio (LLR) algorithm to reveal current research hotspots. The keyword clustering yielded a modularity (Q) value of 0.4169 and a mean silhouette (S) value of 0.7499 (Q > 0.3 indicates a significant cluster structure; S ≥ 0.7 indicates highly reliable clustering). The 10 clusters related to DNAPL models (Figure 5) highlight five prominent keywords each and exhibit interconnections between clusters.
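As a rough illustration of what the Q value measures, the sketch below computes Newman's modularity for a given partition of a small network. It assumes that the reported Q is this standard modularity measure; the keyword network (two triangles joined by one edge) and the node names are hypothetical.

```python
def modularity(edges, community_of):
    """Newman modularity Q = sum_c [ L_c / m - (d_c / (2 m))**2 ],
    where L_c is the number of edges inside community c, d_c is the
    total degree of its nodes, and m is the total number of edges."""
    m = len(edges)
    internal = {}
    degree = {}
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (degree[c] / (2 * m)) ** 2 for c in degree
    )


# Hypothetical co-occurrence network: two tight groups, one bridging edge.
toy_edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
two_clusters = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_cluster = {node: 0 for node in two_clusters}

q_two = modularity(toy_edges, two_clusters)
q_one = modularity(toy_edges, one_cluster)
```

Splitting the toy network into its two natural groups gives Q = 5/14, which is above the 0.3 threshold quoted above, whereas lumping all nodes into a single cluster gives Q = 0, meaning no structure beyond what a random network with the same degrees would show.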
Cluster 1 (#0 multiphase flow): The multiphase flow model describes DNAPL multiphase migration and is the foundation of DNAPL modeling research. Due to their high density and low viscosity, DNAPLs infiltrate deep into the subsurface, pass through the unsaturated zone, contaminate aquifers, and accumulate on the aquitard, forming pools [14]. In this process, DNAPLs remain as a NAPL phase and slowly dissolve into the groundwater, becoming a source of dissolved DNAPLs. Many researchers use multiphase flow models to simulate this process and determine DNAPL distribution in the source zone. Research indicates that the key factors to consider when setting up multiphase models include pollutant release conditions [15,16,17], aquifer heterogeneity [18,19,20], constitutive relations representing the permeability-saturation-capillary pressure ($k$-$S$-$P$) correlation [21,22,23], and groundwater flow velocity [14,24].
Cluster 2 (#1 reductive dechlorination): On the one hand, researchers focus on chlorinated organic solvents to enhance reductive dechlorination techniques. Fagerlund et al. [25] used experiments and modeling to study the coupled process of PCE dissolution and dechlorination by nanoscale zero-valent iron in DNAPL source zones. On the other hand, given the common occurrence of PCE and TCE contamination sites, most studies focus on simulating and predicting these pollutants [26,27,28].
Cluster 3 (#2 mass transfer): Understanding the mass transfer mechanism of DNAPLs from the NAPL to the dissolved phase and establishing an appropriate expression is crucial for source-sink terms in plume modeling. A typical empirical rate-limited expression based on dissolution kinetics is:
$$J = k_l \, a \, (C_s - C)$$
where $J$ is the mass flux of dissolution from the NAPL phase to the aqueous phase, [ML$^{-3}$T$^{-1}$]; $k_l$ is the average mass transfer coefficient at the NAPL-water interface, [LT$^{-1}$]; $a$ is the effective specific interfacial area between the NAPL phase and the aqueous phase, [L$^{-1}$]; $C_s$ is the equilibrium aqueous phase concentration, also known as the effective solubility, [ML$^{-3}$]; and $C$ is the aqueous phase concentration, [ML$^{-3}$].
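The rate-limited dissolution term defined above can be sketched numerically. The example below assumes the usual linear driving force form (flux equal to the mass transfer coefficient times the specific interfacial area times the difference between the effective solubility and the aqueous concentration) and applies it to a closed, well-mixed water volume with a constant interfacial area. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def dissolution_flux(k, a, c_sat, c):
    """Rate-limited dissolution flux [M L^-3 T^-1], assuming a linear
    driving force: k [L/T] * a [1/L] * (c_sat - c) [M/L^3]."""
    return k * a * (c_sat - c)


def batch_concentration(k, a, c_sat, c0, t_end, dt):
    """Explicit Euler integration of dC/dt = J in a closed, well-mixed
    volume (no advection, constant interfacial area)."""
    c = c0
    for _ in range(round(t_end / dt)):
        c += dt * dissolution_flux(k, a, c_sat, c)
    return c


# Hypothetical, illustrative parameter values (SI units).
K_L = 1.0e-5     # mass transfer coefficient, m/s
A_SPEC = 50.0    # specific interfacial area, 1/m
C_SAT = 0.2      # effective solubility, kg/m^3
T_END = 4000.0   # simulated time, s

c_numeric = batch_concentration(K_L, A_SPEC, C_SAT, 0.0, T_END, dt=1.0)
# For constant k and a, the batch problem has a closed-form solution.
c_exact = C_SAT * (1.0 - math.exp(-K_L * A_SPEC * T_END))
```

The flux is largest when the water is clean and falls to zero as the aqueous concentration approaches the effective solubility, which is why the product of the coefficient and the interfacial area controls how quickly near-source concentrations build up.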
Numerous studies have focused on optimizing mass transfer coefficients to improve the analysis and mathematical representation of the DNAPL dissolution process [29,30,31,32]. These coefficients are then incorporated into solute transport models to achieve accurate estimates of contaminant concentrations near source zones. To enable site-scale simulations, Parker and Park [33] developed an empirical expression for effective mass transfer coefficients under pseudo-steady-state conditions, providing a valuable reference for subsequent research [34,35,36,37,38].
Cluster 4 (#3 dense non-aqueous phase liquids): DNAPLs naturally form a cluster as a research focus. However, it is noteworthy that keywords such as “tomography”, “spectral induced polarization”, and “conductivity” appear under this cluster. This highlights that coupling geophysical multi-source data for DNAPL contamination modeling has become a key area of interest for scholars. Power et al. [39] developed a DNAPL-ERT numerical model by integrating Electrical Resistivity Tomography (ERT). This model calculates the resistivity response to key hydrogeological parameters (hydraulic permeability, porosity, clay content, groundwater salinity and temperature, and air, water, and DNAPL contents evolving with time), which enhances the sensitivity to heterogeneity in DNAPL distribution and soil structure. Kang et al. [40,41,42] coupled geophysics with DNAPL models and integrated them into various inversion frameworks, which improved source zone characterization.
Cluster 5 (#4 partial mass depletion): Mass depletion refers to the gradual dissolution and eventual depletion of the NAPL phase in the source zone over time. Similar to mass transfer, this cluster describes the conversion of the NAPL phase to the dissolved phase. However, the difference is that the mass transfer model links the multiphase flow model with the dissolved phase transport model, whereas the source strength function, which characterizes NAPL mass depletion, acts as a source term in the dissolved phase transport model. This simplifies the dissolution process in the source zone and reduces model complexity. The source strength function is typically a power-law relationship between the effluent concentration and the remaining DNAPL mass. The source strength function proposed by Falta et al. [43] has been widely adopted:
$$\frac{C_s(t)}{C_0} = \left(\frac{M(t)}{M_0}\right)^{\Gamma}$$
where $C_s(t)$ and $M(t)$ correspond to the DNAPL concentration in the source zone and the residual DNAPL mass at time $t$, respectively; $C_0$ and $M_0$ are the DNAPL concentration in the source zone and the residual DNAPL mass at the initial time; and $\Gamma$ is a model parameter.
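A short sketch can show how the power-law exponent shapes source depletion. The code below pairs the power-law source strength function described above with a simple mass balance in which mass leaves the source only by dissolution into a constant water flow; that closure, the function names, and all parameter values are assumptions made for illustration and are not taken from the cited work.

```python
import math


def source_concentration(mass, m0, c0, gamma):
    """Power-law source strength: C / C0 = (M / M0) ** gamma."""
    return c0 * (max(mass, 0.0) / m0) ** gamma


def remaining_mass(m0, c0, gamma, q, t_end, dt):
    """Explicit Euler integration of the assumed mass balance
    dM/dt = -q * C(M), with q the water flow through the source zone."""
    mass = m0
    for _ in range(round(t_end / dt)):
        mass -= dt * q * source_concentration(mass, m0, c0, gamma)
        mass = max(mass, 0.0)  # the source cannot hold negative mass
    return mass


# Hypothetical, illustrative values.
M0 = 1000.0   # initial DNAPL mass, kg
C0 = 0.1      # initial source concentration, kg/m^3
Q = 100.0     # water flow through the source zone, m^3/yr
T_END = 50.0  # simulated time, yr

mass_by_gamma = {
    gamma: remaining_mass(M0, C0, gamma, Q, T_END, dt=0.01)
    for gamma in (0.5, 1.0, 2.0)
}
# With gamma = 1 the mass balance reduces to first-order decay.
mass_exact_gamma_1 = M0 * math.exp(-Q * C0 * T_END / M0)
```

Under these assumptions, a larger exponent makes the source concentration fall faster as mass is removed, so the remaining mass after a fixed time is larger and the source persists longer; a smaller exponent keeps concentrations high until the source is nearly exhausted.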
Cluster 6 (#5 source zone): This cluster focuses on the inversion and identification of DNAPL source zones in saturated aquifers. Researchers in China have conducted extensive research in this area, with most efforts focused on improving the accuracy of DNAPL source zone inversion. For example, Kang et al. [40] proposed a joint inversion framework (CVAE-ESMDA) combining a convolutional variational autoencoder (CVAE) with the ensemble smoother with multiple data assimilation (ESMDA). This approach integrates multiple data sources (OHT, downstream DNAPL concentrations, and ERT) to more accurately estimate DNAPL saturation in the source zone. Wang et al. [44] combined the ensemble Kalman filter with an improved butterfly optimization algorithm, improving inversion accuracy and effectiveness.
Cluster 7 (#6 uncertainty analysis): Inversion of DNAPL sources or optimization of remediation strategies based on simulation-optimization methods often involves uncertainty, requiring repeated model runs and high computational costs. Therefore, many studies develop surrogate models to reduce computational load and conduct uncertainty analysis. Hou et al. [45] developed an integrated surrogate model based on support vector regression (SVR), kriging, and kernel extreme learning machine (KELM). The homotopy-differential evolution (DE) algorithm was then combined with the surrogate for source inversion and uncertainty analysis, significantly improving identification accuracy. Du et al. [46] developed a fast-running convolutional neural network (CNN) surrogate model to identify the optimal surfactant-enhanced aquifer remediation (SEAR) scheme under uncertainty, improving optimization speed by 99.8% in 3D numerical experiments.
Cluster 8 (#7 immiscible displacement): In DNAPL contamination scenarios, immiscible displacement describes the relative movement between the NAPL and water phases at the small-scale, heterogeneous pore level, driven by differences in gravity and viscosity, leading to NAPL displacing water and migrating downward. At the macro level, this corresponds to the multiphase flow described in Cluster 1. However, while multiphase flow simulation based on continuum models can statistically characterize heterogeneity at the macro level, it has difficulty capturing the displacement behavior between the NAPL and groundwater phases at the pore scale. Therefore, some researchers have developed models specifically for immiscible displacement at the pore scale. Trantham et al. [47] developed a Stochastic Aggregation Model (SAM) using an improved diffusion-limited aggregation (DLA) algorithm to simulate the displacement of groundwater by DNAPLs with both higher and lower viscosities than groundwater. Nsir et al. [48] developed a numerical simulator based on a discrete network model, using pore body and throat size parameters from the particle size distribution of real porous media. The simulated NAPL-water immiscible two-phase flow results matched experimental data.
Cluster 9 (#8 back diffusion): Back diffusion is the process where dissolved DNAPLs migrate into low-permeability zones, accumulate, and then diffuse back into the aquifer after a concentration reversal [49,50,51]. After source zone DNAPLs are depleted or isolated, back diffusion can occur once the contaminant concentration in the aquifer drops to a certain level, becoming a secondary pollution source and keeping plume concentrations above the maximum contaminant level (MCL) over time [4,52]. In recent years, back diffusion has gained attention due to its role in prolonging contamination persistence. Most studies simulate and explore factors influencing its occurrence, such as DNAPL solubility [53,54], soil heterogeneity [49], adsorption-desorption [55,56], and biodegradation [50,57,58,59,60]. Simulations of back diffusion are typically divided into two stages, marked by the removal or isolation of the source zone [49,52,61]. In the first stage, contaminants accumulate in low-permeability zones through forward diffusion. In the second stage, after source removal, the simulation continues to study back diffusion by observing plume tailing.
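The two-stage setup can be sketched with a one-dimensional diffusion model of a low-permeability layer. This is a deliberately simplified illustration rather than any of the cited models: it assumes pure Fickian diffusion with no sorption or degradation, represents the aquifer as a fixed-concentration boundary at the top of the layer, and uses hypothetical normalized parameter values.

```python
def diffuse(c, boundary, d_eff, dz, dt, steps):
    """Explicit finite-difference diffusion in a 1D low-permeability layer.

    c[0] is the aquifer/layer interface, held at `boundary`;
    the last node is a no-flux base.
    """
    r = d_eff * dt / dz ** 2
    assert r <= 0.5, "explicit scheme is unstable for this time step"
    c = list(c)
    for _ in range(steps):
        c[0] = boundary
        new = c[:]
        for i in range(1, len(c) - 1):
            new[i] = c[i] + r * (c[i + 1] - 2.0 * c[i] + c[i - 1])
        # Mirror node at the base enforces a zero concentration gradient.
        new[-1] = c[-1] + 2.0 * r * (c[-2] - c[-1])
        c = new
    return c


def interface_flux(c, d_eff, dz):
    """Diffusive flux at the interface: positive into the layer,
    negative back toward the aquifer."""
    return -d_eff * (c[1] - c[0]) / dz


# Hypothetical values: 1 m thick layer, concentrations normalized to 1.
D_EFF = 3.0e-3  # effective diffusion coefficient, m^2/yr
DZ = 0.02       # grid spacing, m
DT = 0.05       # time step, yr
clean_layer = [0.0] * 51

# Stage 1: source active, aquifer concentration held at 1 for 20 years.
loaded = diffuse(clean_layer, 1.0, D_EFF, DZ, DT, steps=400)
flux_loading = interface_flux(loaded, D_EFF, DZ)
stored_loading = sum(loaded) * DZ

# Stage 2: source removed, aquifer concentration held at 0 for 10 years.
released = diffuse(loaded, 0.0, D_EFF, DZ, DT, steps=200)
flux_back = interface_flux(released, D_EFF, DZ)
stored_release = sum(released) * DZ
```

The sign of the interface flux reverses between the two stages: mass enters the layer while the source is active and returns to the aquifer once the boundary concentration drops. A substantial part of the stored mass remains in the layer after the second stage, which is the mechanism behind the plume tailing noted above.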
Cluster 10 (#9 contaminant mass discharge): The study of contaminant mass discharge is often closely linked to the mass transfer models in Cluster 3. Considering the challenges and costs of simulating DNAPL dissolution at the field scale, researchers have developed upscaled mass transfer models with domain-averaged coefficients to approximate real-site dissolution processes [34]. Simplified upscaled models, linked to mass discharge flux, can serve as an effective screening tool for evaluating source zone management strategies [36].
In summary, the distribution of key research hotspots identified from this review of the current state of research is presented in Figure 6.
3.5.2 Assessment of Future Research Trends
The top 20 keywords with the highest burst strength from 1993 to 2023 are identified (Table 3). Based on the analysis of publication volume, the 30-year period can be divided into three stages: the initial budding stage (1993-1999), the rapid development stage (2000-2010), and the continuous breakthrough stage (2011-2023). The main keywords during the initial budding stage were “two-phase flow”, “multiphase flow”, and “contaminant transport”, with research focusing on the simulation of DNAPL multiphase migration. During the rapid development stage, more detailed descriptions were considered, including shifts in contaminant types (from TCE to PCE), applications in heterogeneous sites, deeper exploration of mass transfer processes, and increased emphasis on NAPL depletion in the source zone. In the continuous breakthrough stage, the focus shifted toward contaminant removal and remediation, the impact of permeability, back diffusion in low-permeability zones, and the integration of geophysical techniques for source zone identification.
From a research trend perspective, DNAPL modeling has developed into a well-established system, with further studies primarily focused on optimizing model details based on experimental findings. Future research directions include: (1) investigating back diffusion mechanisms and exploring methods to reduce plume persistence; (2) applying models to real sites for source zone characterization and prediction of pollution distribution; and (3) optimizing remediation strategies to enhance effectiveness and reduce costs.