Reservoir simulation software packages are continuously modernized to meet the current needs for large-scale data management and to exploit ever-growing computing power. Even so, simulations are still not fast and robust enough, in the sense that they entail high computational costs. This creates the need for more time-efficient smart tools that can adapt and provide fast, competent predictions which mimic the real reservoir performance within an acceptable error margin. In this section, ML is employed as a means to accelerate individual simulation runs, which can in turn support any desired HM or production-related calculations, by using two approaches.
Firstly, fast proxy models of the reservoir simulator (also known as SRMs) have been proposed, which can be used to answer a wide range of engineering questions in a fraction of the time that would otherwise be required. Secondly, ML has been utilized to accelerate specific CPU time-intensive sub-problems while maintaining the rigorous differential equation-solving method. The most prominent application in this category is the handling of the phase equilibrium problem, in its black oil or compositional form, which needs to be solved numerous times during the course of a reservoir simulation run.
SRMs that use ML and pattern recognition methods to fully replace the non-linear solver were first proposed by Mohaghegh and his associates, who developed SRMs that could reproduce traditional black oil or compositional reservoir simulation results (i.e., high-fidelity models) on a cell basis without sacrificing the physics or the order of the system under investigation, as is the case with RFM and ROM methods, respectively. Instead, they built grid-based and well-based proxy models. Grid-based models usually provide pressure and saturation predictions for the fluid phases at the grid level based on information from the surrounding grid blocks rather than the whole reservoir. This way, the very weak dependency of the state of a cell on the cells far away from it is ignored, which at the same time decouples the state of each cell from the rest of the grid. Well-based models are developed similarly to predict well-related parameters, such as gas, oil and/or water production rates, Bottom Hole Pressures (BHPs), etc.
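The idea of feeding a grid-based proxy with information from the surrounding grid blocks only can be illustrated with a minimal sketch. The function name, the 2-D five-point neighbourhood and the boundary treatment are assumptions made purely for illustration; they are not taken from the cited SRM work.

```python
def neighbor_features(grid, i, j):
    """Collect the value at cell (i, j) and at its four face neighbours.

    grid is a 2-D list of cell values (e.g., pressure at the previous step).
    Neighbours that fall outside the grid are replaced by the centre value,
    an assumed boundary treatment that keeps the feature vector length fixed.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    centre = grid[i][j]
    features = [centre]
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ni, nj = i + di, j + dj
        inside = 0 <= ni < n_rows and 0 <= nj < n_cols
        features.append(grid[ni][nj] if inside else centre)
    return features
```

Each cell then yields one fixed-length input row for the proxy, regardless of the size of the full reservoir grid.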
2.2. Machine Learning Methods for Handling the Stability and Phase Split Problems
For the case of compositional simulations, several ML methods have been developed, aiming at reducing the excessively long time required for solving stability and flash calculations. In the first case, the phase stability problem is expressed as a classification one to determine the number of phases for any given composition and pressure and temperature values. For flash calculations, ML applications are oriented toward predicting the k-values needed for those calculations in a more robust, efficient and rapid way.
The phase stability-targeted methodology was first proposed by Gaganis et al. [47], who used Support Vector Machines (SVMs) to generate a discriminating function that replicates the phase boundary. This discriminating function is zero at the boundary, positively signed (+1) inside the phase envelope and negatively signed (-1) outside it. The authors obtained the dataset to train the classifier in an automated, offline way by running regular stability tests for various uniformly drawn random combinations of composition (selected to span the whole compositional space), pressure and temperature values, and by labeling each mixture as stable or unstable accordingly. That way, they obtained fast stability predictions which are the same as those obtained by the conventional minimum Tangent Plane Distance (TPD) ones, since the classifier provides correct answers for both classes based on the sign of the predicted discriminating function. Later, Gaganis et al. [7,48] expanded their research and addressed both the phase stability and phase split problems by combining SVMs for classification and ANNs for regression, respectively, in a single prediction system. A single-layer ANN predicts the k-values only if the classifier predicts an unstable mixture. To further accelerate calculations, reduced variables were used to shrink the output. This way, the number of outputs to be predicted was at least three and always less than the number of mixture components. The ANN-predicted reduced variables are then back-transformed to regular k-values. The results demonstrated that the proposed methodology is very efficient with respect to both accuracy and computational cost reduction, and that its applicability can be extended from reservoir simulation to any kind of fluid flow simulation that demands numerous phase behavior calculations. After that, Gaganis [49] proposed an even more efficient treatment of the stability problem by means of two custom discriminating functions, dA and dB, each single-sided correct. If dA is positive, the sample is definitely stable. If dB is positive, the sample point is definitely unstable. No concrete answer can be obtained if neither of the two is positive. However, as dA and dB are built so that the ambiguous space, called “the grey area” (where neither discriminating function is positive), is as narrow as possible, the need to run a conventional stability test is hugely reduced. Furthermore, kernel functions are utilized to allow for simple curved, non-linear discriminating functions which can be evaluated rapidly. The method is greedy in that dA and dB can replace the lion’s share of the required stability calculations in a simulation run. Conventional, iterative calculations are only needed for points lying within the grey area.
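The decision logic of the two single-sided discriminating functions can be sketched as follows. This is only an illustration of the control flow described above: the function names, the toy one-dimensional "state" and all numbers are hypothetical, and the real dA and dB are kernel-based functions of composition, pressure and temperature.

```python
def classify_stability(state, d_a, d_b, conventional_test):
    """Return 'stable' or 'unstable' for a given state.

    d_a, d_b: callables returning the two discriminating function values.
    conventional_test: fallback callable, used only in the grey area.
    """
    if d_a(state) > 0.0:
        return "stable"      # d_a is trusted only when it is positive
    if d_b(state) > 0.0:
        return "unstable"    # d_b is trusted only when it is positive
    return conventional_test(state)  # grey area: neither is positive


# Toy 1-D illustration (hypothetical): the true boundary is at pressure 100,
# and each discriminant is deliberately conservative on its own side.
calls = []

def toy_test(p):
    calls.append(p)  # record how often the expensive test is needed
    return "stable" if p >= 100.0 else "unstable"

def toy_d_a(p):
    return p - 110.0

def toy_d_b(p):
    return 90.0 - p

labels = [classify_stability(p, toy_d_a, toy_d_b, toy_test)
          for p in (80.0, 95.0, 105.0, 120.0)]
```

In this toy run only the two points between 90 and 110 fall in the grey area and reach the conventional test; narrowing that band is what reduces the number of iterative stability calculations.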
Kashinath et al. [50] moved in the same direction as Gaganis et al. [7,48], treated the stability problem as a binary classification one and tailored it to CO2 flooding simulations. They developed two SVMs, one to determine whether the fluid under study is in the supercritical phase and a second one to predict the number of stable phases when in the subcritical region. If the second classifier predicts an unstable mixture, an ANN model is used to predict the prevailing k-values. Therefore, the authors divided the problem into three parts: (1) supercritical phase determination, since this entails a large calculation burden when using an EoS, (2) subcritical phase stability, and (3) the phase-split problem. In this method, the authors utilized a negative flash algorithm to create a phase diagram that differentiates the subcritical and supercritical areas to determine the fluid properties of the latter. The resulting compositional phase diagrams are then used to generate a training data set for the ML models. Both SVM classifiers take composition and pressure as inputs. Finally, the phase-split problem is handled by predicting k-values for sets of pressure and composition data using an ANN. The results showed that the models can effectively cut down the overall CPU time required for compositional reservoir simulations, at the cost of a very limited decline in accuracy. Schmitz et al. [51] developed a classification method using ANN models to extend the previous approach and solve the multiphase stability problem. The authors examined two classification models, a feed-forward and a probabilistic ANN. The training set was collected for pressure and temperature ranges corresponding to liquid–liquid, vapor–liquid–liquid, vapor–liquid, and homogeneous liquid and vapor states, so that the trained models can distinguish these five regions. The results showed that the proposed models could predict the equilibrium state with high precision. Gaganis et al. developed a similar technique to rapidly solve the multiphase stability problem using SVMs [52]. Wang et al. [53] developed two ANN models to treat the stability (ANN-STAB) and phase split (ANN-SPLIT) problems, in a process similar to that of Kashinath et al. For the ANN-STAB model to learn whether a given mixture at given conditions is stable or unstable, the authors generated two auxiliary models, one for predicting the upper saturation curve and one for the lower. That way, they could compare the prevailing pressure with the mixture’s saturation pressure to determine whether the mixture lies inside or outside the two combined saturation pressure curves. If the ANN-STAB model indicates instability, the ANN-SPLIT model is called to predict the mole fractions and k-values, which are utilized as initial values in conventional phase split calculations. The results showed that the proposed models provide initial estimates of high accuracy, while they also achieve significantly shorter computational time.
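To show where ML-predicted k-values enter a conventional phase split calculation, the following minimal sketch solves the standard Rachford-Rice equation for fixed k-values. The Rachford-Rice formulation is brought in here as the textbook two-phase split relation; it is not quoted from the studies above, and the bisection solver and function name are illustrative choices.

```python
def rachford_rice(z, k, tol=1e-12, max_iter=200):
    """Two-phase split for feed composition z and fixed k-values k.

    Solves sum_i z_i (K_i - 1) / (1 + beta (K_i - 1)) = 0 for the vapor
    mole fraction beta by bisection, then returns (beta, x, y).
    """
    def g(beta):
        return sum(zi * (ki - 1.0) / (1.0 + beta * (ki - 1.0))
                   for zi, ki in zip(z, k))

    lo, hi = 0.0, 1.0
    # g decreases with beta; a root in (0, 1) needs g(0) > 0 and g(1) < 0.
    if g(lo) <= 0.0 or g(hi) >= 0.0:
        raise ValueError("k-values do not give a two-phase solution")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    beta = 0.5 * (lo + hi)
    x = [zi / (1.0 + beta * (ki - 1.0)) for zi, ki in zip(z, k)]  # liquid
    y = [ki * xi for ki, xi in zip(k, x)]                         # vapor
    return beta, x, y
```

In a full flash, this inner solve alternates with an EoS-based update of the k-values until convergence, so a good initial k-value estimate from an ANN mainly reduces the number of outer iterations.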
Apart from the simple ANN models that have been reviewed so far, there are several proposed approaches based on Deep Learning (DL) methods. DL is a subset of the ML family widely used in cases of extremely large reservoir fields. Roughly speaking, ANNs are considered DL networks if they consist of more than three layers, including the input, hidden and output ones. Unlike regular ANNs, DL ANNs can digest unstructured data in its raw form, like text and images, and they can automatically determine the set of variables that can distinguish the desired output for regression, classification and clustering tasks. By observing patterns in the data, a DL model can cluster inputs appropriately, by discovering hidden patterns without the need for the user’s intervention. Most DL ANNs are feed-forward meaning that the information is transferred from the input to the output. Back-propagation is used to calculate and attribute the error associated with each neuron to adjust and fit the algorithm appropriately.
Li et al. [54] developed a DL ANN to accelerate flash calculations for a binary (methane/ethane) mixture and compared that model against three classic methods (Successive Substitution (SS), Newton’s method and the sparse grids method). The input consisted of the critical pressure, critical temperature and acentric factor of both mixture components, as well as temperature and pressure values, and the output consisted of the mole fraction of the first component in the liquid and vapor phases. The proposed DL model was found to be significantly faster and more efficient than the SS, Newton’s and sparse grids methods.
In another study, Li et al. [55] developed a single DL ANN to approximate multicomponent stability test and phase split calculations, using results obtained from a conventional iterative NVT flash calculator (specified moles, volume and temperature) as a training dataset. To achieve an integrated stability and phase split DL ANN, the authors used a training dataset that incorporated component properties (critical pressure and temperature, acentric factor, etc.), overall mole fractions, overall molar concentration and temperature as input, and the number of phases and the mole fractions of the components in the vapor and liquid phases as output. Therefore, by using a single trained DL ANN, they were able to solve the phase stability and phase split problems simultaneously, so that the phase state can be identified without an additional stability test. The proposed model can successfully estimate the different phase states in the subcritical region of a given mixture and can make significantly faster predictions than the conventional NVT flash calculator.
For very low permeability, unconventional reservoirs, flash calculations are coupled with substantial capillary pressure effects (very narrow pore throats, thus large capillary pressure at the vapor-liquid interface), and they tend to be extremely computationally burdensome, as well as unstable. In that case, conventional compositional simulations can become a difficult task. Wang et al. [56] worked in the DL field and developed two multi-layered, stochastically trained ANN models to predict the phase behavior of hydrocarbon mixtures in such unconventional reservoirs. The first ANN classifies the phase state of the system (stable/unstable) and the second, which is called only if the first indicates an unstable mixture, predicts the k-values and the capillary pressure. The training dataset for the ANN models was generated from a standalone flash calculator and consisted of composition, pressure and temperature values, as well as pore radius data, all normalized to the [0,1] range before entering the networks. It was shown that the models were very efficient and, subsequently, the predicted k-values were used as initial estimates in a conventional reservoir simulator, whose speed was significantly increased. In addition, Zhang et al. [57] developed a DL ANN, similar to the one of Li et al. [55], to predict phase states and phase compositions for multicomponent hydrocarbon mixtures in complex reservoirs with large capillary effects. The authors generated the training dataset using the results of an NVT flash calculator, which was developed based on the diffuse interface theory with a thermodynamically stable evolution algorithm, for a wide range of reservoir conditions. They also used the same input parameters as in their previous study (Li et al. [55]); however, they modified the output so that almost half of the parameters were replaced by a coefficient ϕ (the mole fraction of the vapor phase), aiming at securing the material balance. The only other outputs remaining are the mole fractions of the components in the vapor phase. This is considered by the authors to significantly improve the model’s training, particularly for highly complex fluids with many components. In addition, the model’s hyperparameters are adjusted to optimize its architecture and, hence, its efficiency. The results show that the model can provide precise predictions, with the authors claiming that the proposed workflow can be utilized for various mixtures, substantially accelerating flash calculations.
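One way to see why predicting ϕ together with the vapor mole fractions secures the material balance is that the liquid composition then follows from the component balance z_i = ϕ y_i + (1 - ϕ) x_i instead of being predicted independently. The sketch below is an interpretation of that idea under this assumption; the function name is hypothetical and the cited work may post-process its outputs differently.

```python
def liquid_composition(z, y, phi):
    """Recover liquid mole fractions from the component material balance.

    z: overall mole fractions, y: vapor mole fractions,
    phi: vapor phase mole fraction, required to satisfy 0 <= phi < 1.
    """
    if not 0.0 <= phi < 1.0:
        raise ValueError("phi must satisfy 0 <= phi < 1")
    # z_i = phi * y_i + (1 - phi) * x_i  ->  x_i = (z_i - phi * y_i) / (1 - phi)
    return [(zi - phi * yi) / (1.0 - phi) for zi, yi in zip(z, y)]
```

Because x is computed rather than predicted, the overall balance holds by construction whenever z and y each sum to one.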
Zhang et al. [58] were the first to develop a self-adaptive DL ANN to predict the number of phases present in multicomponent mixtures and their equilibrium thermodynamic properties (component mole fractions in each phase) under various reservoir conditions. As in their previous studies (Li et al. [55], Zhang et al. [57]), the authors used the results of an NVT flash calculator to generate the model’s training dataset, which consisted of the fluid’s composition, overall molar concentration and temperature values as input, and the total number of phases at equilibrium and the component mole fractions in each phase as output. The authors also used the critical properties of each component of the fluid under investigation as additional input to generalize the model’s capability. They developed a two-network structure to accelerate flash calculations for any number of components a user might select each time a new run is performed. The first network transforms the input, which may contain a varying number of mixture components, into a unified space before the second network is put in motion. “Ghost components” of zero concentration are introduced to complete the input vector for components which do not naturally appear in the mixture under study, so as to honor the fixed input vector size. This network structure makes the model self-adaptive when a different number of components (i.e., different model dimensionality) is considered. The results showed that the proposed model is capable of producing accurate predictions, while also reducing the computational burden that is usually imposed by the conventional methods.
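The role of the ghost components can be illustrated with a small padding helper. This sketch covers only the fixed-size input aspect, not the first network that maps the input to a unified space; the function name, the placeholder properties assigned to ghosts and the example values are assumptions for illustration.

```python
def pad_with_ghost_components(z, props, n_max, ghost_props=(1.0, 1.0, 0.0)):
    """Pad a mixture description to a fixed number of components.

    z: overall mole fractions of the real components.
    props: one (Tc, Pc, acentric factor) tuple per real component.
    ghost_props: placeholder properties for ghosts (a hypothetical choice;
    their mole fraction is zero, so they carry no material).
    """
    if len(z) != len(props):
        raise ValueError("z and props must have the same length")
    if len(z) > n_max:
        raise ValueError("more components than the fixed input size allows")
    n_ghost = n_max - len(z)
    z_padded = list(z) + [0.0] * n_ghost
    props_padded = list(props) + [ghost_props] * n_ghost
    return z_padded, props_padded


# Illustrative two-component mixture padded to a four-component input.
z_in, props_in = pad_with_ghost_components(
    [0.7, 0.3],
    [(190.6, 46.0, 0.011), (305.3, 48.7, 0.099)],  # approximate methane, ethane values
    n_max=4,
)
```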
Reservoir systems such as gas condensates, or systems where reinjection operations take place, are characterized by extremely time-consuming reservoir simulations due to the complex phase behavior phenomena taking place, especially in dry gas reinjection plans where gas recycling takes place inside the reservoir and, thus, the reservoir composition is constantly updated. Samnioti et al. [19] employed an ML approach using ANNs to accelerate those complex calculations by supplying the k-values at each time step and at each pair of prevailing pressure-temperature conditions, so as to solve the flash problem in a fraction of the time needed by conventional iterative methods. The ANN was trained using an ensemble of pressure, temperature and composition data as input and k-values as output, all obtained by running offline conventional reservoir simulations on a simplistic reservoir model (sugarbox). Although this process sounds straightforward, the reservoir composition displays large variability in the case of gas reinjection, thus imposing the need for a more extended compositional space compared to the typically used fixed composition one. To handle that, they proposed training the ANN with an extensive dataset obtained from the simulation of various gas recycling schemes, covering any possible composition changes that might occur inside the reservoir. As a result, the computational expense of the flash calculations was reduced by more than one order of magnitude, compared to the conventional iterative ones. Recently, Anastasiadou et al. [32] moved in a similar direction by addressing the phase stability problem, this time for an acid gas reinjection system where the required phase behavior calculations are more complex and time-consuming, since they need to be repeated for an even broader compositional space to account for the acid components (H2S and CO2) and the hydrocarbon contaminants that are being reinjected into the reservoir. The authors proposed three classification ML approaches, ANNs, Decision Trees (DTs) and SVMs, to solve the phase stability problem, which is crucial in acid gas reinjection designs, in a fraction of the time needed by conventional iterative methods. A large ensemble of training data was obtained by running the stability problem offline using a conventional method, and the dataset was then introduced to the classifiers. As a result, the recommended methodology was shown to be able to adapt to all types of acid gas flow simulations.
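The offline generation of a labeled training set can be sketched as below. This is a generic illustration, not the workflow of any cited study: the function names are hypothetical, the compositions are drawn uniformly over the compositional simplex by normalizing exponential variates, and `stability_test` stands for whatever conventional stability routine supplies the labels.

```python
import random


def sample_composition(n_components, rng):
    """Draw mole fractions uniformly over the simplex (they sum to one)."""
    draws = [rng.expovariate(1.0) for _ in range(n_components)]
    total = sum(draws)
    return [d / total for d in draws]


def build_dataset(n_samples, n_components, p_range, t_range,
                  stability_test, seed=0):
    """Return a list of (z, p, t, label) tuples labeled by stability_test."""
    rng = random.Random(seed)  # fixed seed keeps the dataset reproducible
    data = []
    for _ in range(n_samples):
        z = sample_composition(n_components, rng)
        p = rng.uniform(*p_range)
        t = rng.uniform(*t_range)
        data.append((z, p, t, stability_test(z, p, t)))
    return data
```

For reinjection studies the sampled ranges would have to cover the broader compositional space mentioned above, which is the main driver of dataset size.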
In cases where complicated systems are under investigation (e.g., CO2-EOR), the iterative algorithm in conventional reservoir simulators may fail to converge, since there are cases where the flash and the nonlinear solver cannot agree over which phase (gas or liquid) is present when a stability test labels the fluid as stable. For that reason, Sheth et al. [59] used stability test results and developed two ANN models, one classifier and one regressor, to accelerate EOR simulations, such as dry gas and CO2 re-injection, by predicting the fluid’s critical temperature. Hence, the authors’ main goal was to devise an efficient way to accurately predict that crucial value so as to determine the fluid phase state, and hence the correct viscosity and relative permeability values to utilize, thus preventing any problems that may arise when simulating the phase behavior of complex fluids. They ran several simulation scenarios and generated a relatively small compositional training dataset using a linear mixing rule between the injected and the in-situ fluid compositions, consisting of the final composition, pressures and temperatures as input and the corresponding critical temperatures as output. The first model (classifier) is used to identify whether a sequence of iterations will diverge and the second model (regressor) is used to predict the critical temperature for those iterations. The results showed that the proposed model yields critical temperature values comparable with the ones obtained from conventional simulators, while also significantly reducing the computational burden.
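A linear mixing rule between the in-situ and injected compositions can be written as z(α) = (1 - α) z_in-situ + α z_injected. The sketch below assumes this molar form with a mixing fraction α between 0 and 1; the function name and the example compositions are hypothetical, and the cited study may parameterize the mixing differently.

```python
def mix_compositions(z_in_situ, z_injected, alpha):
    """Linearly mix two compositions; alpha is the injected fraction."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(z_in_situ) != len(z_injected):
        raise ValueError("compositions must list the same components")
    return [(1.0 - alpha) * a + alpha * b
            for a, b in zip(z_in_situ, z_injected)]


# Sweep alpha to build a small family of mixed compositions.
mixtures = [mix_compositions([0.2, 0.8], [1.0, 0.0], a / 4.0) for a in range(5)]
```

Since both end members sum to one, every mixed composition does too, so the sweep stays on the line joining the two fluids in compositional space.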
2.3. Machine Learning Methods for Predicting Black Oil PVT Properties
Correct reservoir fluid PVT properties, such as saturation pressures, volumetric factors and solubility, are crucial for all kinds of black oil reservoir calculations (material balance, oil production forecast, etc.), where a relatively small error in those properties can lead to a considerable error in the development of the reservoir model, the planning of future operations, etc., and subsequently to inferior prediction performance. Although there are readily available empirical correlations for the determination of those properties [60], they are usually not accurate enough, imposing the need for ML model development instead.
The most important volumetric parameter of dry gases and condensates is the gas compressibility factor (Z-factor), a property needed for Bg estimations and, through them, for flow and volumetric calculations between reservoir and surface conditions. Most of the time, the Z-factor can be easily determined using empirical correlations fitted on the classic Standing-Katz (S-K) chart. These correlations are not always accurate enough, or even valid, as they have been generated based on specific pressure and temperature conditions and can sometimes produce poor results when used outside of that range. Additionally, low accuracy estimates can be obtained when “unusual” compositions are considered, as is the case with acid or polar components.
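The link between the Z-factor and Bg follows from the real gas law applied at reservoir and standard conditions, Bg = (p_sc / T_sc) Z T / p, with Z taken as 1 at standard conditions. The sketch below is a minimal illustration; the function name and the default standard conditions (14.696 psia and 519.67 °R) are assumptions and should be replaced by whatever standard conditions apply.

```python
def gas_fvf(z, t_res, p_res, t_sc=519.67, p_sc=14.696):
    """Gas formation volume factor, reservoir volume per standard volume.

    z: gas compressibility factor at reservoir conditions.
    t_res, t_sc: absolute temperatures in consistent units (here degrees R).
    p_res, p_sc: absolute pressures in consistent units (here psia).
    Z at standard conditions is assumed equal to 1.
    """
    if p_res <= 0.0 or t_res <= 0.0:
        raise ValueError("absolute pressure and temperature must be positive")
    return (p_sc / t_sc) * z * t_res / p_res
```

Any error in Z carries over proportionally to Bg, which is why the accuracy of the Z-factor model matters for volumetric calculations.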
Various recent studies have made use of ML methods to predict the Z-factor from the S-K chart. Moghadassi et al. [61] developed ANN models to predict the Z-factor for pure gases using reduced temperature and pressure as input, thus replacing the hand-fitted models by Beggs & Brill [62] and Dranchuk & Abou-Kassem (DAK) [63]. The authors compared various back-propagation training algorithms, namely Scaled Conjugate Gradient (SCG), Levenberg–Marquardt (LM) and Resilient Back Propagation (RBP), with LM providing the best results. Similarly, Kamyab et al. [64] built an ANN for the estimation of the Z-factor of natural gas by utilizing a training data set directly digitized from the S-K chart. The results showed that the trained ANN required less computational effort, was more precise than the iterative DAK algorithm, and can be used for the whole pressure and temperature range of the S-K chart. Moving in a slightly different direction, Sanjari and Lay [65] built an ANN to calculate the Z-factor which, however, was trained against experimental Z-factor values rather than ones extracted from the S-K chart. The efficiency and the accuracy of the proposed ANN were compared to the most well-known empirical correlations and to the Peng & Robinson EoS. The results showed that the model is more accurate than the other methods. Furthermore, Irene et al. [66] and Al-Anazi et al. [67] developed ANN models to estimate the Z-factor using PVT data points extracted from the available literature. The authors performed quantitative and qualitative evaluations to examine the models’ efficiency and overall accuracy. The results showed that the developed models were consistent with experimental data upon which they were not trained, thereby verifying their generalization capability, and that the models are more accurate than numerous EoS and correlations.
Mohamadi et al. [68] developed a similar approach using experimental PVT data sets of gas condensates, but this time the authors developed three ML models, namely an ANN, a Fuzzy Inference System (FIS) and an Adaptive Neuro-Fuzzy Inference System (ANFIS). The trained models were shown to perform considerably better than the available empirical correlations, with the ANN outperforming the other models. Two more research groups, Fayazi et al. [69] and Kamari [70], built SVMs to predict the Z-factor of rich gases by training their models with experimental data corresponding to a wide variety of compositions, including sour gases. The former approach utilized Least Squares Support Vector Machines (LSSVMs) together with the Coupled Simulated Annealing (CSA) optimization algorithm, and the Z-factor was predicted as a function of gas composition, Molecular Weight (MW) of the heavy components, and pressure and temperature values. The LSSVM method [71] is an advancement of the SVM one, in the sense that the solution can be found more easily by solving a set of linear equations instead of the convex quadratic programming problem associated with classic SVMs. The results of both groups showed that the ML models were more efficient and precise than the empirical correlations. Chamkalani et al. [72] used Particle Swarm Optimization (PSO) [73] and Genetic Algorithms (GA) to optimize the weights and biases of an ANN by minimizing the network’s error function against data derived from the S-K chart, so as to avoid getting trapped in a local minimum. The developed model already presented high efficiency and precision compared to empirical correlations; when the optimization methods were used, the performance was enhanced significantly, with the PSO-ANN outperforming all of the other models in terms of both accuracy and computational time.
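For reference, the set of linear equations that replaces the quadratic programming problem can be written out for the standard LSSVM regression formulation. The notation below is introduced here and is not taken from the cited studies: K is the kernel matrix with entries K_ij = k(x_i, x_j), γ is the regularization parameter, y collects the training targets, and α and b are the unknowns.

```latex
\begin{bmatrix}
0 & \mathbf{1}^{\top} \\
\mathbf{1} & \mathbf{K} + \gamma^{-1}\mathbf{I}
\end{bmatrix}
\begin{bmatrix}
b \\ \boldsymbol{\alpha}
\end{bmatrix}
=
\begin{bmatrix}
0 \\ \mathbf{y}
\end{bmatrix},
\qquad
f(\mathbf{x}) = \sum_{i=1}^{N} \alpha_i \, k(\mathbf{x}, \mathbf{x}_i) + b
```

Training therefore amounts to one linear solve of size N + 1, while hyperparameters such as γ and the kernel width still have to be tuned, which is the role played by CSA in the studies above.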
Although the above methods are considered quite an improvement for the Z-factor calculation, almost all of them are suitable only for limited pressure ranges. Some of them exhibit an oscillating behavior that is attributed to the fact that the models are driven by the available data, thus leading to unrealistic derivatives of the Z-factor, which in turn cannot be mapped to normal fluid compressibility values. Gaganis et al. [74] developed a hybrid ML model using the Kernel Ridge Regression (KRR) method, and more specifically the truncated regularized KRR algorithm [75], together with a linear-quadratic interpolation method to predict the Z-factor, overcoming the disadvantages of the above techniques. The model is generated using a data set digitized from the S-K chart. The results presented smooth (in the sense of Z-factor derivative continuity) and physically sound predictions of the Z-factor, while also achieving great accuracy. The novelty of this approach is that it can be straightforwardly used to determine the Z-factor for hydrocarbon mixtures of any composition, even when impurities are present, and at any possible pressure and temperature reservoir conditions. The model can be considered an excellent tool for estimating gas density in many reservoir simulation applications to reduce the computational time required, such as the estimation of reserves and the simulation of fluid flow inside the reservoir and the wellbore, the surface pipeline system and processing equipment, etc. The proposed methodology is, however, only applicable to compositions similar to those the S-K chart was created for, and it might present significant errors when used for mixtures with significant amounts of non-hydrocarbon and/or polar compounds.
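The sensitivity of gas compressibility to the Z-factor derivative can be seen from the standard relation c_g = 1/p - (1/Z)(dZ/dp) at constant temperature. The sketch below evaluates it with a central finite difference for any Z-factor model passed in as a callable; the function name, the step size and the toy Z models are assumptions for illustration.

```python
def gas_compressibility(z_of_p, p, dp=1.0):
    """Isothermal gas compressibility c_g = 1/p - (1/Z) dZ/dp.

    z_of_p: callable returning the Z-factor at pressure p (fixed temperature).
    dp: pressure step for the central finite difference (assumed value).
    """
    z = z_of_p(p)
    dz_dp = (z_of_p(p + dp) - z_of_p(p - dp)) / (2.0 * dp)
    return 1.0 / p - dz_dp / z


# Toy Z models: an ideal gas and a linearly decreasing Z-factor.
cg_ideal = gas_compressibility(lambda p: 1.0, 1000.0)
cg_linear = gas_compressibility(lambda p: 1.0 - 1.0e-5 * p, 1000.0)
```

A Z-factor model that oscillates between data points can make dZ/dp large enough for c_g to become erratic or even negative, which is the unphysical behavior that smooth predictions are meant to avoid.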
Apart from natural gases, many hydrocarbon reservoirs around the world contain a considerable amount of acid components, usually as a mixture of light hydrocarbons, H2S and CO2, known as sour gases. Engineers should be able to obtain accurate thermodynamic information on those gases to successfully conduct techno-economical evaluations and make predictions about future production. Furthermore, due to the economically unattractive sulphur market price and the increasingly strict air emission standards set by regulatory authorities, many oil and gas operators are in search of environment-friendly and cost-effective methods for dealing with that kind of gases, such as acid gas re-injection for EOR or sequestration purposes, where extensive thermodynamic knowledge of the associated fluids and their interactions is needed [76]. Considering the above, Kamari et al. [77] developed an LSSVM model, coupled with the CSA optimization method, to predict the Z-factor of natural and sour gases, as well as of pure acid substances. Due to the shortage of experimental studies on sour gases, the authors used pseudoreduced pressure and temperature values from the literature as input to the model and performed a comparative study with several empirical correlations and EoS models to validate the performance of their proposed approach. The results showed that their model is significantly more reliable and efficient than the available correlations and EoS for estimating the compressibility factor of sour and natural gases.
Saturation pressure (bubble/dew point pressure) is another extremely important fluid property for accurate black oil reservoir simulations, since it marks the distinction between the single-phase and multiphase states, thus providing a phase stability indication. Two kinds of methods for estimating the saturation pressure can be identified. The first is through experimental procedures using laboratory samples (e.g., Constant Volume Depletion, CVD), which are highly expensive and time-consuming. The second concerns the use of empirical correlations or an iterative procedure based on an EoS. Although an EoS is effective for classic hydrocarbon systems without many impurities, fitting it to efficiently predict the phase behavior of complicated systems (e.g., volatile oils, gas condensates, oils with too many impurities, etc.) is not a trivial task. Furthermore, most of the correlations existing in the literature and appearing in commercial software, although very accurate for the range of parameters they were tuned against, exhibit poor performance outside these bounds.
Researchers have tried to devise fast and efficient ways to predict those values using ML-based methods. Seifi et al. [78] developed a feed-forward multilayer ANN model, trained with fluid properties (e.g., composition), to predict reliable initial values for the saturation pressure of given mixtures that would decrease the total time required by the iterative calculations. Gharbi et al. [79] built ANN models to directly predict saturation pressure and Bo using real field crude oil data (i.e., GOR, gas and oil specific gravity and temperature). Similar models were developed by Al-Marhoun et al. [80] and Moghadam et al. [81], although each research group utilized different real field input parameters. Their results showed that the proposed approach presents a significantly higher accuracy than Al-Marhoun’s previous correlations [82,83], which were developed for the same crudes. As a general conclusion, all the above ANN models provide quality predictions, significantly improving on the accuracy of the most commonly used, hand-developed correlations (e.g., Standing, Vasquez and Beggs, Al-Marhoun, etc.) [84].
Rather than ANNs, Farasat et al. [85] developed an SVM model to predict the saturation pressure using reservoir temperature, hydrocarbon and impurity compositions, and MW and specific gravity of the heavy fraction. El-Sebakhy et al. [86] developed SVMs to predict saturation pressures and Bo using PVT data obtained from the literature, such as reservoir temperature, oil and gas gravity and solution GOR. The results of both studies demonstrated that the proposed models are significantly more precise than most well-known correlations.
For gas condensate reservoirs, the accurate prediction and constant monitoring of the dew point pressure is very important for many engineering calculations, especially for the prediction of future production and for the design of operations where liquid condensation should be avoided. Numerous ML methods have been proposed to predict the dew point pressure, such as those by Akbari et al. [87] and Nowroozi et al. [88], who developed ANN and ANFIS models, respectively, to predict the dew point pressure of gas condensate systems using compositional and thermodynamic parameters. Similarly, Keydani et al. [89] generated a conventional back-propagation ANN to estimate the dew point of lean retrograde gas condensates using experimentally obtained PVT data (e.g., reservoir temperature, mole fractions of volatile and intermediate gases, etc.). Gonzales et al. [90] used an ANN model to estimate the dew point in retrograde gas reservoirs using experimental CVD data (gas composition, MW, specific gravity of the heavy fraction, reservoir temperature). Their results showed that the proposed model was more efficient than straight-run or mildly tuned Peng-Robinson EoS models, as well as other empirical correlations. Similarly, Majidi et al. [91] developed an ANN model to estimate the dew point pressure in gas condensate reservoirs using a set of experimental data, such as compositional analysis up to C7+ and concentration of impurities (N2, CO2, H2S), reservoir temperature, and C7+ specific gravity and MW. The results showed that the proposed approach is more efficient than all existing methods thanks to the enhanced, more informative input. Furthermore, the proposed model can predict the physical trend of the dew point pressure-temperature curve between the cricondenbar and the cricondentherm on the phase envelope.
The continuous improvement and the emergence of new, high-end ML technologies have led researchers to utilize them in the fluid properties domain as well. Rabiei et al. [92] developed a Multi-Layer Perceptron (MLP)-GA model to estimate the dew point pressure using reservoir temperature, mole percentage of gas components and heavy fraction properties, whereas Ahmadi et al. [93] developed a coupled ANN-PSO model to estimate the dew point for gas condensate reservoirs using compositional and thermodynamic parameters. Ahmadi et al. [94] devised an LSSVM approach, as developed by Suykens et al. [71], coupled with a GA to determine the dew point pressure in condensate gas reservoirs. For comparison reasons, a classic feed-forward ANN was also developed and, according to the results, the proposed LSSVM model exhibited superior performance. Arabloo et al. [95] developed LSSVMs to estimate the dew point pressure for retrograde gas reservoirs, coupled with the CSA optimization algorithm for the model’s hyperparameters. The authors used the same experimental data as Majidi et al. [91] to form the model’s input, thus arriving at a new approach which is more efficient than all existing methods. Furthermore, the LSSVM-CSA model can predict the physical trend of the dew point pressure against temperature for a constant composition fluid, forming a part of the phase envelope.
Along a similar line, Ikpeka et al. [96] built ML models, namely MLPs, SVMs and DTs (Gradient Boost Method (GBM) and XGB), to predict the dew point pressure for gas condensates using fluid composition, specific gravity, MW of the heavier component and compressibility factor as input. A classic multiple linear regression model was developed as a baseline for comparing the efficiency of the proposed models. The SVM model outperformed the other models; however, for large, complicated data sets more support vectors are needed for the same accuracy level, thus resulting in extended computational time. Zhong et al. [97] developed an SVM model, utilizing a mixture of kernel functions, coupled with a PSO algorithm to predict the dew point pressure. The authors used real compositional and thermodynamic data as input, the same as those used by Majidi et al. [91] and Arabloo et al. [95], and they arrived at a model that is more efficient than all of the well-known empirical correlations, with enhanced generalization ability.