1. Introduction
Oil production plays a pivotal role in national economic progression, energy assurance, and global geopolitical equilibrium [1]. As the energy demand continues to rise, the industry has turned to unconventional reservoirs, particularly those with low-permeability and low-porosity formations. Recent advancements in hydraulic fracturing have introduced innovative methodologies to optimize the exploration and production of these challenging reservoirs. An integrated geological-engineering design, such as volume fracturing with a fan-shaped well pattern, has been proposed to enhance hydrocarbon recovery by optimizing well placement [2]. This approach, backed by in-depth studies into the dynamics of hydraulic fractures in multi-layered formations, offers insights into fracture propagation and its subsequent impact on reservoir performance [3].
However, with the introduction of hydraulic fracturing and other advanced extraction techniques, predicting post-fracture oil production capacity becomes an intricate endeavor fraught with multifaceted challenges [4]. Accurate post-fracture production prediction is paramount in enhancing well operations, mitigating operational hiatuses, and ensuring a higher return on petroleum investment. The integration of machine learning with advanced numerical techniques, such as the 3-D lattice method, has emerged as a promising solution for understanding fracture dynamics in various shale fabric facies, as evidenced by a case study from the Bohai Bay basin, China [5]. Moreover, the advent of physics-informed feasibility evaluation methods emphasizes the need to comprehend the intrinsic properties of shale reservoirs for efficient production [6]. When combined with sustainable extraction techniques, accurate production prediction can lead to more effective oil development, minimizing environmental degradation [7]. This approach provides a pathway toward a more sustainable and environmentally friendly energy economy. Nonetheless, predicting oil production capacity remains an intricate, multifaceted task [8,9,10].
The oil and gas industry typically employs several methods for production prediction. One prevalent strategy involves forecasting oil production metrics for a specific well by extrapolating anticipated oil yield. This extrapolation leans heavily on historical production metrics and pertinent reservoir data [8,11,12]. Another strategy is based on the production potential of adjacent wells [13,14], representing the extant geological and geophysical data on the target reservoir. This approach evaluates the economic viability of new drillings and locates the sweet spots within the reservoir's geological and engineering landscape [15,16].
Reservoir simulation is a widely used technique that employs physics-based models to predict oil production under different scenarios. However, the simulation requires extensive reservoir geological and engineering data, which may not always be available. A deep understanding of the geology and engineering aspects of the complex reservoir is also challenging for people without professional knowledge [17]. Moreover, the assumptions made in the simulation model can impact the predictive accuracy. Reservoirs with low permeability and porosity are distinct due to their constrained pore spaces and fluid flow capabilities [18,19,20,21]. This nature inherently complicates numerical computations for production rates [22]. The reduced permeability demands a more detailed grid division, amplifying computational intricacy and extending calculation durations [23]. Such environments can also generate numerical dispersion, causing potential mismatches between simulated outcomes and actual field data [24]. Additionally, fluid flow in these settings might not always adhere to Darcy's law, requiring more nuanced modeling approaches [23]. The limited permeability further complicates the flow dynamics of oil, gas, and water, and obtaining precise rock and fluid properties becomes a challenge, introducing more significant model uncertainties [25]. Therefore, numerical calculations for these reservoirs demand a particularly cautious and accurate approach [26].
Machine learning has recently gained popularity in production prediction [13,14]. For example, an estimated ultimate recovery (EUR) prediction model for shale gas wells has been established based on multiple linear regression, with the key factors controlling productivity analyzed using the Pearson correlation coefficient and the maximum mutual information coefficient [9]. In contrast, the EUR prediction model for fractured horizontal wells in tight oil reservoirs employs a deep neural network (DNN) and demonstrates significantly higher prediction accuracy than the traditional multiple linear regression model [10].
The present study provides a novel method to predict each sample's production at different depths, exclusively using well logging parameters. This approach is applied to find the relationship between rock physics, geomechanics, and production performance. Specifically, our research focuses on forecasting the post-fracture yield of developed wells, in which fracturing has already induced multiple fractures that facilitate the migration of hydrocarbons into production conduits. By harnessing well logging parameters, we focus on predicting the post-fracture yield of the designated stratum. Given that each well has a unique productivity, collecting productivity data for hundreds of wells can be challenging for most researchers. To address this, we propose a method of segmenting productivity: the gas production during well testing on the first day is divided based on the formation coefficient, which is the product of permeability, gas saturation, and porosity. Building on the foundations laid by previous studies, this study incorporates machine learning techniques to predict hydrocarbon production potential [27,28].
Utilizing the scikit-learn framework, we employ pipelines with GridSearchCV for production prediction. It is paramount to emphasize that a single machine learning algorithm might not yield the optimal training model. Therefore, pipelines are adopted to simplify the intricate workflow, consolidating data preprocessing, feature selection and transformation, and algorithmic modeling. This method not only facilitates code development for researchers but also prevents data leakage, ensuring the integrity of the modeling process [29,30,31]. Subsequently, the integration of GridSearchCV provides a robust framework for hyperparameter optimization within the pipeline, significantly enhancing the workflow's efficiency [32,33,34,35]. Integrating pipelines and GridSearchCV can simplify the modeling process, making it possible to identify the most favorable production sweet spots by utilizing multi-class parameters (reservoir properties, rock mechanics parameters, well logging parameters, and other oilfield data).
This paper is structured as follows: In Section 2, we outline the technical path for production prediction, providing details on the process of pipeline establishment and the techniques used for model evaluation. Section 3 introduces our proposed production subdivision algorithm, presenting both the actual logging parameters and the subsequent data distribution for a specified block. In Section 4, we present and discuss the comparative analysis between predicted outcomes and actual results from the nine pipelines, all using consistent data preprocessing and feature extraction methods. Additionally, we delve into an analysis of potential correlations between logging parameters and productivity. Finally, Section 5 wraps up our findings, offering key conclusions and recommendations geared towards enhancing the forecasting efficiency for sustainable capacity development.
2. Material and Method
The overall technical approach for predicting post-fracturing production involves ten steps, as shown in Figure 1.
2.1. Data collection and processing
Two data types need to be collected: the well logging data covering the depth range of perforated wells that have undergone hydraulic fracturing, and the gas production during well testing on the first day. Since the sample size provided by the perforated segments alone is too small for reliable and stable production prediction, the first-day test production is apportioned to each logging sample through 'production segmentation'.
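As an illustration of this segmentation step, the sketch below distributes one well's first-day test production over its logging samples in proportion to the formation coefficient described in the Introduction (permeability × gas saturation × porosity). The exact formula used in the study is given in Section 3; the column names and the assumption that gas saturation equals one minus water saturation are illustrative only.

```python
import pandas as pd

def segment_production(logs: pd.DataFrame, first_day_production: float) -> pd.Series:
    """Distribute a well's first-day test gas production over its logging samples.

    `logs` is assumed to hold one row per logging sample of the perforated
    interval, with 'permeability', 'porosity', and 'water_saturation' columns.
    """
    gas_saturation = 1.0 - logs["water_saturation"]           # assumed Sg = 1 - Sw
    coeff = logs["permeability"] * gas_saturation * logs["porosity"]
    weights = coeff / coeff.sum()                             # formation-coefficient weights
    return first_day_production * weights                     # per-sample production, m3/d
```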
2.2. Sample preparation
The data collected from the fractured wells are organized into samples for machine learning-based production prediction. The features of each sample comprise nine logging parameters (depth, P-wave velocity, density, natural gamma, neutron porosity, resistivity, porosity, permeability, and water saturation).
2.3. Data Splitting
To ensure that the machine learning model generalizes to unseen data, the collected data are divided into a training set and a testing set. As is common practice, 80% of the data is used for training the model, and the remaining 20% is reserved for final testing to assess the model's performance. The basic process, including data collection, sample preparation, and data splitting, is shown in Figure 2.
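A minimal sketch of this split, assuming the samples sit in a table holding the nine logging features and the segmented production as the target (file and column names are illustrative, not the study's actual dataset):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

LOG_FEATURES = ["depth", "p_wave_velocity", "density", "natural_gamma", "neutron_porosity",
                "resistivity", "porosity", "permeability", "water_saturation"]

samples = pd.read_csv("well_log_samples.csv")        # hypothetical sample file
X = samples[LOG_FEATURES]
y = samples["segmented_production"]                  # target: per-sample production, m3/d

# 80% of the samples for training, 20% held out for the final generalization test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```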
2.4. Data Scaling and feature extraction
Effective data preprocessing involves handling variations in data scales and units. This is usually performed through data standardization and normalization. Our study incorporates two scaling techniques: StandardScaler, which normalizes data to have a mean of zero and a standard deviation of one, and RobustScaler, which uses the median and interquartile range, making it less prone to outliers.
For the StandardScaler, it follows the function [36]:

$$z = \frac{x - \mu}{\sigma}$$

where $x$ is the original dataset, $\mu$ represents the mean of the dataset, $\sigma$ denotes the standard deviation of the dataset, and $z$ is the processed dataset.
For the RobustScaler, it follows the function [37]:

$$v' = \frac{v - \mathrm{median}}{\mathrm{IQR}}$$

where $v'$ is the processed value, $v$ is a value from the original dataset, median represents the median of the dataset, and IQR denotes the interquartile range of the dataset.
We employ Principal Component Analysis (PCA) for feature extraction to reduce the data dimension while preserving critical information. PolynomialFeatures is also used to generate polynomial combinations of features. We comparatively analyze three preprocessing combinations (StandardScaler with PCA, RobustScaler with PCA, and StandardScaler with PolynomialFeatures), aiming to discern the optimal preprocessing strategy for our dataset.
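The three preprocessing combinations can be expressed as small scikit-learn pipelines, as sketched below; the number of PCA components and the polynomial degree are placeholders rather than the settings tuned in this study.

```python
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, RobustScaler, StandardScaler

preprocessors = {
    # StandardScaler followed by PCA
    "S_PCA": Pipeline([("scale", StandardScaler()), ("pca", PCA(n_components=5))]),
    # RobustScaler followed by PCA
    "R_PCA": Pipeline([("scale", RobustScaler()), ("pca", PCA(n_components=5))]),
    # StandardScaler followed by PolynomialFeatures
    "S_POLY": Pipeline([("scale", StandardScaler()),
                        ("poly", PolynomialFeatures(degree=2, include_bias=False))]),
}
```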
2.5. Machine Learning Algorithm
We employ a combined approach of three robust models: XGBoost, Random Forest, and Neural Networks. XGBoost is a sequential model that iteratively corrects errors from previous predictions, thus enhancing the predictive accuracy. Random Forest, containing multiple decision trees, excels in handling intricate datasets, offers resistance to overfitting, and highlights essential features. Neural Networks are adept at deciphering complex patterns using interconnected layers of neurons, making them suitable for high-dimensional datasets. Our ensemble approach taps into the distinct strengths of each model, ensuring a comprehensive and efficient prediction mechanism.
2.5.1. XGBoost
XGBoost is an ensemble gradient boosting algorithm abbreviated from eXtreme Gradient Boosting. The method has been proven effective in handling geophysical data, especially with small sample sizes. We use XGBoost to build a regression model for the production prediction in this paper. The algorithm begins by training a weak learner, typically a decision tree. This tree starts with a root node encompassing all training data, then optimally splits the dataset based on features. The dataset partitioning continues until the data is correctly classified. Successive learners iteratively refine prediction errors from their predecessors using gradient descent on the loss function's second-order Taylor expansion. The final model aggregates these weak learners' results. For in-depth mathematical insights, we have provided a brief introduction based on Chen et al. [38].
Given a dataset $\mathcal{D} = \{(\mathbf{x}_i, y_i)\}$ with $n$ samples and $m$ features, the tree ensemble model in XGBoost is represented by a set of $K$ additive functions $f_k \in \mathcal{F}$ from the input space to the output space, where $\mathcal{F} = \{f(\mathbf{x}) = w_{q(\mathbf{x})}\}$. Each $f_k$ corresponds to a tree structure $q$ that maps an instance to the corresponding leaf index, yielding a predictive value from the set of leaf weights $w \in \mathbb{R}^{T}$. Specifically, an instance $\mathbf{x}_i$ is mapped as:

$$\hat{y}_i = \sum_{k=1}^{K} f_k(\mathbf{x}_i), \quad f_k \in \mathcal{F}$$

The objective function to be minimized during the training of XGBoost is given by the sum of a differentiable convex loss function $l$ and a regularization term $\Omega$. Formally:

$$\mathcal{L} = \sum_{i} l(\hat{y}_i, y_i) + \sum_{k} \Omega(f_k)$$

For the loss function, a second-order Taylor expansion provides an approximation at boosting step $t$:

$$\mathcal{L}^{(t)} \simeq \sum_{i=1}^{n} \left[ l\!\left(y_i, \hat{y}_i^{(t-1)}\right) + g_i f_t(\mathbf{x}_i) + \tfrac{1}{2} h_i f_t^{2}(\mathbf{x}_i) \right] + \Omega(f_t)$$

where $g_i$ and $h_i$ represent the first and second-order gradients of the loss, respectively.

The regularization term, which penalizes the complexity of the tree ensemble, is defined as:

$$\Omega(f) = \gamma T + \tfrac{1}{2} \lambda \lVert w \rVert^{2}$$

where $T$ denotes the number of leaves in the tree and $\lambda$ is the L2 regularization term on the leaf weights.
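A hedged sketch of how such a regressor can be instantiated with the xgboost library; the hyperparameter values are placeholders rather than the tuned settings reported later, and `X_train`, `y_train`, and `X_test` refer to the split sketched in Section 2.3.

```python
from xgboost import XGBRegressor

xgb_model = XGBRegressor(
    n_estimators=300,               # number of boosted trees (weak learners)
    max_depth=4,                    # depth of each tree
    learning_rate=0.05,             # shrinkage applied at each boosting step
    reg_lambda=1.0,                 # L2 penalty on leaf weights (lambda in Omega(f))
    objective="reg:squarederror",   # squared-error loss for regression
)
xgb_model.fit(X_train, y_train)     # X_train, y_train from the Section 2.3 sketch
production_pred = xgb_model.predict(X_test)
```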
2.5.2. Random Forest
The Random Forest algorithm, introduced by Leo Breiman in 2001, is a robust machine learning technique that leverages the power of multiple decision trees for making predictions [39]. This ensemble method operates by constructing a multitude of decision trees at training time and outputs the class (classification) or mean prediction (regression) of the individual trees. Random forests correct the tendency of individual decision trees to overfit the training set. The algorithm's name comes from the fact that it introduces randomness into the tree building to ensure each tree is different. This randomness originates from two main aspects. Firstly, whenever a tree split is considered, a random sample of 'm' predictors is selected from the full set of 'p' predictors. Secondly, each tree is constructed using a bootstrap sample instead of the entire original dataset.
The application of Random Forest in petroleum exploration has been gaining traction in recent years. For instance, Roubícková et al. [40] utilized Random Forest to reduce ensembles of geological models for oil and gas exploration. They found that the Random Forest algorithm effectively identifies the critical grouping of models based on the most essential feature. This work highlights the potential of Random Forest in handling large datasets and complex geological models.
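A minimal RandomForestRegressor sketch reflecting the two sources of randomness described above (a random subset of predictors at each split and a bootstrap sample per tree); the values shown are placeholders, not the tuned settings.

```python
from sklearn.ensemble import RandomForestRegressor

rf_model = RandomForestRegressor(
    n_estimators=200,      # number of trees in the forest
    max_features="sqrt",   # random sample of m predictors considered at each split
    bootstrap=True,        # each tree is grown on a bootstrap sample of the data
    random_state=42,
)
```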
2.5.3. Neural network models (MLP Regressor)
The Multi-layer Perceptron (MLP), a foundational element in neural network architectures, has witnessed substantial advancements across various applications due to rigorous research efforts. Recent studies [41,42] highlight the adaptability of MLPs in varied tasks, from phishing URL detection to pinpointing functionals' extrema. Newer innovations, such as the MC-MLP [43], have broadened the scope of MLP applications, particularly in computer vision. Moreover, research into heterogeneous multilayer networks [44] and the exploration of the universality of equivariant MLPs [45] have further deepened the comprehension and application potential of MLPs across multiple sectors. This study establishes nine distinct pipelines by integrating the methodologies from Sections 2.4 and 2.5. A flow chart detailing these pipelines is provided in Table 1.
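A sketch of how the nine pipelines can be assembled by pairing each preprocessing combination from Section 2.4 with each regressor from Section 2.5. The `preprocessors`, `xgb_model`, and `rf_model` objects come from the earlier sketches; the MLP layer sizes and the composed pipeline names are assumptions for illustration.

```python
from sklearn.base import clone
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import Pipeline

models = {
    "XGB": xgb_model,                                             # Section 2.5.1 sketch
    "RF": rf_model,                                               # Section 2.5.2 sketch
    "NN": MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000),
}

# 3 preprocessing combinations x 3 regressors = 9 pipelines; clone() gives each
# pipeline its own independent copy of the preprocessor and the regressor.
pipelines = {
    f"{prep_name}-{model_name}": Pipeline([("prep", clone(prep)), ("model", clone(model))])
    for prep_name, prep in preprocessors.items()
    for model_name, model in models.items()
}
```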
2.6. GridSearchCV
To enhance code maintainability, we have developed nine distinct pipelines, each representing a combination of one preprocessing technique with one machine learning algorithm. This systematic exploration aims to identify the most effective pairing for optimal predictive performance. Implementing this structured pipeline approach not only streamlines the process but also effectively mitigates data leakage, a prevalent issue in machine learning model development. To ascertain the best configurations for our algorithms, the GridSearchCV method is employed. GridSearchCV combines an exhaustive grid search over parameter settings with cross-validation: it rigorously scans numerous parameter configurations while cross-validating each one, ensuring the identification of the most performative configuration. Through this rigorous approach, we can harness the full predictive capacity of the ensembled models and reach optimal accuracy and reliability. Table 2 presents the parameters under consideration for tuning.
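A hedged sketch of the search for one of the nine pipelines; the parameter grid below is illustrative only and does not reproduce the actual search space of Table 2.

```python
from sklearn.model_selection import GridSearchCV

param_grid = {
    "prep__pca__n_components": [3, 5, 7],        # nested step of the preprocessing pipeline
    "model__n_estimators": [100, 300, 500],
    "model__max_depth": [3, 4, 6],
}

search = GridSearchCV(
    pipelines["S_PCA-XGB"],                      # one preprocessing/regressor pairing
    param_grid,
    cv=5,                                        # five-fold cross-validation
    scoring="neg_root_mean_squared_error",
    n_jobs=-1,
)
search.fit(X_train, y_train)                     # scaling/PCA are re-fit inside each fold
print(search.best_params_, search.best_score_)
```

Because the preprocessing steps live inside the pipeline, they are re-fit on the training folds only during cross-validation, which is what prevents the data leakage discussed above.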
2.7. Evaluation Criteria
The performance of the model is assessed using appropriate evaluation metrics. For regression tasks such as production prediction, common metrics, including Standard Deviation (SD), Root Mean Squared Error (RMSE), and R-squared (R²) score, are used:

$$\sigma_{\hat{y}} = \sqrt{\frac{1}{n} \sum_{k=1}^{n} \left( \hat{y}_k - \bar{\hat{y}} \right)^{2}}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{k=1}^{n} \left( y_k - \hat{y}_k \right)^{2}}$$

$$R^{2} = \left[ \frac{\sum_{k=1}^{n} \left( y_k - \bar{y} \right)\left( \hat{y}_k - \bar{\hat{y}} \right)}{n\, \sigma_{y}\, \sigma_{\hat{y}}} \right]^{2}$$

where $y_k$ is the $k$th real production, in m³/d; $\hat{y}_k$ is the $k$th predicted production, in m³/d; $\bar{y}$ is the mean real production of all wells, in m³/d; $\bar{\hat{y}}$ is the mean predicted production of all wells, in m³/d; $\sigma_{y}$ is the standard deviation of real production, in m³/d; $\sigma_{\hat{y}}$ is the standard deviation of the pipelines' predicted production, in m³/d; and $n$ is the total count of specimens.
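Minimal NumPy implementations of the three metrics as defined above; here R² is taken as the squared Pearson correlation between real and predicted production, consistent with the Taylor-diagram analysis in Section 2.9, which is an assumption about the study's exact formula.

```python
import numpy as np

def standard_deviation(values):
    """Population standard deviation, matching the SD formula above."""
    return float(np.std(np.asarray(values)))

def rmse(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r_squared(y_true, y_pred):
    # squared Pearson correlation between real and predicted production
    r = np.corrcoef(np.asarray(y_true), np.asarray(y_pred))[0, 1]
    return float(r ** 2)
```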
2.8. Optimal pipeline
In this study, we comprehensively evaluated nine distinct processing pipelines. Leveraging a suite of benchmark evaluation metrics, we systematically compared their performance to discern each pipeline's relative strengths and weaknesses. The analysis and empirical comparisons have enabled us to ascertain the most effective pipeline. The results from the comparative study can offer valuable insights into the optimal pipeline configuration for this particular context and contribute to a broader understanding of pipeline performance in the field.
2.9. Generalization Ability Verification
After the preliminary analyses, the remaining 20% of the dataset is used for production prediction. To depict the precision of the model's forecasting outcomes, we employ Taylor diagrams, an illustrative tool renowned for representing the R², SD, and RMSE of model predictions. Readers can effortlessly discern the veracity of the model's predictions through this visual representation and assess the degree of uncertainty associated with stochastic forecasting outcomes.
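A short sketch of the three statistics a Taylor diagram encodes for each pipeline (correlation with the observations, standard deviation of the predictions, and centered RMSE); the plotting itself is left to a dedicated Taylor-diagram utility.

```python
import numpy as np

def taylor_statistics(y_true, y_pred):
    """Return (correlation, SD of predictions, centered RMSE) for one pipeline."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    correlation = np.corrcoef(y_true, y_pred)[0, 1]
    sd_pred = y_pred.std()
    centered_rmse = np.sqrt(
        np.mean(((y_pred - y_pred.mean()) - (y_true - y_true.mean())) ** 2))
    return correlation, sd_pred, centered_rmse
```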
2.10. Result Demonstration
Finally, the findings are elucidated through visual representations. The comparison of predicted outcomes against real results is undertaken for 33 wells, enabling a clear delineation of model accuracy. At the same time, the performance metrics are chosen to provide a granular understanding of the model's predictive capacity. The comprehensive analysis of training results facilitates comprehension of our predictive framework's inherent attributes and highlights potential ways for refinement. The study can provide a complete understanding of the model's utility, illuminating its merits and areas demanding further exploration.
4. Results and Discussions
4.1. Model Performance Analysis
Optimal model parameters are determined utilizing GridSearchCV. Initially, 80% of the dataset is randomly allocated for training. Within this subset, a comprehensive set of 5,130 parameter combinations is established across the nine pipelines. GridSearchCV implements a five-fold cross-validation strategy. This method entails training the model on 80% of the subset during each iteration (four of the five folds) and validating on the remaining 20% (one fold). This procedure is iterated five times so that every sample is used for validation once, with the averaged validation outcome serving as the performance metric. In total, 25,650 models (5,130 combinations × 5 folds) undergo training to ascertain the optimal configuration.
Table 3 presents the parameter optimization results for each pipeline.
Figure 5 shows the prediction results on the validation data. The horizontal axis represents the real production calculated by Eq. 11, and the vertical axis represents the predicted production. If the predicted production of a sample equals the real production, the blue dot falls on the black diagonal line. The predictions for high-productivity samples tend to underestimate the real values, but most samples lie close to the diagonal, indicating a good correspondence between predicted and real production. Therefore, the adopted models are considered to have strong predictive ability.
Data normalization is executed within the processing pipeline and does not perturb the intrinsic range of the predictive outcomes. Specifically, the production capability in the designated operational zone is expected to span 0 to 2000 m³/d. The formulas delineated in Section 2.7 are employed to compare the actual production capacity against the forecasted values, and the outcomes are illustrated in Figure 6. The analysis reveals that the standard deviation of the baseline data stands at 192.2. Notably, every pipeline, post-training, manifests a standard deviation smaller than that of the original dataset. This suggests that the variance within the trained data across all pipelines is less pronounced than that observed in the raw data. Given the significant deviation inherent in the dataset, the RMSE values for all pipelines exceed those from previously reported data; however, they remain substantially below the observed standard deviation. This context suggests that the model predictions operate within an acceptable margin. In subsequent sections, the validation against the remaining 20% of the data across 33 wells is elucidated.
Table 4 shows the evaluation indices for all pipelines.
4.2. Comparative Analysis of PS-XGB vs. PS-RF vs. PS-NN
The remaining 20% of the dataset is used for forecasting, with real and predicted productivity values for each well illustrated using well log mapping. The black curve represents the segmented production of the perforation interval, and the red curve represents the predicted production. The agreement between the two curves shows that the model makes reliable production predictions for many perforation intervals (Wells 2, 5, 14, 17, and 29). These results indicate that the model has good predictive ability.
Notably, significant discrepancies are observed in Wells 3, 6, and 25. Such deviations might be attributed to insufficient training data for these wells, suggesting the need for refined parameters and enhanced machine learning models. However, if the range of parameter optimization were enlarged, the model could move into overfitting territory, which is detrimental to reservoir productivity forecasting.
Among the three pipelines, PS-XGB exhibits the most severe overfitting, whereas PS-NN demonstrates the least overfitting. This is particularly evident in Well 3, where the predicted curve of PS-XGB displays considerable variability in contrast to the observed smoother curve of the actual data. Although PS-RF exhibits fewer fluctuations compared to PS-XGB, overfitting is still not avoided. When compared with PS-NN, PS-XGB shows superior results.
Figure 7. Production prediction of the remaining 20% of 33 wells using PS-XGB.
Figure 8. Production prediction of the remaining 20% of 33 wells using PS-RF.
Figure 9. Production prediction of the remaining 20% of 33 wells using PS-NN.
4.3. Comparative Analysis of PFS-XGB vs. PFS-RF vs. PFS-NN
In general, the fluctuations in the predicted production capacity of these three pipelines are smaller than those of PS-XGB, PS-RF, and PS-NN. Simultaneously, the differences in production prediction trends among the three pipelines are minimal, although there are significant differences in the production prediction for some wells. For example, between 1717 and 1720 m in Well 3, the PFS-XGB and PFS-RF forecasts are good, but in PFS-NN the productivity forecast surges to 2000 m³/d, which may be related to the strategic optimization of the neural network parameters. In Well 25, an elevated predicted productivity value emerges at shallower depths, yet the predictions from all three pipelines align closely at greater depths. The shallow-depth predictions for Well 30 significantly deviate from real values. An observable production fluctuation between depths of 1690 and 1692 m in Well 4 eludes the predictive capabilities of all three pipelines. In Well 21, the predicted values substantially exceed the real production. Conversely, the three pipelines demonstrate commendable accuracy in forecasting the productivity of Wells 20 and 27. This could be attributed to a more straightforward relationship between logging parameters and productivity for these wells.
Figure 10. Production prediction of the remaining 20% of 33 wells using PFS-XGB.
Figure 11. Production prediction of the remaining 20% of 33 wells using PFS-RF.
Figure 12. Production prediction of the remaining 20% of 33 wells using PFS-NN.
4.4. Comparative Analysis of PR-XGB vs. PR-RF vs. PR-NN
The predictions from the three pipelines show notable discrepancies from the actual results, particularly in Wells 1 to 11. This deviation might stem from the use of the RobustScaler in data preprocessing: while it adeptly handles outliers through the median and interquartile range, it may not suit near-normal datasets and can miss crucial outlier information because the scaled data retain a non-zero mean. Among the three pipeline configurations, PR-XGB provides a more volatile representation of production. For Wells 6 and 7, the predictions from PR-XGB and PR-RF are off the mark, whereas PR-NN demonstrates notable accuracy. In particular, the PR-NN production forecast for Well 21 between 2150.5 and 2154 m closely matches the actual values, contrasting with the significant deviations seen in the PR-XGB and PR-RF predictions. While PR-RF generally provides the most accurate forecasts, it falters in certain instances, notably with Well 31.
Figure 13. Production prediction of the remaining 20% of 33 wells using PR-XGB.
Figure 14. Production prediction of the remaining 20% of 33 wells using PR-RF.
Figure 15. Production prediction of the remaining 20% of 33 wells using PR-NN.
4.5. Correlation coefficient comparison
Pearson correlation coefficients were computed for each well across all pipelines and visualized via heat maps. This linear correlation between actual and predicted yields serves as a measure of prediction accuracy. The findings indicate suboptimal prediction accuracy for Well 6 across all pipelines.
For Well 1, PS-NN exhibits a correlation of 0.651, while PR-XGB registers 0.501, suggesting room for improvement. For Wells 7 and 33, PR-NN's predictions correlate negatively with the actual results, and PR-XGB's predictions also show poor alignment with the actual results. Notably, PR-XGB consistently underperforms in predictive correlation across all wells, whereas PFS-NN emerges as the top performer in terms of the determination coefficient.
Overall, the production predictions across all pipelines exhibit a strong correlation. The trained model demonstrates a reliable performance for production prediction.
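A sketch of how the per-well Pearson coefficients behind Figure 16 can be computed and drawn as a heat map; `results` is assumed to be a table with one row per depth sample holding the well number, the real production, and one column of predictions per pipeline (all names here are illustrative).

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

def per_well_correlations(results: pd.DataFrame, pipeline_cols: list) -> pd.DataFrame:
    """Pearson correlation between real and predicted production, per well and pipeline."""
    rows = {}
    for well, grp in results.groupby("well"):
        rows[well] = {p: np.corrcoef(grp["real_production"], grp[p])[0, 1]
                      for p in pipeline_cols}
    return pd.DataFrame(rows).T                 # rows: wells, columns: pipelines

# Example usage (column names are assumptions):
# corr = per_well_correlations(results, ["PS-XGB", "PS-RF", "PS-NN"])
# plt.imshow(corr.values, cmap="RdYlGn", vmin=-1, vmax=1)
# plt.xticks(range(corr.shape[1]), corr.columns)
# plt.yticks(range(corr.shape[0]), corr.index)
# plt.colorbar(label="Pearson r"); plt.show()
```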
Figure 16. Pearson correlation coefficients of different models on different wells.