1. Introduction
Wind energy, as one of the most important renewable energy sources, plays a significant role in the planning and scheduling of power systems. However, the intermittency, high variability, and strong stochastic nature of wind power generation pose substantial challenges to integrating wind power into the grid [1]. Consequently, accurate wind power forecasting has become one of the key strategies for addressing wind power integration issues [2].
To tackle this issue, current research primarily focuses on improving feature inputs and constructing precise models to enhance the accuracy and reliability of wind power forecasting. In terms of feature extraction, researchers use feature processing techniques to provide models with richer feature information. For instance, Ju et al. [3] constructed a novel feature set by analyzing raw time series data from a wind farm and its neighboring wind farms, and proposed using Convolutional Neural Networks (CNN) to extract information from the input data. Zhao et al. [4] were the first to apply the NeuralProphet model to decompose wind power time series, capturing the complex nonlinear patterns hidden within them.
In terms of model construction, machine learning and deep learning have made remarkable progress in recent years as the primary statistical approaches to wind power forecasting [5-8]. Liao et al. [9] introduced a Light Gradient Boosting Machine (LightGBM) model with strong nonlinear fitting capabilities, which can fully exploit the valuable information in historical wind power operation data. Wang et al. [10] proposed a wind farm output power forecasting model based on the Time Sliding Window (TSW) and Long Short-Term Memory (LSTM) networks [11, 19], effectively fitting the output power curve of the wind farm and achieving accurate wind power prediction. Yu et al. [12] combined Graph Attention Networks (GAT), Gated Recurrent Units (GRU), and Temporal Convolutional Networks (TCN) to extract features from wind power time series data, significantly improving the model's accuracy and robustness. Yang Guohua et al. [13] proposed a short-term wind power forecasting model based on Complementary Ensemble Empirical Mode Decomposition-Sample Entropy (CEEMD-SE), Convolutional Neural Networks (CNN), and Long Short-Term Memory-Gated Recurrent Units (LSTM-GRU) [23], reporting that the model reduced the forecasting error by 15.06%. However, deep learning models require numerous parameters to be set, and hyperparameters chosen by expert experience often differ from the optimal ones. Moreover, as the volume of data increases, and especially when the model becomes overly complex, more computational resources and training time are required.
To address the aforementioned issues, this paper employs the Orthogonalized Maximal Information Coefficient to analyze the correlation between meteorological features and wind power for feature extraction [14], meeting the timeliness requirements of wind power forecasting tasks. Considering the long-range dependence characteristics of wind power [15, 16], an adaptive fractional-order generalized Pareto motion (fGPm) forecasting model is proposed to cope with the randomness and volatility of wind energy. The main contributions of this paper are as follows:
Since the correlation between variables in industrial data is often nonlinear, using the Orthogonalized Maximal Information Coefficient to analyze the correlation between meteorological features and wind power captures the nonlinear relationships between variables more accurately, thereby improving the feature input and meeting the timeliness needs of wind power forecasting tasks.
Research on random sequences with long-range dependence (LRD) characteristics in the field of wind power forecasting remains relatively limited [17, 18]. Therefore, the adaptive fractional-order generalized Pareto motion (fGPm) model is introduced, which considers long-term dependency uncertainty, fully accounts for the influence of past and current states on future states, and more effectively forecasts non-smooth stochastic processes [19]. As the environment changes, the rate of variation in the degradation process also changes accordingly. The proposed fGPm with an adaptive diffusion coefficient demonstrates superiority and robustness in handling data with pronounced jumps.
To verify the practicality and effectiveness of the proposed method, a case study is conducted on wind power data from a wind farm in Northwest China [20]. Two scenarios are selected as historical feature data: one with large meteorological fluctuations in winter, from 7:00 on December 12, 2019 to 19:00 on December 14, 2019 (Case 1), and another with smaller fluctuations in summer, from 7:00 on July 12, 2019 to 19:00 on July 14, 2019 (Case 2) [21, 22]. These data are used to predict the wind power trend over the next 12, 24, 36, and 48 steps. Compared with previous methods, the model shows higher precision in power forecasting.
The structure of this paper is arranged as follows:
Section 2 analyzes the input feature variables that affect wind power and introduces the principles and steps of the Orthogonalized Maximal Information Coefficient (OMI-Coherence) method used in data preprocessing [24].
Section 3 discusses the characteristics of the fractional generalized Pareto distribution (GPD) [25] and the physical significance of its parameters, focusing on the distribution's long-range dependence (LRD) and heavy-tailed characteristics [26]. It also introduces the mathematical expression of the adaptive fractional generalized Pareto motion (fGPm) model and its incremental form, and explores the LRD characteristics of fGPm and the process of generating numerical sequences.
Section 4 proposes an adaptive fGPm iterative differential forecasting model based on Langevin-type stochastic differential equations (SDE) [27, 28] and describes its parameter estimation methods.
Section 5 conducts experimental validation using measured wind power data from a wind farm in Northwest China, demonstrating the efficiency and applicability of the model through comparisons with other models.
Finally, Section 6 summarizes the main research findings of this paper and discusses future research directions.
Innovation: In the field of wind energy, time series forecasting is essential for managing wind energy resources efficiently, improving the grid's capacity to accept wind power, and enhancing public safety and quality of life. However, because wind speed fluctuates, wind power generation is inherently uncertain and complex, so forecasting models must capture long-range correlations. Existing methods often struggle to address both challenges simultaneously [29, 30]. This paper examines these issues, employing the Orthogonalized Maximal Information Coefficient (OMI-Coherence) for feature extraction to improve the accuracy of correlation analysis, and proposes a novel adaptive fGPm model that can autonomously learn different inter-sequence associations at each time scale. Moreover, parameterizing the uncertainty inherent in the system during the iteration process yields better system performance and convergence. Experiments on multiple real datasets from a wind farm demonstrate the robustness, accuracy, and effectiveness of the model.
2. Feature Extraction
Within wind farms, many factors influence power output, the most critical being meteorological characteristics such as wind speed, wind direction, temperature, humidity, and air pressure [31, 32]. Selecting features based solely on the pairwise correlation between each feature and the target may lead to significant redundancy among the selected features [33, 34]. In this study, we employ the Orthogonalized Maximal Information Coefficient (OMI-Coherence) to analyze the correlation between meteorological characteristics and wind power, which more accurately captures the nonlinear relationships between variables [35]. The specific implementation steps are as follows:
Step 1: Define Mutual Information
For a two-dimensional dataset $D$ composed of variables $X$ and $Y$, divide the space along each axis into $a$ and $b$ intervals, forming a grid $G$ of size $a \times b$ [36]. Based on the distribution of $D$ over the cells of $G$, the mutual information between $X$ and $Y$ is defined as:
$$I(X;Y) = \sum_{x \in X} \sum_{y \in Y} p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)}$$
where $p(x,y)$ is the joint probability distribution of $X$ and $Y$, and $p(x)$ and $p(y)$ are the marginal probability distributions of $X$ and $Y$, respectively [37].
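As a small worked example of the mutual information definition (an illustration that is not taken from the source), assume base-2 logarithms and a $2 \times 2$ grid in which the samples fall evenly into the two diagonal cells. The toy probabilities below are assumptions chosen so that $X$ fully determines $Y$:

```latex
% Assumed toy case: p(1,1) = p(2,2) = 1/2 and p(1,2) = p(2,1) = 0,
% so every marginal probability equals 1/2.
% Empty cells contribute nothing to the sum.
\begin{align*}
I(X;Y) &= \sum_{x}\sum_{y} p(x,y)\,\log_2\frac{p(x,y)}{p(x)\,p(y)} \\
       &= 2 \cdot \tfrac{1}{2}\,\log_2\frac{1/2}{(1/2)(1/2)} \\
       &= \log_2 2 = 1 \text{ bit}.
\end{align*}
```

If instead the samples were spread evenly over all four cells, every ratio inside the logarithm would equal 1 and the mutual information would be 0.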
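Step 2: Calculate and Normalize Mutual Information
Over all grids $G$ of size $a \times b$, identify the maximum value of mutual information, $\max_G I(D|G)$, as the output. Construct a normalized feature matrix $M(D)$ based on this value [38]. The normalized mutual information matrix $M(D)$ for a total sample size of $n$ is given by:
$$M(D)_{a,b} = \frac{\max_G I(D|G)}{\log \min(a,b)}, \qquad ab < B(n)$$
where $B(n)$ is the upper bound on the grid size, which depends on the sample size $n$.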
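The grid-based mutual information and its normalization can be sketched in Python as follows. This is a simplified illustration rather than the authors' implementation: it uses equal-width bins instead of optimizing the partition boundaries, assumes base-2 logarithms, and assumes the commonly used grid-size bound of $n^{0.6}$. The function names `grid_mutual_information` and `approximate_mic` are hypothetical.

```python
import math
from collections import Counter


def grid_mutual_information(xs, ys, a, b):
    """Mutual information (bits) of paired samples on an a-by-b equal-width grid."""
    def bin_index(v, lo, hi, k):
        if hi == lo:
            return 0  # a constant variable occupies a single bin
        return min(int((v - lo) / (hi - lo) * k), k - 1)  # max value goes in last bin

    n = len(xs)
    x_lo, x_hi, y_lo, y_hi = min(xs), max(xs), min(ys), max(ys)
    cells = [(bin_index(x, x_lo, x_hi, a), bin_index(y, y_lo, y_hi, b))
             for x, y in zip(xs, ys)]
    joint = Counter(cells)
    px = Counter(i for i, _ in cells)
    py = Counter(j for _, j in cells)
    # Sum p(x,y) * log2(p(x,y) / (p(x) p(y))) over occupied cells only.
    return sum((c / n) * math.log2(c * n / (px[i] * py[j]))
               for (i, j), c in joint.items())


def approximate_mic(xs, ys, exponent=0.6):
    """Largest normalized grid MI over grid sizes with a * b < n ** exponent.

    Assumption: equal-width grids only, so this is a lower-effort stand-in
    for a full search over partition boundaries.
    """
    bound = len(xs) ** exponent
    best = 0.0
    for a in range(2, int(bound) + 1):
        for b in range(2, int(bound) + 1):
            if a * b >= bound:
                break
            mi = grid_mutual_information(xs, ys, a, b)
            best = max(best, mi / math.log2(min(a, b)))
    return best


if __name__ == "__main__":
    xs = list(range(64))
    print(approximate_mic(xs, xs))          # perfectly dependent toy data
    print(approximate_mic(xs, [5.0] * 64))  # constant target carries no information
```

Dividing by $\log \min(a,b)$ keeps the score between 0 and 1, because the mutual information on a grid cannot exceed the entropy of the coarser axis.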
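Step 3: Compute OMI-Coherence
For the set of selected variables $S$ and the target variable $Y$, calculate the maximal information coefficient value between the feature information of a candidate variable $x_i$, which is independent of the selected variable set $S$, and the target variable $Y$ [39]. This is expressed as:
$$\text{OMI-Coherence}(x_i, Y \mid S) = \mathrm{MIC}(\tilde{x}_i, Y)$$
Here, $\tilde{x}_i$ denotes the part of $x_i$ that is independent of $S$, and the coefficient is computed from $I(\tilde{x}_i; Y)$, which represents the mutual information between the transformed variables [40].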
Step 4: Variable Selection and Ranking
Based on the OMI-Coherence values, first select the variable with the highest OMI-Coherence value, then incrementally add other variables according to the established ranking rules [41]. The feature vector selected after the $k$-th choice is used to construct the final feature sequence:
$$F_k = \{f_1, f_2, \ldots, f_k\}$$
where $f_j$ denotes the feature selected at the $j$-th choice.
This approach effectively selects the most relevant features from meteorological characteristics for wind power forecasting, reduces redundancy among features, and enhances the performance of the predictive model.
6. Conclusions
The adaptive fractional-order generalized Pareto motion (fGPm) iterative differential forecasting method proposed in this study demonstrates superior performance and high accuracy in short-term wind power forecasting, showcasing significant practical application potential. The main conclusions are as follows:
Key Role of Data Processing: To address the impact of the complex temporal characteristics of wind power generation on model accuracy, we employed the Orthogonalized Maximal Information Coefficient feature selection method. This significantly enhanced the correlation between wind power and the selected features, validating the effectiveness of the feature extraction approach.
Long-Range Dependence (LRD) Characteristics of Wind Power Series: By analyzing the relationship between the Hurst parameter and feature indices, we revealed the LRD characteristics of wind power series. This finding provides a theoretical basis for modeling with the adaptive fGPm, particularly improving trend prediction accuracy under optimized sample lengths and forecasting horizons.
Parameter Estimation via a New Characteristic Function Method: We applied a new characteristic function method to estimate parameters such as the stability index, skewness index, drift coefficient, and diffusion parameter of the adaptive fGPm model, laying a solid foundation for building a reliable forecasting model.
Superiority of the Adaptive fGPm Forecasting Model: The adaptive fGPm iterative differential forecasting model effectively addresses uncertainties in wind power generation by introducing tail parameters, thereby enhancing the flexibility of LRD characterization. This model demonstrated advantages in forecasting high-volatility data, with its diffusion coefficient adapting to environmental changes to more accurately reflect dynamic wind power characteristics.
Comparison with Other Models: Comparative analyses with mainstream forecasting models such as CNN-GRU and CNN-LSTM demonstrated the superiority and versatility of the adaptive fGPm model in describing wind power data, achieving higher prediction accuracy.
In summary, this research provides an innovative adaptive fGPm iterative differential forecasting model for accurate short-term wind power prediction, helping power system operators optimize generation planning and scheduling. Furthermore, it serves as a reference for other time series forecasting scenarios, such as wind speed, photovoltaic generation, and precipitation.