1. Introduction
Remaining Useful Life (RUL) prediction, a significant research domain in Prognostics and Health Management (PHM) [1], forecasts the future degradation trajectory of equipment from its current condition. By transforming scheduled maintenance into proactive operations, it substantially mitigates the risks of personnel casualties and economic losses caused by mechanical failures.
With the increasing complexity and sophistication of equipment, conventional PHM methods based on dynamic models, expert knowledge, and manual feature extraction have become increasingly limited. Fueled by rapid advances in technologies such as sensors, the Internet of Things, and artificial intelligence, deep learning (DL)-based techniques have attracted attention for their remarkable RUL prediction performance [2,3,4]. Therefore, as industrial data accumulate, DL-based RUL prediction research, which benefits from powerful feature extraction capabilities, has not only emerged as a hot topic in academia but also holds significant practical value for industry.
DL-based methods construct deep neural network architectures with far more powerful feature extraction capabilities than shallow machine learning algorithms. Consequently, these methods can learn and optimize features directly from the raw data of complex equipment and infer the RUL, thereby enhancing the accuracy and robustness of RUL estimation. Among the various DL techniques, neural networks (NNs) have emerged as state-of-the-art models for RUL prediction problems, attracting significant attention from researchers [5,6,7,8].
Unlike other NNs, the recurrent neural network (RNN) combines the current input with historical information to form its effective input. This design makes the RNN well suited to sequential data, and it has been successfully applied to RUL prediction [9,10]. However, the RNN has its own limitations, such as long recursion chains, which indirectly increase the depth and training time of the network, and frequently occurring vanishing gradients [11]. Long Short-Term Memory (LSTM), proposed by Hochreiter and Schmidhuber in 1997 to address these issues [12], mitigates the long-term dependency problem of the RNN and has gained widespread application [13,14].
Comparing the aircraft engine prediction performance of the vanilla RNN, LSTM, and Gated Recurrent Unit (GRU), Yuan et al. [15] concluded that LSTM and GRU outperform traditional RNNs. To solve the degradation problem in deep LSTM models, a residual structure has been adopted [16]. Zhao et al. [17] conducted an empirical evaluation of an LSTM-based machine tool wear detection system, applying the LSTM model to encode raw measurement data into vectors for tool wear prediction. Wu et al. [18] found that fusing multi-sensor inputs can enhance the long-term prediction capability of a deep LSTM. Guo et al. [19] proposed a novel artificial feature constructed from temporal- and frequency-domain features to boost the prediction accuracy of LSTM. As a commonly used LSTM variant, the GRU has attracted significant attention because its simplified gating mechanism reduces the training burden without compromising regression capability. Zhao et al. [20] presented a GRU model based on local features for machine health monitoring. Zhou et al. [21] introduced an enhanced-memory GRU network that utilizes previous state data for predicting bearing RUL. He et al. [22] employed a fault-mode-assisted GRU method for RUL prediction to guide the initiation time of predictive maintenance. Que et al. [23] developed a method combining stacked GRUs, an attention mechanism, and Bayesian methods for predicting bearing RUL. Li et al. [24] proposed a deep multiscale feature fusion network based on multi-sensor data for predicting the RUL of aircraft engines, with GRUs replacing the commonly used fully connected layers for regression. Ni et al. [25] used a GRU to predict the RUL of bearing systems, adaptively tuning the optimal hyper-parameters with a Bayesian optimization algorithm. Zhang et al. [26] proposed a dual-task network based on a bidirectional GRU and multi-gate expert fusion units that simultaneously assesses the health condition of aircraft engines and predicts their RUL. Ma et al. [27] introduced a deep wavelet sequence GRU model for predicting the RUL of rotating machinery, in which wavelet layers generate wavelet sequences at different scales.
The CNN exhibits powerful spatial feature extraction capabilities and suits classification tasks such as fault diagnosis [28]. However, it is rarely adopted alone for RUL prediction. To strengthen a model's extraction of temporal and spatial information in RUL prediction, the common approach is to combine the CNN with an RNN or to replace the internal connections of the RNN with convolution operators. Some researchers combine these two classical models serially or in parallel to construct novel architectures. Wang et al. [31] replaced the conventional fully connected input and recurrent transformations of the GRU with convolutional operators. Similarly, Ma et al. [32] replaced the fully connected state-to-state transitions of the LSTM with convolutional connections to boost feature extraction. To improve RUL prediction accuracy, Li et al. [33] presented a method combining ConvLSTM with a self-attention mechanism. Cheng et al. [34] introduced a new LSTM variant for predicting the RUL of aircraft engines by combining autoencoders and RNNs; the method merged the pooling operation with the LSTM's gating mechanism while retaining the convolutional operations, enabling parallel processing. Dulaimi et al. [35] proposed a parallel DL framework based on a CNN and an LSTM for extracting temporal and spatial features from raw measurements. To solve the problem of inconsistent inputs, Xia et al. [36] proposed a CNN-BLSTM method capable of processing different time scales. Xue et al. [37] introduced a data-driven RUL prediction approach with two parallel pathways: one combining a multi-scale CNN with a BLSTM, the other using a BLSTM alone.
Research based on LSTM variants and convolution operators has achieved significant success in RUL prediction, but gaps remain. The convolutional kernel is redundant in the channel dimension, and the extracted features cannot adapt flexibly to the input itself [38]. The ability to capture flexible spatiotemporal features not only saves computational resources but also yields richer features, thereby improving the accuracy of mechanical RUL prediction. The computational burden is likewise an important requirement. It is therefore worth investigating how to enhance the spatiotemporal capturing capability of prediction models while minimizing model parameters to improve prediction speed.
Consequently, considering the aforementioned limitations, a lightweight operator with adaptive feature capturing capability, named involution GRU (InvGRU), is proposed, and a deep learning framework is constructed on top of it for predicting the RUL of aircraft engines. RUL prediction results on the C-MAPSS data set [24] demonstrate that the proposed method outperforms other publicly available methods in both prediction accuracy and computational burden.
The contributions of this article are as follows:
(1). A novel operator, InvGRU, is proposed by replacing the connection operations in the GRU with involution. It adaptively captures spatiotemporal information based on the input itself and has fewer parameters than other models for spatiotemporal information extraction.
(2). Based on InvGRU, a deep learning framework with higher prediction accuracy is constructed. Experimental results on aircraft engine RUL prediction demonstrate that the proposed InvGRU-based DL framework outperforms the alternatives.
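As a rough illustration of the involution principle behind contribution (1): with group number G = 1 (as in Figure 1), a small kernel is generated from each position's own feature vector and shared across all channels. The 1-D numpy sketch below is a simplified stand-in, not the exact InvGRU formulation; the kernel-generation map `W_gen` is assumed here to be a single linear layer:

```python
import numpy as np

def involution_1d(x, W_gen, K):
    """1-D involution with G = 1: a K-tap kernel is generated from each
    position's own feature vector and shared across all C channels."""
    C, T = x.shape
    pad = K // 2
    xp = np.pad(x, ((0, 0), (pad, pad)))     # zero-pad the time axis
    out = np.zeros_like(x)
    for t in range(T):
        kernel = W_gen @ x[:, t]             # (K,) kernel conditioned on input
        out[:, t] = xp[:, t:t + K] @ kernel  # same kernel for every channel
    return out

rng = np.random.default_rng(1)
x = rng.standard_normal((6, 10))             # 6 channels, 10 time steps
W_gen = 0.1 * rng.standard_normal((3, 6))    # feature vector -> 3-tap kernel
y = involution_1d(x, W_gen, K=3)
print(y.shape)  # -> (6, 10)
```

Because the kernel is generated from the input and shared across channels rather than stored per channel, the parameter count stays small, which is the lightweight property InvGRU exploits.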
The remainder of this article is organized as follows. Section 1 introduces the research topic. Section 2 presents a concise explanation of the fundamental principles of the GRU and involution. Section 3 introduces the novel operator InvGRU, with its adaptive spatiotemporal information extraction ability. Section 4 thoroughly validates and compares the proposed methods through experiments on the C-MAPSS data set. Finally, Section 5 presents the conclusion.
Figure 1. Principle of involution (G=1).
Figure 2. Schematic diagram of GRU.
Figure 3. Schematic diagram of InvGRU.
Figure 4. InvGRU-based DL framework.
Figure 5. The curves of the two evaluation indexes.
Figure 6. Diagram of the aircraft engine.
Figure 7. Processing of data segmentation.
Figure 8. RUL prediction performance on FD001.
Figure 9. RUL prediction performance on FD002.
Figure 10. RUL prediction performance on FD003.
Figure 11. RUL prediction performance on FD004.
Figure 12. RUL prediction performance of engines of FD001 ((a) engine #46, (b) engine #58, (c) engine #66, and (d) engine #92).
Figure 13. RUL prediction performance of engines of FD002 ((a) engine #9, (b) engine #45, (c) engine #150, and (d) engine #182).
Figure 14. RUL prediction performance of engines of FD003 ((a) engine #25, (b) engine #38, (c) engine #75, and (d) engine #92).
Figure 15. RUL prediction performance of engines of FD004 ((a) engine #35, (b) engine #68, (c) engine #100, and (d) engine #151).
Table 1. The hyper-parameters of the proposed DL framework based on InvGRU.

| Sub layer | Hyper-parameter value | Sub layer | Hyper-parameter value |
| --- | --- | --- | --- |
| InvGRU | 70 | Regression (Linear) | 1 |
| FC1 (ReLU) | 30 | Learning rate | 0.005 |
| FC2 (ReLU) | 30 | Dropout1 | 0.5 |
| FC3 (ReLU) | 10 | Dropout2 | 0.3 |
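Under the layer widths listed in Table 1, the head of the framework after the InvGRU layer is a ReLU stack ending in a linear regression output. The numpy sketch below wires only those fully connected widths, assuming a generic 70-dimensional InvGRU output; dropout and the InvGRU recurrence itself are omitted, so this is a shape illustration rather than the actual model:

```python
import numpy as np

# Widths from Table 1: InvGRU output (70) -> FC1 (30) -> FC2 (30)
# -> FC3 (10) -> linear regression head (1 RUL value).
SIZES = [70, 30, 30, 10, 1]

def relu(v):
    return np.maximum(v, 0.0)

def rul_head(h, weights):
    """Apply FC1-FC3 with ReLU, then the final linear regression layer."""
    for i, W in enumerate(weights):
        h = h @ W
        if i < len(weights) - 1:   # no activation on the regression output
            h = relu(h)
    return h

rng = np.random.default_rng(2)
weights = [0.1 * rng.standard_normal((m, n)) for m, n in zip(SIZES, SIZES[1:])]
rul = rul_head(rng.standard_normal(70), weights)
print(rul.shape)  # -> (1,)
```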
Table 2. The details of the C-MAPSS dataset.

| Subset | FD001 | FD002 | FD003 | FD004 |
| --- | --- | --- | --- | --- |
| Total number of engines | 100 | 260 | 100 | 249 |
| Operating conditions | 1 | 6 | 1 | 6 |
| Fault types | 1 | 1 | 2 | 2 |
| Maximum cycles | 362 | 378 | 525 | 543 |
| Minimum cycles | 128 | 128 | 145 | 128 |
Table 3. Sensors of C-MAPSS.

| Number | Symbol | Description | Unit | Trend | Number | Symbol | Description | Unit | Trend |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | T2 | Total fan inlet temperature | ºR | ~ | 12 | Phi | Ratio of fuel flow to Ps30 | pps/psi | ↓ |
| 2 | T24 | Total LPC outlet temperature | ºR | ↑ | 13 | NRf | Corrected fan speed | rpm | ↑ |
| 3 | T30 | Total HPC outlet temperature | ºR | ↑ | 14 | NRc | Corrected core speed | rpm | ↓ |
| 4 | T50 | Total LPT outlet temperature | ºR | ↑ | 15 | BPR | Bypass ratio | -- | ↑ |
| 5 | P2 | Fan inlet pressure | psia | ~ | 16 | farB | Burner fuel-air ratio | -- | ~ |
| 6 | P15 | Total bypass-duct pressure | psia | ~ | 17 | htBleed | Bleed enthalpy | -- | ↑ |
| 7 | P30 | Total HPC outlet pressure | psia | ↓ | 18 | Nf_dmd | Demanded fan speed | rpm | ~ |
| 8 | Nf | Physical fan speed | rpm | ↑ | 19 | PCNfR_dmd | Demanded corrected fan speed | rpm | ~ |
| 9 | Nc | Physical core speed | rpm | ↑ | 20 | W31 | HPT coolant flow | lbm/s | ↓ |
| 10 | Epr | Engine pressure ratio | -- | ~ | 21 | W32 | LPT coolant flow | lbm/s | ↓ |
| 11 | Ps30 | HPC outlet static pressure | psia | ↑ |  |  |  |  |  |
Table 4. The RUL prediction comparisons of different methods on subsets FD001 and FD002.

| Model | FD001 Score | FD001 RMSE | FD002 Score | FD002 RMSE |
| --- | --- | --- | --- | --- |
| Cox's regression [34] | 28616 | 45.10 | N/A | N/A |
| SVR [39] | 1382 | 20.96 | 58990 | 41.99 |
| RVR [39] | 1503 | 23.86 | 17423 | 31.29 |
| RF [39] | 480 | 17.91 | 70456 | 29.59 |
| CNN [40] | 1287 | 18.45 | 17423 | 30.29 |
| LSTM [42] | 338 | 16.14 | 4450 | 24.49 |
| DBN [41] | 418 | 15.21 | 9032 | 27.12 |
| MONBNE [41] | 334 | 15.04 | 5590 | 25.05 |
| LSTM + attention + handcrafted features [20] | 322 | 14.53 | N/A | N/A |
| Acyclic Graph Network [43] | 229 | 11.96 | 2730 | 20.34 |
| AEQRNN [34] | N/A | N/A | 3220 | 19.10 |
| MCLSTM-based [4] | 260 | 13.21 | 1354 | 19.82 |
| SMDN [14] | 240 | 13.72 | 1464 | 16.77 |
| Proposed | 238 | 12.34 | 1205 | 15.59 |
Table 5. The RUL prediction comparisons of different methods on subsets FD003 and FD004.

| Model | FD003 Score | FD003 RMSE | FD004 Score | FD004 RMSE |
| --- | --- | --- | --- | --- |
| Cox's regression [34] | N/A | N/A | 1164590 | 54.29 |
| SVR [39] | 1598 | 21.04 | 371140 | 45.35 |
| RVR [39] | 17423 | 22.36 | 26509 | 34.34 |
| RF [39] | 711 | 20.27 | 46568 | 31.12 |
| CNN [40] | 1431 | 19.81 | 7886 | 29.16 |
| LSTM [42] | 852 | 16.18 | 5550 | 28.17 |
| DBN [41] | 442 | 14.71 | 7955 | 29.88 |
| MONBNE [41] | 422 | 12.51 | 6558 | 28.66 |
| LSTM + attention + handcrafted features [20] | N/A | N/A | 5649 | 27.08 |
| Acyclic Graph Network [43] | 535 | 12.46 | 3370 | 22.43 |
| AEQRNN [34] | N/A | N/A | 4597 | 20.60 |
| MCLSTM-based [4] | 327 | 13.45 | 2926 | 22.10 |
| SMDN [14] | 305 | 12.70 | 1591 | 18.24 |
| Proposed | 292 | 13.12 | 1020 | 13.25 |
Table 6. The comparisons of different methods for RUL prediction based on the C-MAPSS dataset.

| Model | Mean RMSE | Mean Score |
| --- | --- | --- |
| Cox's regression [34] | 49.70 | 596603 |
| SVR [39] | 32.335 | 108277 |
| RVR [39] | 27.96 | 11716 |
| RF [39] | 24.72 | 29553 |
| CNN [40] | 24.42 | 7006 |
| LSTM [42] | 21.25 | 2797 |
| DBN [41] | 21.73 | 4461 |
| MONBNE [41] | 20.32 | 3225 |
| LSTM + attention + handcrafted features [20] | 20.80 | 2985 |
| Acyclic Graph Network [43] | 16.80 | 1716 |
| AEQRNN [34] | 19.85 | 3908 |
| MCLSTM-based [4] | 17.40 | 1216 |
| SMDN [14] | 15.36 | 900 |
| Proposed | 13.58 | 689 |