1. Introduction
The Duffing equation is a widely used nonlinear second-order differential equation for modeling general oscillations in nonlinear systems. It captures a variety of nonlinear vibration phenomena, including those arising in electric circuits with particular inductance and capacitance configurations under an applied current, as well as dynamic friction effects in oscillator systems, where nonlinearity causes the response to diverge from that of a linear system at large vibration amplitudes. The Duffing equation is also applied to the analysis of vibrations in aircraft wings and automobiles, which, if severe, can damage machine components; estimating and understanding the vibration conditions is therefore crucial for preventing such harm. Among studies of the Duffing equation, Feng [1] combined the variational method with dimensional analysis to derive a novel analytical frequency function that accurately describes the vibration frequency of the undamped Duffing equation. The validity and effectiveness of this frequency solution were confirmed through numerical simulations and experiments, deepening the understanding of the oscillation characteristics of the undamped Duffing equation. Rao [2] investigated the behavior of the Duffing equation under undamped free vibration and damped forced vibration. For undamped free vibration, Rao showed that as the initial conditions vary, the convergence result switches between 1 and -1 within a certain interval, and these switches are irregular. For damped forced vibration with fixed initial conditions, the solution of the Duffing equation exhibits various convergence patterns, such as single-period, double-period, and four-period cycles. Akhmet et al. [3] studied the bifurcation behavior of the Duffing oscillator under damped forced vibration, using bifurcation diagrams and Lyapunov exponents to analyze the chaos that appears as the amplitude of the external force increases. They then applied the Ott-Grebogi-Yorke (OGY) control method to stabilize the chaotic behavior and reduce its amplitude, demonstrated the effectiveness of OGY control through numerical simulations, and confirmed that the multiple solution types of nonlinear equations merit further study. This research is therefore motivated to explore the various solution types of the Duffing equation. Because conventional analytical or numerical computation of nonlinear solutions is time-consuming, this study uses machine learning to predict the multiple solutions of the Duffing equation.
Machine learning has grown rapidly and is now widely used across many domains, including image recognition, search engines, industrial analysis, and artificial intelligence more broadly. Statistics compiled by Hao [4] indicate substantial recent progress in machine learning and artificial intelligence. The application of artificial intelligence has evolved over time, from rule-based code command systems to the rise of machine learning and the widespread adoption of neural networks in the late 1990s and early 2000s. The continuous development of sophisticated algorithms and architectures, such as the three-layer model proposed by Hinton et al. [5], has paved the way for the progress of machine learning and artificial intelligence as a whole. In 2007, Hinton [6] introduced deep learning as an extension of the backpropagation method, enabling weight adjustments that improve training effectiveness. While this brought significant advances in weight updates and training, it did not fully address overfitting. To tackle this problem, Hinton et al. [7,8] proposed Dropout in 2012, which randomly drops neurons during training, preventing excessive reliance on specific neurons and effectively mitigating overfitting. In 2015, Kingma and Ba [9] presented the Adam optimizer, which combines the strengths of several earlier optimizers; by adapting the update of each parameter of the loss function, it avoids obstacles such as saddle points and significantly improves training speed and accuracy. In 2018, Potdar et al. [10] evaluated six encoding techniques, including binary encoding, label encoding, and one-hot encoding, across six diverse datasets, assessing each technique's classification accuracy and analyzing how dataset size and the number of categories affect performance.
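To make two of these techniques concrete, the minimal Keras sketch below one-hot encodes binary class labels and trains a small network with Dropout and the Adam optimizer; the data, layer sizes, and rates are illustrative placeholders, not the settings used in this study.

```python
# Minimal sketch (illustrative values only): one-hot labels, Dropout, Adam.
import numpy as np
from tensorflow import keras

# Toy dataset: 200 samples, 4 features, 2 classes.
x = np.random.rand(200, 4).astype("float32")
labels = np.random.randint(0, 2, size=200)
y = keras.utils.to_categorical(labels, num_classes=2)   # one-hot encoding

model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dropout(0.2),   # randomly drops 20% of activations each step
    keras.layers.Dense(2, activation="softmax"),
])
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),  # adaptive per-parameter updates
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)
model.fit(x, y, epochs=5, batch_size=32, verbose=0)
```

In the undamped free-vibration case studied below, the two one-hot classes would correspond to the convergence labels 1 and -1.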
In 1995, Mathia et al. [11] used the multilayer perceptron (MLP) and the Hopfield neural network (HNN) to find approximate solutions of nonlinear equations by forward propagation, with predictions made by a recurrent neural network (RNN) model. This was a breakthrough not only in approximating the solutions of nonlinear equations; it also showed that supervised learning can contribute substantially to predicting nonlinear solutions, which strengthens our confidence in using machine learning to analyze the solutions of the Duffing equation. Hochreiter and Schmidhuber [12] proposed the long short-term memory (LSTM) network, which improves on the RNN by extending its limited short-term memory so that both long- and short-term dependencies can be handled; this not only speeds up training but also allows more complex, long-range problems to be solved (the standard LSTM cell equations are summarized at the end of this paragraph). For high-dimensional partial differential equations (PDEs), Han et al. [13] proposed a new solution method based on deep neural networks. Wang et al. [14] analyzed flutter speed with the K-method and used deep learning to predict the onset of flutter, comparing the performance of DNN and LSTM models for flutter-speed prediction and attempting to predict aeroelastic phenomena under various flight conditions. Gawlikowski et al. [15] surveyed the sources of uncertainty in deep learning, including data noise, model error, and model-structure uncertainty, and discussed applications of uncertainty estimation such as active learning and decision making. Hua et al. [16] applied time-series forecasting in fields such as finance, weather forecasting, and traffic forecasting, highlighting the potential of LSTM networks for time-series forecasting and providing insights into their design and optimization. Hwang et al. [17] proposed the Single Stream RNN (SS-RNN), which achieves single-stream parallelization by using a single GPU stream at each time step to process multiple sequences, together with a blockwise inference method for performing RNN inference efficiently on GPUs. Zheng et al. [18] proposed the Echo compiler, which reduces GPU memory usage during LSTM RNN training and provides a computation-graph-based optimization that reduces the number of data transfers. Tariq et al. [19] proposed a method combining a multivariate convolutional LSTM (MC-LSTM) with mixed probabilistic principal component analysis (MPPCA) to detect anomalies in spatiotemporal data collected from space; the method is based on the LSTM model, and the added convolutional layers extract spatial and temporal features from the data. Memarzadeh and Keynia [20] proposed a short-term electricity load and price forecasting algorithm based on an LSTM neural network (LSTM-NN), which combines LSTM with a conventional neural network and improves prediction performance by optimizing the parameters of both.
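For reference, the LSTM cell mentioned above maintains a memory cell state $c_t$ alongside the hidden state $h_t$. In the now-standard formulation (the forget gate is a later refinement of the original design in [12]), with $\sigma$ the logistic sigmoid and $\odot$ the element-wise product,

$$
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f), & i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i),\\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o), & \tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c),\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t, & h_t &= o_t \odot \tanh(c_t).
\end{aligned}
$$

Because the cell state is updated additively rather than through repeated matrix multiplication, gradients can flow across many time steps, which is how the LSTM mitigates the vanishing and exploding gradients of plain RNNs.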
The present work uses the Duffing equation to simulate general nonlinear vibration, divided into the special case of undamped free vibration and the case of damped forced vibration. Under undamped free vibration, as the initial conditions change, the convergence result switches between 1 and -1 within a certain interval, and these switches are irregular. Under damped forced vibration, the solution varies with the initial conditions, the external force, and other parameters, so the final vibration state differs and multiple solutions may exist. This study focuses on the single-period convergent solution of the Duffing equation, in particular the periodic solution whose convergent trajectory is a single closed curve (a limit cycle).
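Although the equation is not written out in this section, the two cases can be expressed in the standard Duffing form, with the coefficient names matched (as an assumption on our part) to the parameters listed in Tables 1 and 2:

$$\ddot{x} + kx + k_3 x^3 = 0 \quad \text{(undamped free vibration)},$$

$$\ddot{x} + c\dot{x} + kx + k_3 x^3 = Q\cos(\omega t) \quad \text{(damped forced vibration)},$$

where $x$ is the displacement, $c$ the damping coefficient, $k$ the linear spring constant, $k_3$ the nonlinear spring constant, and $Q$ and $\omega$ the amplitude and frequency of the external force.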
In order to collect the convergence results of the two types of the Duffing equation, this study uses the fourth-order Runge-Kutta method, varying the initial conditions and other parameters for analysis. Supervised learning is then applied: for the undamped free vibration case, labeled data guide the machine to recognize the target behavior. The analysis proceeds through data preprocessing, training, and validation, and two algorithms are selected to predict this nonlinear vibration phenomenon and to compare their accuracy: the deep neural network (DNN) and the long short-term memory model (LSTM). DNN is a supervised learning method well suited to processing labeled data. LSTM is a branch of recurrent neural networks (RNN) that improves the performance of recurrent networks on long sequences, resolves the vanishing and exploding gradient problems, and is suitable for problems strongly tied to time series. Finally, a small number of fully connected hidden layers are added to the long short-term memory model (LSTM-NN) to find the best learning model. Using these two types of the Duffing equation as examples, this study aims to identify learning models that can effectively predict the convergence outcomes of nonlinear vibration equations in specific cases.
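As a minimal sketch of this data-collection step, the Python code below integrates the damped forced case with the classical fourth-order Runge-Kutta method and extracts the converged maxima and minima reported as outputs in Table 3; the function names, step size, transient length, and the fixed forcing frequency are illustrative assumptions, not the paper's exact settings.

```python
# Sketch: labeled-data generation for the damped forced Duffing equation
#   x'' + c x' + k x + k3 x^3 = Q cos(w t)
# using the classical fourth-order Runge-Kutta (RK4) method.
import numpy as np

def duffing_rhs(t, y, c, k, k3, Q, w=1.0):
    """State y = [x, v]; returns [x', v'] for the damped forced Duffing equation."""
    x, v = y
    return np.array([v, Q * np.cos(w * t) - c * v - k * x - k3 * x**3])

def rk4(y0, c, k, k3, Q, dt=0.01, n_steps=60000):
    """Integrate with RK4 from t = 0 and return the full [x, v] trajectory."""
    traj = np.empty((n_steps + 1, 2))
    traj[0] = y0
    t = 0.0
    for i in range(n_steps):
        y = traj[i]
        s1 = duffing_rhs(t, y, c, k, k3, Q)
        s2 = duffing_rhs(t + dt / 2, y + dt / 2 * s1, c, k, k3, Q)
        s3 = duffing_rhs(t + dt / 2, y + dt / 2 * s2, c, k, k3, Q)
        s4 = duffing_rhs(t + dt, y + dt * s3, c, k, k3, Q)
        traj[i + 1] = y + dt / 6 * (s1 + 2 * s2 + 2 * s3 + s4)
        t += dt
    return traj

# One training sample (parameter values drawn from the ranges in Table 2):
# integrate, discard the transient, and record the converged extrema of Table 3.
traj = rk4(y0=[0.5, 0.5], c=0.1, k=1.0, k3=1.0, Q=0.5)
steady = traj[-10000:]                      # keep only the converged portion
x_max, x_min = steady[:, 0].max(), steady[:, 0].min()
v_max, v_min = steady[:, 1].max(), steady[:, 1].min()
```

Sweeping the feature ranges of Tables 1 and 2 with such a loop produces the labeled datasets on which the DNN, LSTM, and LSTM-NN models are trained.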
Figure 1. Schematic diagram of the convergence results of the Duffing equation (undamped free vibration).
Figure 2. The form of chaos in the Duffing equation.
Figure 3. Convergence graph of a single limit cycle.
Figure 4. Schematic diagram of an artificial neuron.
Figure 5. Schematic diagram of a deep neural network.
Figure 6. Recurrent neural network architecture (LSTM).
Figure 7. Results of basic LSTM model training.
Figure 8. Accuracy results of different hidden layers of LSTM.
Figure 9. Loss of different hidden layers of LSTM.
Figure 10. Training results of LSTM with different numbers of neurons.
Figure 11. Loss of LSTM with different numbers of neurons.
Figure 12. The best-trained LSTM model.
Figure 13. Unstable behavior of the LSTM model accuracy.
Figure 14. Unstable behavior of the LSTM model loss.
Figure 15. LSTM-NN model architecture.
Figure 16. The accuracy of the 3-LSTM-hidden-layer + 2-NN-layer model.
Figure 17. The loss of the 3-LSTM-hidden-layer + 2-NN-layer model.
Figure 18. The accuracy of the LSTM-NN architecture with different numbers of neurons.
Figure 19. The loss of the LSTM-NN architecture with different numbers of neurons.
Figure 20. Final LSTM-NN model architecture.
Figure 21. The accuracy of the final training results of the LSTM-NN architecture.
Figure 22. The loss of the final training results of the LSTM-NN architecture.
Figure 23. Results of the first training.
Figure 24. The accuracy of different hidden layers of LSTM.
Figure 25. The loss of different hidden layers of LSTM.
Figure 26. The accuracy of LSTM with different numbers of neurons.
Figure 27. The loss of LSTM with different numbers of neurons.
Figure 28. Final LSTM model architecture.
Figure 29. LSTM-NN model training accuracy.
Figure 30. LSTM-NN model training loss.
Figure 31. The accuracy of LSTM-NN with different numbers of neurons.
Figure 32. The loss of LSTM-NN with different numbers of neurons.
Figure 33. Final LSTM-NN model architecture.
Figure 34. Training time of the first five epochs using the CPU.
Figure 35. Training time of the first five epochs using the GPU.
Figure 36. LSTM model prediction results.
Figure 37. LSTM-NN model prediction results.
Figure 38. Training time of the first five epochs of the LSTM model.
Figure 39. Training time of the first five epochs of the LSTM-NN model.
Figure 40. Max. velocity prediction results by the LSTM and LSTM-NN models.
Figure 41. Min. velocity prediction results by the LSTM and LSTM-NN models.
Figure 42. Max. displacement prediction results by the LSTM and LSTM-NN models.
Figure 43. Min. displacement prediction results by the LSTM and LSTM-NN models.
Figure 44. Training time of the first five epochs of the LSTM model.
Figure 45. Training time of the first five epochs of the LSTM-NN model.
Table 1. The parameters and labels used in the Duffing equation (no damping).

| Parameter | Role | Range |
|---|---|---|
| Initial Displ., x0 | Feature | [0.010 ~ 5.250] |
| Initial Vel., v0 | Feature | [0.001 ~ 2.000] |
| Linear Spring Const., k | Feature | [-1.72 ~ -0.2] |
| Nonlinear Spring Const., k3 | Feature | [0.2 ~ 1.72] |
| Convergence result | Label | [1; -1] |
Table 2. Parameters used in the Duffing equation (damped forced vibration).

| Feature | Range |
|---|---|
| Initial Displ., x0 | [0.30 ~ 1.26] |
| Initial Vel., v0 | [0.30 ~ 1.26] |
| Damping Coef., c | [0.03 ~ 0.24] |
| Linear Spring Const., k | [0.45 ~ 1.75] |
| Nonlinear Spring Const., k3 | [0.45 ~ 1.75] |
| Force, Q | [0.30 ~ 0.72] |
Table 3. Convergence results of the Duffing equation (damped forced vibration).

| Output data | Range |
|---|---|
| Max Displ., xmax | [0.38814 ~ 1.75069] |
| Min Displ., xmin | [-1.74609 ~ -0.25300] |
| Max Vel., vmax | [0.30012 ~ 3.95563] |
| Min Vel., vmin | [-3.95583 ~ -0.34118] |
Table 4. Accuracy and loss percentage of different hidden layers.

| Architecture | Accuracy (%) | Loss (%) |
|---|---|---|
| 1 LSTM layer & 1 NN layer | 93.20 | 4.32 |
| 1 LSTM layer & 2 NN layers | 82.70 | 15.81 |
| 1 LSTM layer & 3 NN layers | 77.74 | 14.13 |
| 2 LSTM layers & 1 NN layer | 86.67 | 11.35 |
| 2 LSTM layers & 2 NN layers | 87.56 | 8.70 |
| 2 LSTM layers & 3 NN layers | 82.81 | 11.57 |
| 3 LSTM layers & 1 NN layer | 86.78 | 9.87 |
| 3 LSTM layers & 2 NN layers | 90.90 | 6.50 |
| 3 LSTM layers & 3 NN layers | 89.86 | 7.09 |
| 4 LSTM layers & 1 NN layer | 92.74 | 5.20 |
| 4 LSTM layers & 2 NN layers | 91.97 | 4.96 |
| 4 LSTM layers & 3 NN layers | 86.89 | 8.61 |
| 5 LSTM layers & 1 NN layer | 92.64 | 4.68 |
| 5 LSTM layers & 2 NN layers | 58.22 | 24.34 |
| 5 LSTM layers & 3 NN layers | 58.12 | 24.34 |
Table 5. Accuracy and loss percentage of different hidden layers.

| Architecture | Accuracy (%) | Loss (%) |
|---|---|---|
| 1 LSTM layer & 1 NN layer | 97.57 | 1.22 |
| 1 LSTM layer & 2 NN layers | 87.47 | 1.90 |
| 1 LSTM layer & 3 NN layers | 87.40 | 1.14 |
| 2 LSTM layers & 1 NN layer | 98.13 | 0.29 |
| 2 LSTM layers & 2 NN layers | 98.00 | 0.87 |
| 2 LSTM layers & 3 NN layers | 98.24 | 0.29 |
| 3 LSTM layers & 1 NN layer | 98.14 | 0.07 |
| 3 LSTM layers & 2 NN layers | 97.58 | 0.34 |
| 3 LSTM layers & 3 NN layers | 95.12 | 0.37 |
| 4 LSTM layers & 1 NN layer | 97.61 | 0.08 |
| 4 LSTM layers & 2 NN layers | 97.10 | 0.12 |
| 4 LSTM layers & 3 NN layers | 97.62 | 0.54 |
| 5 LSTM layers & 1 NN layer | 96.80 | 0.09 |
| 5 LSTM layers & 2 NN layers | 97.37 | 0.13 |
| 5 LSTM layers & 3 NN layers | 97.33 | 0.23 |