3.1. Creating a database
A database must be created to train the neural network. To create it, the system of differential equations describing the dynamics of an electric drive system with dynamic loads and discrete masses is used [38].
The following designations were used:
where
,
are the moments of inertia of the rotor of the motor and the mechanism, respectively;
is the initial torque of the resistance of the mechanism;
is the rigidity of the connection between the mechanism and the motor;
is the angular displacement of the motor shaft;
is the movement of the mechanism reduced to the motor shaft;
is the electric rotation speed of the rotor;
is the synchronous motor speed;
is the angular velocity of the actuating link of the mechanism;
is the motor internal angle;
is the electromagnetic torque of the synchronous motor rotor;
is the initial value of the torque of the mechanism;
is the rigidity of the synchronous motor characteristics;
is the critical slip of the motor;
are the coefficients describing the nature of the change in the resistance torque of the mechanism.
The details for determining the stability condition from the system of equations (1) are presented in [38]. The system of equations (1) was modified and solved, leading to the following characteristic equation [38]:
where
One of the roots of characteristic equation (2) is equal to zero, which means that the system is not asymptotically stable. For the system to be stable, the real parts of all other roots of the characteristic equation must be negative. According to the Vishnegradsky criterion, it is necessary and sufficient to ensure the following conditions:
As a result, the following stability conditions were obtained:
where
The following designations were used:
The solution was implemented under the following initial conditions:
Condition (4) was obtained for the initial value of . Condition (5) is obtained at the initial value, that is,.
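The Vishnegradsky criterion invoked above is the classical stability test for a third-order characteristic polynomial. The following is a minimal sketch, assuming that factoring out the zero root leaves a cubic of the form a0·p³ + a1·p² + a2·p + a3 = 0. The coefficient names a0 to a3 and the function name are illustrative and are not the notation of [38]. Under that assumption, all roots have negative real parts if and only if every coefficient is positive and a1·a2 > a0·a3.

```python
def vishnegradsky_stable(a0: float, a1: float, a2: float, a3: float) -> bool:
    """Stability test for the cubic a0*p**3 + a1*p**2 + a2*p + a3 = 0.

    Assumption: the polynomial is exactly third order. Returns True when all
    roots have negative real parts, i.e. when every coefficient is positive
    and the product of the inner coefficients exceeds that of the outer ones.
    """
    all_positive = all(c > 0 for c in (a0, a1, a2, a3))
    return all_positive and a1 * a2 > a0 * a3


# (p + 1)(p + 2)(p + 3) = p^3 + 6p^2 + 11p + 6 has roots -1, -2, -3.
print(vishnegradsky_stable(1, 6, 11, 6))

# (p + 2)(p^2 - p + 3) = p^3 + p^2 + p + 6 has a complex pair with real part +0.5.
print(vishnegradsky_stable(1, 1, 1, 6))
```

The first call reports a stable polynomial and the second an unstable one, even though all coefficients of the second polynomial are positive; the product inequality is what separates the two cases.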
The details of the database generation algorithm are described below.
The input data of the system were generated randomly. Stability conditions (4) were then checked for each set of input parameter values, and condition (5) was checked only when conditions (4) were satisfied. The results were recorded in the database.
A database containing more than two million data points was created, consisting of 10 input features and one response.
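The generation procedure can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: conditions (4) and (5) are passed in as the hypothetical callables `condition_4` and `condition_5`, the inputs are drawn uniformly from hypothetical `bounds`, and every sample is stored with a response of 1 (stable) or 0 (unstable).

```python
import random
from typing import Callable, List, Sequence, Tuple

Row = Tuple[List[float], int]  # (input features, stability response)


def generate_database(
    n_samples: int,
    bounds: Sequence[Tuple[float, float]],
    condition_4: Callable[[Sequence[float]], bool],
    condition_5: Callable[[Sequence[float]], bool],
    seed: int = 0,
) -> List[Row]:
    """Draw random inputs and label each one with the stability checks.

    Assumptions: uniform sampling within `bounds` (one pair per feature) and
    a binary response; the predicates stand in for conditions (4) and (5).
    """
    rng = random.Random(seed)
    rows: List[Row] = []
    for _ in range(n_samples):
        x = [rng.uniform(low, high) for low, high in bounds]
        # `and` short-circuits, so condition (5) is evaluated only
        # when condition (4) is already satisfied.
        stable = condition_4(x) and condition_5(x)
        rows.append((x, int(stable)))
    return rows
```

With ten `(low, high)` pairs in `bounds`, each row has 10 input features and one response, matching the shape of the database described above.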
To increase the efficiency of the database, the influence of the input data on stability conditions was evaluated.
The importance of each input feature for the output signal was evaluated using an ANOVA algorithm [39]. Each input feature receives a non-negative score with no upper bound, and the higher the score, the more strongly the feature affects the performance of the system.
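As an illustration of such a score, the sketch below computes the one-way ANOVA F-statistic of a single feature against the class labels. It assumes that the score in question is the F-statistic, which the text does not state explicitly; the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def anova_f_score(values: Sequence[float], labels: Sequence[int]) -> float:
    """One-way ANOVA F-statistic of one feature grouped by class label.

    The result lies in [0, inf): 0 when the class means coincide, larger
    when the classes are well separated along this feature.
    """
    groups: Dict[int, List[float]] = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)

    n_total = len(values)
    n_groups = len(groups)
    if n_groups < 2 or n_total <= n_groups:
        raise ValueError("need at least two classes and more samples than classes")

    grand_mean = sum(values) / n_total
    ss_between = 0.0  # variation of the class means around the grand mean
    ss_within = 0.0   # variation of the samples around their own class mean
    for members in groups.values():
        mean = sum(members) / len(members)
        ss_between += len(members) * (mean - grand_mean) ** 2
        ss_within += sum((v - mean) ** 2 for v in members)

    if ss_within == 0.0:
        return float("inf") if ss_between > 0.0 else 0.0
    return (ss_between / (n_groups - 1)) / (ss_within / (n_total - n_groups))
```

Applying the function to each of the input features in turn and sorting by the result gives a ranking of the kind reported for the input data.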
As shown in Table 1, the resistance torque and angular velocity of the mechanism, as well as the angular displacement of the motor shaft, play an important role.
The influence of the coefficients of the resistance torque of the mechanism and of the angular velocities of the electric motor and the mechanism on the output signal of the system was investigated (Figure 1 and Figure 2).
The results show that the stability of the system is strongly related to the resistance torque of the mechanism. For the system to be stable, one of the resistance torque coefficients must exceed the other by a factor of ten (Figure 1). It was also established that above a certain angular velocity of the mechanism, the system operates in a stable mode.
3.2. Building a neural network
There are no clear rules for choosing the neural network architecture and training method, and there is no established algorithm for selecting the activation function. Therefore, it is necessary to study the network architecture and the choice of training method for the task under consideration. A successful choice of network structure, activation function, and training algorithm improves the accuracy and performance of the artificial neural network.
First, various training methods were considered. Neural networks trained using different methods were compared. To ensure comparability of the results, training was performed for networks with identical parameters.
To solve this problem, the simplest neural network structure was considered: 10 input neurons, one hidden layer, and one output neuron (Figure 3).
The results were tested using the example of a two-mass synchronous electric drive system. The test data are presented in Table 2.
The research was carried out using the TensorFlow and Keras packages [40] with various optimization algorithms for network training: Adam, RMSProp, Stochastic Gradient Descent (SGD), AdaDelta, and Nadam. A total of 1,008,422 data elements were used, 30% of which were reserved for network validation.
ReLU was chosen as the neuron activation function in the hidden layer, and sigmoid in the output layer [41].
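To make the structure concrete, the sketch below implements only the forward pass of such a network in plain Python. It is not the authors' TensorFlow/Keras model: the class name, the random placeholder weights, and the default of five hidden neurons are assumptions made for illustration, and no training is performed.

```python
import math
import random
from typing import Sequence


def relu(z: float) -> float:
    return max(0.0, z)


def sigmoid(z: float) -> float:
    # Branching on the sign avoids overflow in exp() for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class TinyStabilityNet:
    """10 inputs -> one hidden ReLU layer -> one sigmoid output (untrained)."""

    def __init__(self, n_inputs: int = 10, n_hidden: int = 5, seed: int = 0):
        rng = random.Random(seed)
        # Placeholder weights; a real model would learn these from the database.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                         for _ in range(n_hidden)]
        self.b_hidden = [0.0] * n_hidden
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b_out = 0.0

    def predict_proba(self, x: Sequence[float]) -> float:
        """Return a value in (0, 1), read here as the probability of stability."""
        if len(x) != len(self.w_hidden[0]):
            raise ValueError("unexpected number of input features")
        hidden = [relu(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w_hidden, self.b_hidden)]
        return sigmoid(sum(w * h for w, h in zip(self.w_out, hidden)) + self.b_out)
```

The sigmoid output is a single number between 0 and 1, so a threshold (0.5 is a common choice, assumed here rather than taken from the source) turns it into a stable or unstable decision.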
The learning rate was set to 0.1 for all algorithms except SGD, for which it was set to 0.3. At these learning rates, the errors in detecting system instability and the accuracy of the various training algorithms were estimated (Figure 4, Figure 5, Figure 6, Figure 7 and Figure 8).
The results show that the AdaDelta optimization algorithm is unsuitable because it produces unacceptably poor results, whereas the other algorithms show acceptable results (Figure 4, Figure 5, Figure 6 and Figure 7) and can be used to assess the state of stability.
Changing the number of neurons in the hidden layer does not lead to significant changes for the Adam, RMSProp, Nadam, or SGD algorithms. It has a significant impact on the AdaDelta algorithm (Figure 8), but because this algorithm provides results with low accuracy, it is unsuitable for use in this case.
Next, the network architecture was considered. The network was studied with the following settings:
Number of hidden layers: 1.
Number of neurons in hidden layer: 5, 10 and 20.
Activation function in hidden layer: ReLU.
Activation function in the output layer: Softmax.
The data classification algorithm is disabled.
After training, the accuracy of the network was only 50%, and on new input data that were not used in training, the accuracy decreased further to 48.8% (Table 4). To improve the results, the number of neurons in the hidden layer was increased first to 10 and then to 20, but this approach was not effective because almost the same results were obtained. The reason is that, in the example under consideration, the input data values were unevenly scaled: some values exceeded others by a factor of 100. A data standardization algorithm [42] was therefore used to select an efficient network architecture. Among the well-known algorithms for data standardization, the high-performance Z-score algorithm was selected [43]. Using this method:
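$$z_i = \frac{x_i - \mu_i}{\sigma_i}, \qquad i = 1, \dots, n,$$
where $x_i$ and $z_i$ are the original and standardized values of the $i$-th parameter, $n$ is the number of parameters to be standardized, $\mu_i$ is the average value of the $i$-th parameter, and $\sigma_i$ is the standard deviation of the $i$-th parameter.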
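A minimal sketch of Z-score standardization applied column by column is given below. It follows the usual definition, in which each value has the mean of its parameter subtracted and is then divided by that parameter's standard deviation. The function name is hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which variant was used.

```python
from statistics import fmean, pstdev
from typing import List, Sequence


def zscore_columns(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Standardize every column (parameter) to zero mean and unit deviation.

    Assumption: population standard deviation. A constant column has zero
    deviation and is mapped to 0.0 to avoid division by zero.
    """
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Two parameters whose raw scales differ by a factor of 100.
print(zscore_columns([[1.0, 100.0], [3.0, 300.0]]))
```

After standardization both columns span the same range, which removes the scale mismatch between inputs. When new data are classified, the means and deviations computed on the training data would normally be reused rather than recomputed.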
The performance of neural networks with different numbers of neurons and different activation functions in the hidden layer was studied using the Data Standardization algorithm (Table 4).
The results show that the number of neurons in the hidden layer does not have a significant impact on the accuracy of the network. The use of input data standardization algorithms has a significant impact on network performance.
Various algorithms have been successfully used for classification problems. Because determining the stability of the system is a classification task, we considered the Decision Tree, Linear Discriminant Analysis, Logistic Regression, Naïve Bayes, and Ensemble algorithms [44,45,46].
One of the algorithms successfully used in classification problems is the Decision Tree algorithm, which is based on rules and does not require any assumptions. Its advantage is its high degree of interpretability.
The Linear Discriminant Analysis algorithm is not only a dimensionality reduction tool but also an effective classification method. Despite its simplicity, this algorithm can produce acceptable and interpretable classification results and requires little time for training and testing.
Logistic Regression is a parametric method with high reliability. The problem in this case is that a certain set of assumptions is required for the algorithm to operate correctly.
The Naïve Bayes algorithm can perform a large number of probabilistic calculations in a short period. It can process a large amount of data.
The Ensemble algorithm is based on a combination of predictions obtained from a set of different models. Training is performed with a stochastic algorithm, so different weights can be found in each training run, which can lead to different predictions.
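One common way to combine the predictions of several models is majority voting, sketched below. The source does not state which combination rule was used, so the voting rule, the tie-breaking choice, and all names here are assumptions for illustration only.

```python
from collections import Counter
from typing import Callable, Sequence

# A classifier is modeled as any function mapping a feature vector to a label.
Classifier = Callable[[Sequence[float]], int]


def majority_vote(models: Sequence[Classifier], x: Sequence[float]) -> int:
    """Return the label predicted by most models (ties go to the smallest label)."""
    if not models:
        raise ValueError("at least one model is required")
    votes = Counter(model(x) for model in models)
    top = max(votes.values())
    return min(label for label, count in votes.items() if count == top)


# Three toy classifiers standing in for independently trained models.
models = [
    lambda x: 1,
    lambda x: 0,
    lambda x: int(x[0] > 0.0),
]
print(majority_vote(models, [1.0]))
print(majority_vote(models, [-1.0]))
```

Because each member may be trained to different weights, individual predictions can differ from run to run; voting over several members reduces the effect of any single one.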
The characteristic data obtained from the Decision Tree, Linear Discriminant Analysis, Logistic Regression, Naïve Bayes, and Ensemble algorithms are shown in Table 5.
The neural networks that showed the best results are compared in terms of accuracy in Figure 9 and in terms of training time in Figure 10.
The results show that the best performance was achieved by the neural network models with the structure shown in Figure 3. Among them, “Model 4”, “Model 5”, “Model 6” and “Model 8” achieved an accuracy of 98.1% or higher. Among the classification models, the highest accuracy, 96.2%, was provided by the “Logistic Regression” algorithm. At the same time, the training time of Model 4 (168 s) was much shorter than that of the other best-performing models. The analysis of the results shows the advantage of the neural network models over the classification models. In addition, to improve the control system of electric drives operating with a dynamic load, it is most appropriate to use a model with one hidden layer of five neurons and a ReLU activation function in this layer.