1. Introduction
The fault diagnosis of pumping units in the process of petroleum production has long been a critical research topic. Because of the complex downhole environment, many unknown factors act on the sucker rod during its reciprocating movement, which can easily cause failures of the pumping unit and create safety hazards. Load (P) and displacement (S) are the parameters generated as the donkey head of the pumping unit moves up and down, and the closed curve they form is the indicator diagram. It reflects the influence of gas, oil, water, sand, wax and other factors on the pumping unit in real time [1]. If the pump stays in a fault state for a long time, its wear is aggravated and the service life of the equipment is further shortened.
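To make the data format concrete, the following minimal sketch (synthetic waveforms only, not real well data; all numbers are hypothetical) shows how sampling load P and displacement S over one stroke and plotting them against each other yields the closed curve of an indicator diagram:

```python
# Synthetic illustration of an indicator diagram: displacement S and load P
# sampled over one pumping cycle form a closed P-S curve.
import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0, 2 * np.pi, 200)            # one stroke of the donkey head
S = 2.5 * (1 - np.cos(t)) / 2                 # displacement (m), synthetic
P = 40 + 15 * np.sin(t) + 5 * np.sin(2 * t)   # suspension-point load (kN), synthetic

plt.plot(S, P)
plt.xlabel("Displacement S (m)")
plt.ylabel("Load P (kN)")
plt.title("Synthetic indicator diagram (closed P-S curve)")
plt.show()
```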
The traditional method of fault diagnosis for pumping units is to measure how the load at the suspension point changes with displacement, draw the suspension-point indicator diagram, and then diagnose the working condition of the pumping unit according to the shape of that diagram. The disadvantages of this approach are as follows. First, faults are judged mainly by manual identification of the indicator diagram, which is strongly affected by human factors and yields low recognition accuracy. Second, because pumping wells are numerous and widely distributed, manual inspection of wells is time-consuming and laborious. Moreover, owing to the complexity and high cost of pumping units, there is little tolerance for performance degradation, productivity loss, and safety hazards, so it is necessary to detect and identify all potential faults rapidly [2,3]. It is therefore imperative to replace manual fault diagnosis of the pumping unit with computer-based methods.
With the continuous progress of deep learning, using deep learning for fault diagnosis has become a new trend. For example, Convolutional Neural Networks [4], Generative Adversarial Networks [5], and Long Short-Term Memory networks [6] have shown superior performance in fault diagnosis.
At present, deep learning techniques for fault diagnosis of pumping units classify the indicator diagrams of different kinds of faults. Automatic feature extraction from raw data is an outstanding advantage of deep learning, and it does not depend on the diagnostic knowledge of specialists [7].
In 2018, Y. Duan [8] proposed an improved AlexNet model to realize the automatic recognition of indicator diagrams and compared it with commonly used neural network models. In 2019, J. Sang [9] proposed a PSO-BP neural network algorithm: to address the slow convergence and unstable results of the traditional BP algorithm, they designed adjustment rules for the inertia weight and learning factor of the PSO algorithm and adjusted the weight coefficients of the output and hidden layers of the BP network. In 2020, L. Zhang [10] used Freeman chain codes and differential codes to extract features from the dynamometer card data of a pumping unit group, then proposed a diagnosis model based on a BP neural network that automatically identifies the fault type of the pump group from the dynamometer card. In 2022, H. Hu [11] proposed a model based on the ResNet-34 residual network to identify indicator diagrams; it adds residual blocks to a conventional convolutional neural network to establish direct connections between the input of an upper layer and the output of a lower layer, and achieved recognition and classification of six types of indicator diagrams through parameter tuning. In the same year, T. Bai [12] proposed a fault diagnosis method based on a time-series transformation generative adversarial network (TSC-DCGAN).
Because of the complexity of pump working conditions, the indicator diagrams take different shapes in different working states. Indicator diagrams of different fault types can be similar to a certain degree, which produces indistinguishable samples. This leads to poor generalization of deep learning models and difficulty in separating such samples. The role of an activation function is to apply a nonlinear transformation to the data and thus overcome the limited expressive and classification ability of a linear model. If the network consisted only of linear transformations, the multi-layer network could be collapsed into a single-layer network through matrix multiplication. The presence of activation functions is therefore what allows a deep learning model to benefit from additional layers. For this reason, we propose a new activation function to improve the generalization performance of deep learning models, so that the faults of the pumping unit can be distinguished in a high-dimensional space.
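The collapse of stacked linear layers mentioned above can be illustrated with a short numpy sketch (the shapes and values here are arbitrary):

```python
# Two stacked linear transformations with no activation in between are
# equivalent to a single linear layer, so depth alone adds no expressive power.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))       # a batch of 4 inputs with 8 features
W1 = rng.normal(size=(8, 16))     # weights of the first linear layer
W2 = rng.normal(size=(16, 5))     # weights of the second linear layer

two_layers = (x @ W1) @ W2        # "deep" network without activation functions
one_layer = x @ (W1 @ W2)         # the same mapping expressed as one matrix

print(np.allclose(two_layers, one_layer))   # True: the layers collapse into one
```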
The rectified linear unit (ReLU) [13], which has low computational complexity and fast convergence, alleviates the problems of vanishing gradients and gradient saturation. In recent years, many improved versions of ReLU have appeared. To address the dead ReLU phenomenon, the negative part of ReLU is replaced by a non-zero slope, giving Leaky ReLU [14]. Leaky ReLU is therefore more inclined to activate in the negative region.
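For reference, the standard definitions of these two activations can be written as follows (a numpy sketch; alpha is the fixed negative-slope hyperparameter of Leaky ReLU):

```python
import numpy as np

def relu(x):
    # zero for negative inputs, identity for positive inputs
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # identical to ReLU for positive inputs, but keeps a small non-zero slope
    # on the negative side so that "dead" units still pass a gradient
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x))         # [0.  0.  0.  1.5]
print(leaky_relu(x))   # [-0.02  -0.005  0.     1.5  ]
```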
In deep learning, the choice of activation function is generally made case by case, and there is no fixed rule. Because adaptive activation functions can adjust themselves automatically to the network structure and the problem at hand, they have been widely developed. The Parametric Rectified Linear Unit (PReLU) [15] is also used to address the dead ReLU phenomenon: the slope of the negative part is learned from the data rather than set to a fixed value. PReLU therefore retains all the theoretical advantages of ReLU and is more flexible than Leaky ReLU. In 2017, the Swish activation function was proposed; it is bounded below, unbounded above, and non-monotonic, and it is very smooth with a continuous first derivative [16], outperforming ReLU in many respects. In 2021, H. Hu [17] proposed a scheme that explores optimal activation functions with greater flexibility and adaptability by adding only a few parameters to traditional activation functions such as Sigmoid, Tanh and ReLU; introducing these parameters into a fixed activation function helps avoid local minima. In the same year, M. Zhao [18] used the specially designed subnetwork of ResNet-APReLU as an embedded module to adaptively generate the multiplicative coefficients of the nonlinear transformation.
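For reference, PReLU and Swish can be sketched as follows (illustrative numpy code; in an actual network the PReLU slope a is a trainable parameter, typically one per channel, and the Swish coefficient beta is either fixed to 1 or learned, depending on the variant):

```python
import numpy as np

def prelu(x, a):
    # PReLU: identity for positive inputs, learnable slope `a` for negative inputs
    return np.where(x > 0, x, a * x)

def swish(x, beta=1.0):
    # Swish: x * sigmoid(beta * x) -- smooth, non-monotonic, bounded below
    return x / (1.0 + np.exp(-beta * x))

x = np.array([-3.0, -1.0, 0.0, 2.0])
print(prelu(x, a=0.25))   # [-0.75 -0.25  0.    2.  ]
print(swish(x))           # approx. [-0.142 -0.269  0.     1.762]
```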
Based on the above discussion, an adaptive activation function combined with the gated channel transformation (GCT) module [19] is designed in this paper. The main contributions are as follows:
(1) We propose an improved adaptive activation function. Each layer of the network generates a different activation function, which improves the generalization performance of deep learning models and adapts well to different deep learning architectures.
(2) We apply the proposed activation function to the fault diagnosis of pumping units so as to better extract features from the contours of the indicator diagram. The proposed activation function improves the accuracy of fault diagnosis and has better search ability, which is verified by comparison with AlexNet [20], VGG-16 [21], GoogLeNet [22], ResNet [23] and DenseNet [24].
(3) The proposed activation function is further evaluated on the public dataset CIFAR-10, which shows that it is both suitable and general.
The rest of this paper is organized as follows. In Section 2, we introduce the pumping unit dataset. In Section 3, we review common adaptive activation functions and describe the composition of our adaptive activation function. In Section 4, the experimental analysis and discussion on the pumping unit fault dataset and the public dataset are presented. In Section 5, we conclude the paper.
5. Conclusions
In this paper, a new adaptive activation function is designed and applied to five neural network models. Specifically, the adaptive activation function improves the slope of the negative part of the ReLU activation function by combining it with the gated channel transformation (GCT) module, thereby enhancing the performance of the deep learning model. The activation function in each layer of the network is unique, so the input signal of each layer undergoes its own nonlinear transformation. Compared with traditional fixed activation functions, our activation function therefore has stronger nonlinear transformation ability and can be readily embedded in all five models. On the pumping unit fault diagnosis dataset, for example, our activation function is shown to capture the mapping between displacement and load in the indicator diagram effectively, extract the features of the indicator diagram, alleviate the sparsity problem of the indicator diagrams, and correctly classify previously indistinguishable samples. The CIFAR-10 dataset further verifies the superiority and generality of our adaptive activation function.
In short, the proposed adaptive activation function increases the accuracy of fault diagnosis and provides better generalization performance and search ability. Moreover, it can also be readily embedded in other neural network models.