3. Methodology
In recent years, deep learning has achieved strong results in speech recognition, image recognition, and natural language processing. Deep learning can extract abstract high-level features from raw features without feature selection based on expert experience. Because of this learning ability, researchers have tried to apply deep learning to network security. Although these methods have achieved good results, they use only the official training set for both model training and testing, which limits how well the results carry over to attacks not seen during training. In addition, the official test set contains 17 more attack methods than the official training set. Therefore, this paper trains the model on the official training set and tests it on the official test set, which is conducive to improving the robustness of the model.
In this paper, an intrusion detection model based on a convolutional neural network (CNN-Focal) is proposed. The model applies threshold convolution and Softmax from the Convolutional Neural Network (CNN) to multi-class intrusion detection, and uses the Focal Loss function to handle imbalanced data sets, effectively improving the accuracy of intrusion detection.
3.1. CNN-Focal Intrusion Detection Model
1. Basic principles of convolutional neural networks
In the field of deep learning, the Convolutional Neural Network (CNN) is an efficient neural network model that has become a research hotspot in many fields. The basic structure of a convolutional neural network consists of an input layer, convolutional layers, pooling layers, fully connected layers, and an output layer, as shown in Figure 2. A model can contain one or more convolutional layers, pooling layers, and fully connected layers. The convolutional and pooling layers generally alternate: a convolutional layer is followed by a pooling layer, which is followed by another convolutional layer, and so on. The fully connected layer generally follows the last pooling layer.
Each neuron in the fully connected layer is connected to all the neurons in the previous layer, and the fully connected layer is located at the end of the CNN structure. The fully connected layer is usually followed by the output layer, which acts as the classification layer. The output layer classifies the features extracted by the convolutional neural network, and its output is the classification result.
Softmax regression is a generalization of the logistic regression model and is mainly used for multi-class classification. Softmax regression reduces to logistic regression when there are only two classes. In a multi-class problem, the class label y can take k different values, where k is an integer greater than 2. Given a data set {(x1, y1), (x2, y2), ..., (xn, yn)}, yi denotes the class label of sample xi, with yi ∈ {1, 2, ..., k}. For a given x, Softmax regression estimates the probability that x belongs to each of the k classes.
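As an illustration of the Softmax estimate described above, the probability assigned to class j can be written as P(y = j | x) = exp(z_j) / (exp(z_1) + ... + exp(z_k)), where z_j is the score of class j. The following minimal Python sketch computes these probabilities; the function names and the example scores are illustrative assumptions, not values from this paper.

```python
import math


def softmax(scores):
    """Map k real-valued class scores to k probabilities that sum to 1."""
    # Subtracting the maximum score avoids overflow in exp() and does not
    # change the result, because the shift cancels in the ratio.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    """Logistic function used by two-class logistic regression."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical scores of one record for k = 5 classes.
probs = softmax([2.0, 1.0, 0.1, -1.0, 0.5])
predicted_class = probs.index(max(probs))  # index of the most probable class
```

With two classes and scores (z, 0), the first probability returned by `softmax` equals `sigmoid(z)`, which is the reduction to logistic regression mentioned above.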
The loss function is used to evaluate the difference between the model's predicted value f(x) and the actual value Y, and is usually written as L(Y, f(x)). The loss function reflects the robustness of the model; that is, the smaller the loss function value, the better the robustness of the model.
2. Network structure
Intrusion detection is a classification problem: a model is trained by supervised learning and then used to predict the class of unknown data. The input to a convolutional neural network is usually two-dimensional, whereas an intrusion record is one-dimensional. Therefore, this paper uses one-dimensional convolution to process the intrusion record data. The CNN-Focal model was designed in light of the label imbalance in NSL-KDD and the actual classification performance of the model, and its structure is shown in Figure 3. The CNN-Focal model has ten layers: one input layer, three convolutional layers, three Dropout layers, one Max-Pooling layer, one fully connected layer, and one Softmax layer.
3. The model is described as follows:
1) Input layer: The first layer is the input layer. Intrusion records are one-dimensional data. After data standardization and one-hot encoding, a single intrusion record is converted from 1 × 41 to 1 × 122.
2) Convolution layer: Layers 2, 4, and 6 are convolution layers. These layers use threshold convolution, in which the convolution is split into two parts: part A is the plain linear convolution output, and part B is the convolution output passed through an activation function. A and B are multiplied element-wise to give the output of the layer. Many studies have shown that smaller convolution kernels can obtain better local features and classification performance. Therefore, CNN-Focal adopts a small-kernel strategy: the kernel sizes are 1 × 3, 1 × 2, and 1 × 1, and the numbers of kernels are 16, 32, and 64, respectively. In addition, small convolution kernels can cluster the learned features, which alleviates the influence of convolution redundancy on model performance to a certain extent.
3) Dropout layer: A convolutional neural network is prone to overfitting during training, which significantly affects the actual performance of the model. To mitigate this problem, Layers 3, 5, and 7 of the CNN-Focal model are Dropout layers. Their dropout rates are set to 0.6, 0.5, and 0.4, respectively, based on the actual classification performance of the CNN-Focal model.
4) Max-Pooling layer: The pooling layer reduces the amount of computation. The 8th layer of the CNN-Focal model is the Max-Pooling layer, with a stride of 2; that is, the length of the feature map, and with it the number of parameters in the following fully connected layer, is reduced to about half of the original.
5) Fully connected layer: The number of fully connected layer neurons in CNN-Focal is 200.
6) Softmax layer: In deep learning, Softmax regression is often used as a standard classifier for multi-class or binary classification problems. CNN-Focal uses Softmax regression as a multi-class classifier.
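The convolution and pooling steps described above can be sketched in plain Python as follows. This is a minimal single-channel illustration, not the implementation used in this paper: it assumes that the activation applied to part B is a sigmoid (as in gated convolution), that no padding is used, that the pooling window is 2, and that the kernel weights are random placeholders. The actual model uses 16, 32, and 64 kernels per layer and also includes Dropout, the fully connected layer, and Softmax, which are omitted here.

```python
import math
import random


def conv1d(x, kernel):
    """1-D convolution without padding (cross-correlation form)."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k))
            for i in range(len(x) - k + 1)]


def gated_conv1d(x, kernel_a, kernel_b):
    """Multiply the linear part A by the activated part B, element-wise."""
    a = conv1d(x, kernel_a)  # part A: plain linear convolution
    # Part B: convolution followed by a sigmoid (assumed activation).
    b = [1.0 / (1.0 + math.exp(-v)) for v in conv1d(x, kernel_b)]
    return [ai * bi for ai, bi in zip(a, b)]


def max_pool1d(x, size=2, stride=2):
    """Keep the maximum of each window; size=2 is an assumption."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, stride)]


random.seed(0)
# Placeholder for one preprocessed 1 x 122 intrusion record.
record = [random.random() for _ in range(122)]

lengths = [len(record)]
h = record
for k in (3, 2, 1):  # kernel sizes 1 x 3, 1 x 2, 1 x 1 from the description
    kernel_a = [random.uniform(-1, 1) for _ in range(k)]
    kernel_b = [random.uniform(-1, 1) for _ in range(k)]
    h = gated_conv1d(h, kernel_a, kernel_b)
    lengths.append(len(h))
h = max_pool1d(h)
lengths.append(len(h))
```

Under these assumptions the record length evolves as 122, 120, 119, 119, and 59 through the three convolutions and the pooling layer.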
3.2. Experimental Design
In this paper, the Focal Loss function is applied to the model. To verify the effectiveness of Focal Loss, this paper compares it experimentally with the cross-entropy loss function commonly used in deep learning: the loss function of the CNN-Focal model is replaced with the cross-entropy loss, and the resulting model is referred to as CNN-Cross. In addition, to demonstrate the classification performance of the CNN-Focal model, this paper compares CNN-Focal with SVM, Random Forest, Decision Tree, and GaussianNB.
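To make the comparison concrete, the sketch below implements the two losses for a single sample. Focal Loss, in its usual form FL(p_t) = -α (1 - p_t)^γ log(p_t), scales the cross-entropy term by (1 - p_t)^γ, where p_t is the predicted probability of the true class. The values γ = 2 and α = 1 and the example probability vectors are assumptions for illustration; this paper does not state the values it used.

```python
import math


def cross_entropy(probs, target):
    """Cross-entropy loss of one sample: -log of the true-class probability."""
    return -math.log(probs[target])


def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal Loss of one sample; gamma and alpha defaults are assumptions."""
    p_t = probs[target]
    # (1 - p_t) ** gamma down-weights samples that are already well classified.
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical Softmax outputs over five classes.
easy = [0.90, 0.04, 0.03, 0.02, 0.01]  # true class 0, predicted confidently
hard = [0.60, 0.10, 0.10, 0.10, 0.10]  # true class 4, mostly missed

ce_easy, fl_easy = cross_entropy(easy, 0), focal_loss(easy, 0)
ce_hard, fl_hard = cross_entropy(hard, 4), focal_loss(hard, 4)
```

With γ = 2, the loss of the well-classified sample (p_t = 0.9) is scaled by about 0.01, while that of the misclassified sample (p_t = 0.1) is scaled by about 0.81, so hard samples, which typically come from minority classes, contribute relatively more to training. Setting γ = 0 recovers the cross-entropy loss.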
In the field of intrusion detection, the NSL-KDD data set is widely used. In this paper, the preprocessed NSL-KDD data set is used to train and test CNN-Focal and the comparison models. In addition, this paper uses accuracy, precision, recall, and F1 score as indicators to evaluate the models.
1) Data set introduction. In intrusion detection research, the KDD CUP 99 data set is the most widely used. Its training set contains about 5 million records and its test set about 300,000 records. A data set of this size places high demands on the experimental hardware. In addition, various statistical analyses show that KDD CUP 99 contains a large number of redundant records, which cause models to overfit, consume more computing resources during training, and slow model convergence.
Table 1. Data set.
Dataset Name | Training Set Size | Test Set Size | Number of Features | Label Categories
NSL-KDD | 125973 | 22543 | 41 | Normal, Probe, DoS, U2R, R2L
The NSL-KDD data set addresses these problems of the KDD CUP 99 data set, and many research results are based on it. NSL-KDD contains 41 feature columns and 1 label column. The labels fall into five categories: Normal, Probe, DoS, U2R, and R2L. The size of this data set does not place high demands on the experimental hardware, and experiments can be carried out on ordinary machines. In addition, the training set and the test set contain different attack methods, so a model trained using this data set has a better detection effect for new attacks.
3.3. Data Preprocessing
Data generally come in two types, nominal and numerical, and the NSL-KDD data set contains both. The protocol_type, service, flag, and label columns are nominal, and the rest are numerical. Normalization scales data to a specific interval, so that values spread over a large range fall into a small one. Normalized data can shorten the convergence time of the model and improve its accuracy. In this paper, the numerical features are scaled to the interval [0, 1]. One-hot encoding can handle features with non-continuous values and also makes the distances between feature values more reasonable. In this paper, the OneHotEncoder in the sklearn (scikit-learn) package is used to one-hot encode the four nominal columns: protocol_type, service, flag, and label.
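The two preprocessing steps can be sketched as follows. To stay self-contained, this sketch uses plain Python rather than the scikit-learn OneHotEncoder used in this paper, and the three records and their values are hypothetical. With the category counts commonly reported for NSL-KDD (3 protocol types, 70 services, and 11 flags), one-hot encoding the three nominal features turns the 41 features into 38 + 3 + 70 + 11 = 122 columns, which is consistent with the 1 × 122 input of the model.

```python
def min_max_scale(column):
    """Scale a list of numbers to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0] * len(column)
    return [(v - lo) / (hi - lo) for v in column]


def one_hot(column):
    """One-hot encode nominal values; categories are sorted for a stable order."""
    categories = sorted(set(column))
    position = {c: i for i, c in enumerate(categories)}
    encoded = []
    for value in column:
        row = [0] * len(categories)
        row[position[value]] = 1
        encoded.append(row)
    return encoded, categories


# Hypothetical mini data set: two numerical features and one nominal feature.
duration = [0, 2, 10]
src_bytes = [491, 146, 0]
protocol_type = ["tcp", "udp", "icmp"]

scaled_duration = min_max_scale(duration)
scaled_bytes = min_max_scale(src_bytes)
encoded_protocol, protocol_categories = one_hot(protocol_type)

# Concatenate numerical and one-hot columns into one feature vector per record.
features = [
    [d, b] + p
    for d, b, p in zip(scaled_duration, scaled_bytes, encoded_protocol)
]
```

Each record grows from 3 raw values to 5 columns here, for the same reason that the full records grow from 41 to 122.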
Evaluation index. Evaluating the model requires not only a practical and feasible experimental scheme but also evaluation indices that measure its generalization ability, that is, performance measures. In imbalanced classification tasks, the most commonly used performance measures are Accuracy, Precision, Recall, and F1 score.
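For reference, the four measures can be computed from true and predicted labels as in the sketch below. Precision, recall, and F1 are computed per class, with one class treated as positive and the rest as negative; how this paper averages them across the five classes is not stated, so no averaging is shown. The label sequences are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of records whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def class_metrics(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical true and predicted labels for eight records.
y_true = ["Normal", "Normal", "DoS", "DoS", "DoS", "Probe", "R2L", "U2R"]
y_pred = ["Normal", "DoS", "DoS", "DoS", "Normal", "Probe", "Normal", "U2R"]

overall_accuracy = accuracy(y_true, y_pred)
dos_precision, dos_recall, dos_f1 = class_metrics(y_true, y_pred, "DoS")
r2l_precision, r2l_recall, r2l_f1 = class_metrics(y_true, y_pred, "R2L")
```

On this toy example the overall accuracy is 0.625, yet the R2L class has zero recall, which illustrates why accuracy alone is not sufficient on imbalanced data.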
3.4. Experimental Results
Model | Dataset Used | Training Method | Evaluation Metrics | Results Summary
CNN-Focal | NSL-KDD | Train-test split (70%-30%) | Accuracy, Precision, Recall, F1-score | Achieved high accuracy and balanced performance across all metrics. The model effectively addressed class imbalance using Focal Loss.
CNN-Cross | NSL-KDD | Train-test split (70%-30%) | Accuracy, Precision, Recall, F1-score | Compared performance with CNN-Focal using Cross Entropy Loss, showing differences in effectiveness in handling class imbalance.
SVM | NSL-KDD | Train-test split (70%-30%) | Accuracy, Precision, Recall, F1-score | Provided benchmark for traditional machine learning approach in intrusion detection, showing competitive results.
RandomForest | NSL-KDD | Train-test split (70%-30%) | Accuracy, Precision, Recall, F1-score | Demonstrated ensemble learning’s effectiveness in handling complex feature relationships.
DecisionTree | NSL-KDD | Train-test split (70%-30%) | Accuracy, Precision, Recall, F1-score | Showed basic decision-making capability with moderate performance metrics.
The experiments on the NSL-KDD dataset highlighted the effectiveness of the CNN-Focal model in improving intrusion detection accuracy through advanced deep learning techniques. The key conclusions are as follows:
1. Model Performance: CNN-Focal outperformed traditional methods like SVM, Random Forest, and Decision Tree in terms of accuracy, precision, recall, and F1-score. It showcased robustness in handling the dataset’s class imbalance using Focal Loss.
2. Comparison with CNN-Cross: Comparing CNN-Focal with CNN-Cross (using Cross Entropy Loss) showed that Focal Loss significantly enhanced the model’s ability to classify minority classes, which is crucial for real-world intrusion detection scenarios.
3. Dataset Suitability: NSL-KDD dataset’s partition into training and testing sets facilitated robust evaluation of model performance across different attack types, enhancing model generalization.
4. Future Directions: Future research could explore hybrid models combining CNN architectures with traditional machine learning algorithms for enhanced accuracy and efficiency in intrusion detection.
Taken together, the experimental findings indicate that CNN-Focal is better suited to intrusion detection tasks than the traditional methods it was compared with.
4. Conclusion
With the rapid development of information technology, the issue of network security has increasingly become a focal point of global attention. This paper examines the current landscape and challenges of network security, explores the application of deep learning and artificial intelligence in defense strategies, and proposes future-oriented approaches. The continual evolution of cyber threats presents new challenges to traditional cybersecurity defenses. Signature-based methods are inadequate against novel and unidentified network attacks, necessitating the development of advanced defense technologies. Deep learning, as an advanced machine learning technology, has made significant strides in fields like image recognition and natural language processing. Its application in network security enhances detection and defense capabilities against threats such as malware and phishing attacks.
Furthermore, artificial intelligence offers automated network security defense capabilities, enabling real-time monitoring, analysis of network traffic, and prompt responses to security incidents. AI also aids security experts in gathering and analyzing threat intelligence, thereby improving the efficiency and accuracy of network security defense. This paper proposes an integrated model of deep learning and artificial intelligence for network security defense, forecasting future trends in defense strategies. Future network security defenses will be more intelligent, automated, and adaptable to evolving network environments. As technology continues to advance, new defense technologies will emerge, bolstering network security.
In conclusion, deep learning and artificial intelligence hold immense potential for enhancing network security defenses. Continued optimization and advancement of these technologies will help establish a more secure and reliable digital environment, supporting the growth of the digital economy. Addressing cyber security challenges requires collaborative efforts from governments, enterprises, academia, and other stakeholders to safeguard cyberspace’s security and stability.