1. Introduction
Anomaly detection is the detection of patterns that deviate from expected normal behavior [1]. It is essential when such abnormalities in a dataset can provide useful information about the system [2]. An anomaly may arise from malicious activities, instrumentation errors, human errors, etc. Anomaly detection is an emerging research area with applications in fields such as fraud detection in banking and financial transactions, fault finding in manufacturing, and intrusion detection in computer networks. With the advancement and extensive use of computers and networks, organizations are becoming increasingly vulnerable to malicious activities. Although existing defense mechanisms provide protection to a reasonable extent, attackers are becoming more sophisticated in intruding across networks. In the case of an internal attack, it can be particularly useful to identify the anomalies.
Intrusion Detection Systems (IDS) are security tools for protecting systems or networks from illegitimate actions that can jeopardize their integrity, privacy, or accessibility. In general, there are two categories of IDS, viz. anomaly detection-based and signature recognition-based schemes. The former is used to discover misuse of networks and computers, or intrusions, by keeping track of the systems and classifying their activities as normal or anomalous. The resulting system is called an anomaly-based intrusion detection system [3,4]. Anomaly-based intrusion detection can be effectively applied as a risk mitigation tool for computers and their associated networks.
Many anomaly detection techniques have been proposed over the past few decades [5,6,7,8,9,10]. Classification-based techniques are one such family. Classification [11] is a data processing tool that assigns objects to pre-defined classes and has been applied in areas such as anomaly detection, fraud identification, pattern recognition, and prediction. In [12], the authors proposed a neighborhood rough set-based classification algorithm for detecting anomalies in mixed-attribute datasets. In [13], the authors proposed decision tree-based anomaly detection in the results of computer assessment to improve the quality of educational management. In [14], the authors presented a Bayesian network-based anomaly detection algorithm. In [15], the authors designed a single deep RBF network for predicting control actions and detecting adversarial attacks in cyber-physical systems. In [16], the authors proposed a rough set attribute reduction approach for anomaly detection. In [17], the authors proposed an intuitionistic fuzzy set approach for anomaly detection in network traffic.
Problems similar to those handled by classification have also been addressed using clustering approaches [18,19,20,21,22]. In [23], the authors proposed a complex method for detecting anomalies in real-time data using recurrence and fractal analysis. In [24], the authors conducted a comparative analysis of five time-series anomaly detection models. In [25], an ensemble learning model is applied to analyze and forecast anomalies in enormous system logs. In [26], the authors suggested a strategy for anomaly detection that permits the use of state-of-the-art feature selection techniques on idea representation meta-features. A novel framework for real-time anomaly detection based on data technologies, which uses a streaming sliding window factor coreset clustering algorithm, is proposed in [27]. In [28], the authors proposed a mixed clustering algorithm for anomaly detection in real-time data.
Most of the aforesaid methods address only the accuracy of anomaly detection, and only a few address the False Positive Rate. Since an increase in the False Positive Rate lowers the detection rate, and hence the efficacy, of any classifier, the False Positive Rate needs to be minimized. Moreover, the normal and anomalous behaviors of a system are difficult to predict because no precise boundary separates one from the other. In such cases, fuzzy set theory, rough set theory, or a combination of both can be used effectively.
L. A. Zadeh [29] introduced fuzziness into mathematics by formally defining the fuzzy set as a generalization of the ordinary set. Atanassov [30] defined intuitionistic fuzzy sets as a generalization of fuzzy sets in terms of membership and non-membership functions. In [31], the authors proposed a formula for the correlation coefficient of intuitionistic fuzzy sets, whose value lies between 0 and 1. Fuzzy relations, the α-cut of a fuzzy relation, and fuzzy equivalence relations were introduced in [32,33].
Pawlak [34] introduced rough set theory to address the uncertainty and vagueness that exist in datasets. In [35], rough set-based classification, which uses the properties of an equivalence relation, is applied effectively to discrete datasets. In [36], the authors proposed an efficient algorithm based on fuzzy neighborhood rough sets for detecting anomalies in large datasets. In [37], the authors put forward an NN classification algorithm that uses fuzzy-rough lower and upper approximations to classify test objects or to predict their decision values.
Thivagar et al. [38] proposed the definition of a nano topological space with respect to a subset X of a universe U in terms of the lower and upper approximations of X. In [39], the authors not only introduced a new structure of nano topology but also applied it in medical diagnosis. In [40], the authors introduced three new topologies, namely the covering-based rough fuzzy nano topology, the covering-based rough intuitionistic fuzzy nano topology, and the covering-based rough neutrosophic nano topology. Most classification-based anomaly detection algorithms developed to date use various well-known measures to differentiate between classes, and very few works have been reported that use statistical measures such as the correlation coefficient. Secondly, most fuzzy-rough approaches consider fuzziness in Zadeh’s sense [29]. However, if the approach is extended to intuitionistic fuzzy sets, the detected anomalies can provide more information about the system.
In this article, a hybrid approach combining intuitionistic fuzzy sets and rough sets is used in a classification algorithm for anomaly detection in network datasets. The objectives of the paper are as follows:
First, an α-relation (for a pre-assigned value of α) and an equivalence relation [41,42,43] are generated using the correlation coefficient of intuitionistic fuzzy sets.
Secondly, with the help of the α-relation on the conditional attributes and the equivalence relation on the decision attributes, the intuitionistic fuzzy nano lower approximation space and the intuitionistic fuzzy nano upper approximation space, along with the boundary regions, are found.
Thirdly, the certain and possible fuzzy rules are generated from the two approximations.
Furthermore, the proposed algorithm (IFRSCAD) is implemented in Matlab and evaluated on two well-known datasets, the KDDCUP’99 network anomaly detection dataset [44] and the Kitsune network attack dataset [45]. The classification results are compared with those of other classification-based methods, namely Cuijuan et al. [16], Wang et al. [17], Deep-RBF Network [15], Bayes Network [14], and Decision Tree [13]. The proposed algorithm is found to be more efficient than the others in terms of True Positive Rate and False Positive Rate.
The article is organized as follows. Current developments in this field are described in Section 2. The problem definition is given in Section 3. The algorithm and the flowchart explaining the system are given in Section 4. The complexity analysis is given in Section 5. The experimental analysis and results are given in Section 6, and finally, the conclusions, limitations, and lines for future work are given in Section 7.
2. Related Works
Anomaly detection is the finding of patterns that deviate from previously observed ones [1]. It can be useful for obtaining information about the system [2]. Anomaly detection is a vital area of modern research that is attracting the attention of researchers. Several anomaly detection systems have been developed to date [3,4]. Classification-based anomaly detection is one of the many approaches. Using a classification-based labeling technique, Abdullah et al. [5] presented a method of anomaly detection in cellular networks. In [6], the authors used a negative selection algorithm for detecting anomalies in multi-dimensional data. Taha et al. [7] reviewed the different anomaly detection methods for categorical data.
Mazarbhuiya et al. [12] proposed a neighborhood rough set-based classification algorithm for detecting anomalies in mixed-attribute datasets. Decision tree-based anomaly detection was proposed in [13] for computer assessment and to improve the quality of educational management. A Bayesian network-based algorithm for anomaly detection and for offering correction hints was presented in [14]. In [15], the authors designed a single deep RBF network for predicting control actions and detecting adversarial attacks in cyber-physical systems. In [16], the authors proposed a rough set attribute reduction approach for anomaly detection. Wang et al. [17] designed an efficient intuitionistic fuzzy set-based approach for anomaly detection in network traffic. Maroune et al. [36] proposed an anomaly detection method based on a highly scalable approach to computing the nearest neighbors of an object using rough set theory.
Anomaly detection using clustering approaches has also been studied by many researchers. In [18], the authors proposed an agglomerative hierarchical algorithm for anomaly detection in network datasets. An anomaly detection method based on the fuzzy c-means clustering algorithm was proposed in [19]. In [20], a mixed algorithm combining features of both the k-means and hierarchical algorithms was presented. Retting et al. [21] proposed an algorithm for online anomaly detection in big data streams. Similar works were reported in [22]. A real-time anomaly detection algorithm using fractal and recurrence analysis was presented in [23].
Kim et al. [24] conducted a comparative analysis of five time-series anomaly detection models. In [25], the authors applied an ensemble learning model to analyze and forecast anomalies in enormous system logs. Halstead et al. [26] devised a strategy for anomaly detection that permits the use of the latest feature selection techniques on idea representation meta-features. Habeeb et al. [27] presented a new framework for real-time anomaly detection based on data technologies, which uses a streaming sliding window factor coreset clustering algorithm. Mazarbhuiya et al. [28] proposed a mixed clustering algorithm for anomaly detection in real-time data.
Fuzzy sets were formally introduced by Zadeh [29] to deal with the uncertainty and vagueness occurring in datasets. Generalizing the concept of the fuzzy set, Atanassov [30] defined intuitionistic fuzzy sets in terms of membership and non-membership functions. Gerstenkorn et al. [31] proposed the definition of the correlation coefficient of intuitionistic fuzzy sets. In [32], the authors introduced the details of fuzzy similarity relations. In [33], the concepts of the α-cut of a fuzzy relation and of fuzzy equivalence relations were introduced in detail.
Rough set theory was introduced by Pawlak [34] to address the uncertainty and vagueness that exist in datasets. Nowicki et al. [35] proposed a rough set-based classification method for discrete datasets, which uses the properties of an equivalence relation. Yuan et al. [37] put forward an NN classification algorithm that uses fuzzy-rough lower and upper approximations to classify test objects or to predict their decision values.
Thivagar et al. [38,39] not only proposed the structure of a nano topological space in terms of lower and upper approximations but also applied it in medical diagnosis. Shumrani et al. [40] first introduced the concepts of the covering-based rough fuzzy nano topology, the covering-based rough intuitionistic fuzzy nano topology, and the covering-based rough neutrosophic nano topology. In [41], the authors introduced the concept of fuzzy-rough set theory. Maji et al. [42] applied fuzzy-rough sets to the selection of relevant genes from microarray data. Chimphlee et al. [34] proposed an anomaly-based IDS that uses a fuzzy-rough clustering method. In [28], the authors conducted experimental studies with two well-known datasets, the KDD Cup’99 [44] network anomaly detection dataset and the Kitsune [45] network attack dataset.
4. Proposed Algorithm
To generate classification rules, we choose a suitable value α of the correlation coefficient to define the α-relation. The correlation coefficient used to define the relation is given in Section 3. The procedure for finding classification rules is as follows. We have a collection of m data instances, each described by n intuitionistic fuzzy attributes, represented as an intuitionistic fuzzy matrix in which each entry is $\langle x_{ij}, y_{ij} \rangle$ with $x_{ij} \in [0, 1]$, $y_{ij} \in [0, 1]$ [46], and $0 \le x_{ij} + y_{ij} \le 1$ for $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$. Usually, the dataset can be expressed as an information system $(U, C \cup D)$, where C and D are the conditional and decision attributes, respectively, expressed as intuitionistic fuzzy sets. The method is described below.
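As an illustration, the intuitionistic fuzzy matrix and the correlation between two data instances can be sketched in Python as follows. The formula of Section 3 is not restated here, so the sketch assumes the form associated with Gerstenkorn et al. [31], K(A, B) = C(A, B) / sqrt(T(A) T(B)), with C(A, B) the sum of products of membership degrees plus products of non-membership degrees and T(A) = C(A, A). The function names `validate_if_matrix` and `if_correlation` and the toy values are hypothetical and are not taken from the Matlab implementation.

```python
import numpy as np


def validate_if_matrix(mu, nu):
    """Return (mu, nu) as float arrays after checking the constraints
    x_ij in [0, 1], y_ij in [0, 1] and x_ij + y_ij <= 1."""
    mu = np.asarray(mu, dtype=float)
    nu = np.asarray(nu, dtype=float)
    if mu.shape != nu.shape:
        raise ValueError("membership and non-membership matrices differ in shape")
    if (mu < 0).any() or (mu > 1).any() or (nu < 0).any() or (nu > 1).any():
        raise ValueError("all degrees must lie in [0, 1]")
    if (mu + nu > 1 + 1e-12).any():
        raise ValueError("x_ij + y_ij must not exceed 1")
    return mu, nu


def if_correlation(mu_a, nu_a, mu_b, nu_b):
    """Correlation coefficient of two intuitionistic fuzzy sets.
    ASSUMED form: C(A, B) / sqrt(T(A) * T(B))."""
    c_ab = np.sum(mu_a * mu_b + nu_a * nu_b)
    t_a = np.sum(mu_a ** 2 + nu_a ** 2)
    t_b = np.sum(mu_b ** 2 + nu_b ** 2)
    if t_a == 0 or t_b == 0:
        return 0.0  # convention chosen for this sketch only
    return float(c_ab / np.sqrt(t_a * t_b))


# Toy data (hypothetical): m = 3 instances, n = 2 attributes.
mu, nu = validate_if_matrix(
    [[0.7, 0.2], [0.6, 0.3], [0.1, 0.8]],
    [[0.2, 0.7], [0.3, 0.5], [0.8, 0.1]],
)
k01 = if_correlation(mu[0], nu[0], mu[1], nu[1])  # similar instances
k02 = if_correlation(mu[0], nu[0], mu[2], nu[2])  # dissimilar instances
print(k01, k02)
```

With these toy values, the first pair lies above the threshold α = 0.4 used in this paper and the second lies just below it, so only the first pair would be related under that threshold.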
The first step of the proposed method is to compute the α-relation on the conditional attributes using the correlation coefficient and to compute the equivalence classes of the decision attributes using the same correlation coefficient formula. The value of α is taken to be 0.4. Next, the “infimum” operator is applied to the fuzzy knowledge granules of the conditional attributes. The intuitionistic fuzzy nano lower approximation and the intuitionistic fuzzy nano upper approximation are then constructed using the decision classes, and the boundary regions are found. With the help of the two approximations, two sets of fuzzy rules, namely the certain fuzzy rules and the possible fuzzy rules, can be generated. The proposed method is also illustrated by the flowchart in Figure 1 below.
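The first two operations on the conditional attributes can be sketched as follows under one possible reading: the correlation is computed attribute by attribute, the “infimum” is taken as the element-wise minimum across attributes, and two instances are related when the result is at least α. This reading, the assumed correlation form C(a, b) / sqrt(T(a) T(b)), the function names, and the toy data are assumptions made for illustration and may differ from the actual IFRSCAD implementation.

```python
import numpy as np


def attribute_correlation(mu_col, nu_col):
    """Pairwise correlation of all instances on ONE attribute.
    ASSUMED form: C(a, b) / sqrt(T(a) * T(b)) for single-valued sets."""
    c = np.outer(mu_col, mu_col) + np.outer(nu_col, nu_col)
    t = np.sqrt(mu_col ** 2 + nu_col ** 2)
    denom = np.outer(t, t)
    # Pairs with a zero denominator get correlation 0 (sketch convention).
    return np.divide(c, denom, out=np.zeros_like(c), where=denom > 0)


def alpha_relation(mu, nu, alpha=0.4):
    """Boolean m x m relation after combining attributes by infimum."""
    per_attr = np.stack(
        [attribute_correlation(mu[:, j], nu[:, j]) for j in range(mu.shape[1])]
    )
    combined = per_attr.min(axis=0)  # 'infimum' read as element-wise minimum
    return combined >= alpha


def granules(relation):
    """Granule of instance i: indices of all instances related to i."""
    return [frozenset(np.flatnonzero(row).tolist()) for row in relation]


# Toy data (hypothetical): m = 4 instances, n = 2 attributes.
mu = np.array([[0.7, 0.2], [0.6, 0.3], [0.1, 0.8], [0.2, 0.7]])
nu = np.array([[0.2, 0.7], [0.3, 0.5], [0.8, 0.1], [0.7, 0.2]])
relation = alpha_relation(mu, nu, alpha=0.4)
for i, g in enumerate(granules(relation)):
    print(i, sorted(g))
```

Because the relation is obtained by thresholding, it is reflexive and symmetric but need not be transitive: in the toy data, instance 0 is related to instance 1 and instance 1 to instance 2, yet instance 0 is not related to instance 2.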
The pseudocode of the method is given as follows.
Algorithm.
Input: (U, C∪D), α // C: the conditional fuzzy attributes; D: the decision fuzzy attributes
Step 1. Create the α-relation on C using the correlation coefficient.
Step 2. Create the fuzzy equivalence relation for D.
Step 3. Apply the ‘infimum’ operator to the fuzzy granules of the records of U induced by C.
Step 4. Construct separately the nano lower approximation space and the nano upper approximation space for D from the fuzzy granules obtained after applying ‘infimum’ to C.
Step 5. Find the boundary regions.
Step 6. Generate certain fuzzy rules from the nano lower approximation space, possible fuzzy rules from the nano upper approximation space, and boundary rules from the boundary regions.
Each rule generated by the system is fuzzy in the intuitionistic sense; that is, the attributes contributing to any rule are intuitionistic fuzzy sets.
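Steps 4 to 6 of the pseudocode can be illustrated with a crisp simplification. The sketch below assumes that the granules of the conditional attributes and the decision classes are already available as ordinary index sets; it does not model the intuitionistic fuzzy memberships of the actual method, and the names `approximations` and `generate_rules`, the rule tuples, and the toy data are hypothetical.

```python
def approximations(granules, decision_class):
    """Crisp lower approximation, upper approximation and boundary of one
    decision class, given one granule (set of indices) per instance."""
    target = frozenset(decision_class)
    # Lower: instances whose whole granule lies inside the class.
    lower = {i for i, g in enumerate(granules) if g <= target}
    # Upper: instances whose granule overlaps the class.
    upper = {i for i, g in enumerate(granules) if g & target}
    return lower, upper, upper - lower


def generate_rules(granules, decision_classes):
    """Tag each (instance, class) pair as a certain, possible or boundary rule."""
    rules = []
    for label, members in decision_classes.items():
        lower, upper, boundary = approximations(granules, members)
        rules += [("certain", i, label) for i in sorted(lower)]
        rules += [("possible", i, label) for i in sorted(upper)]
        rules += [("boundary", i, label) for i in sorted(boundary)]
    return rules


# Toy data (hypothetical): five instances and their granules.
granules = [frozenset(g) for g in ({0, 1}, {0, 1}, {2, 3}, {2, 3, 4}, {3, 4})]
decision_classes = {"normal": {0, 1, 2}, "attack": {3, 4}}

lower, upper, boundary = approximations(granules, decision_classes["normal"])
rules = generate_rules(granules, decision_classes)
print(lower, upper, boundary)
```

In this toy example, instances 2 and 3 fall in the boundary region of both classes, so they yield only possible and boundary rules, whereas instances 0 and 1 yield certain rules for the normal class and instance 4 a certain rule for the attack class.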
6. Experimental Analysis and Results
A. Datasets
KDD Cup’99 [44] network anomaly detection dataset: It is a synthetic dataset that simulates intrusions in a military network environment. The data were collected over 9 weeks, and the training data consist of five million network connections. The connections fall into the following classes: normal, u2r (unauthorized access to local superuser privileges), r2l (unauthorized access from a remote machine), dos (denial of service), and probe (surveillance and other probing).
Kitsune [45] network attack dataset: It is a collection of nine network attack datasets obtained from either an IP-based commercial surveillance system or a network of IoT devices, each containing millions of network packets and various cyberattacks.
The above datasets were obtained from the UCI Machine Learning Repository. A concise view of the datasets, giving their characteristics, the attribute characteristics, the number of attributes, and the number of data instances, is presented in Table 1.
B. Experimental Results and Analysis
The experiments were conducted using Matlab on an Intel Core i7-2600 machine (3.4 GHz, 8 MB cache, 8 GB RAM, 500 GB hard disk) running Windows 10, and the results were compared with five well-known methods, namely Cuijuan et al.’s method [16], Wang et al.’s method [17], Deep-RBF Network [15], Bayes Network [14], and Decision Tree [13]. The classifiers were built using the aforesaid datasets. The value of α was set to 0.4. The classifiers were then used to categorize each new instance as either normal traffic or an attack. The outcomes of all six methods were recorded for a variety of attribute-set sizes. Data instances from the various attacks are significantly out of proportion to the normal data. True Positive Rate (TPR) and False Positive Rate (FPR) were used to estimate the effectiveness of the approaches and for the comparative analysis. A partial view of the results of the six algorithms, comparing Normal True Positive Rate, Attack True Positive Rate, Normal False Positive Rate, and Attack False Positive Rate for different attribute-set sizes of the KDDCUP’99 dataset, is presented in Figure 2, Figure 3, Figure 4, Figure 5, Figure 6 and Figure 7.
Similarly, a partial view of the results of the six algorithms, comparing Normal True Positive Rate, Attack True Positive Rate, Normal False Positive Rate, and Attack False Positive Rate for different attribute-set sizes of the Kitsune network attack dataset, is presented in Figure 8, Figure 9, Figure 10, Figure 11, Figure 12 and Figure 13.
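For reference, per-class TPR and FPR can be computed from true and predicted labels as sketched below. The sketch assumes the standard confusion-matrix definitions, TPR = TP / (TP + FN) and FPR = FP / (FP + TN), with one class treated as positive at a time; the exact definitions used to produce the figures are not spelled out in this section, and the function name `rates` and the toy labels are hypothetical.

```python
def rates(y_true, y_pred, positive):
    """Return (TPR, FPR) when the class `positive` is treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Toy labels (hypothetical): five normal and five attack connections.
y_true = ["normal"] * 5 + ["attack"] * 5
y_pred = ["normal"] * 4 + ["attack"] + ["attack"] * 3 + ["normal"] * 2

normal_tpr, normal_fpr = rates(y_true, y_pred, "normal")
attack_tpr, attack_fpr = rates(y_true, y_pred, "attack")
print(normal_tpr, normal_fpr, attack_tpr, attack_fpr)
```

Under these definitions, the Normal FPR counts attack connections that were labeled normal, which is the costly error for an intrusion detection system.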
The results, presented in graphical form, offer the following observations:
The decision tree-based algorithm [13] has the poorest detection rate. It has 71.31-66.45% Normal TPR, 67.44-62.23% Attack TPR, 29.69-33.51% Normal FPR, and 32.56-37.71% Attack FPR for ascending attribute sizes (from 10 to 41) of the KDDCUP’99 dataset [44]. Similarly, it has 71.31-50.12% Normal TPR, 67.45-49.34% Attack TPR, 28.69-49.88% Normal FPR, and 32.56-50.56% Attack FPR for ascending attribute sizes (from 10 to 115) of the Kitsune dataset [45]. This shows that the algorithm has the poorest performance, which decreases further as the dimension of the dataset increases.
The Deep-RBF Network-based algorithm [15] is better than the decision tree-based algorithm [13]. It has 94.25-90.24% Normal TPR, 90.23-85.25% Attack TPR, 5.75-9.75% Normal FPR, and 9.75-14.75% Attack FPR for ascending attribute sizes (from 10 to 41) of the KDDCUP’99 dataset [44]. Similarly, it has 94.25-81.21% Normal TPR, 93.11-80.56% Attack TPR, 9.75-18.79% Normal FPR, and 9.73-19.44% Attack FPR for ascending attribute sizes (from 10 to 115) of the Kitsune dataset [45].
The Bayes Network-based algorithm [14] is better than the decision tree-based algorithm [13] and the Deep-RBF Network-based algorithm [15] in terms of detection rates. It has 85.87-93.13% Normal TPR, 90.87-83.49% Attack TPR, 4.13-6.87% Normal FPR, and 9.136-16.51% Attack FPR for ascending attribute sizes (from 10 to 41) of the KDDCUP’99 dataset [44]. Similarly, it has 95.87-80.55% Normal TPR, 94.8-79.53% Attack TPR, 4.13-19.45% Normal FPR, and 5.20-20.47% Attack FPR for ascending attribute sizes (from 10 to 115) of the Kitsune dataset [45]. Although the algorithm is quite efficient, its performance decreases as the dimension of the dataset increases.
Cuijuan et al.’s algorithm [16] is better than the previous three algorithms as far as detection rates are concerned. It has 97.75-93.25% Normal TPR, 95.25-89.25% Attack TPR, 3.20-5.807% Normal FPR, and 4.25-10.75% Attack FPR for ascending attribute sizes (from 10 to 41) of the KDDCUP’99 dataset [44]. Similarly, it has 95.95-82.32% Normal TPR, 95.75-89.25% Attack TPR, 4.05-8.132% Normal FPR, and 4.25-18.58% Attack FPR for ascending attribute sizes (from 10 to 115) of the Kitsune dataset [45]. Its performance also decreases proportionately as the dimension of the dataset increases.
Wang et al.’s algorithm [17] is more efficient than all the aforesaid algorithms. It has 98.21-96.25% Normal TPR, 96.21-93.25% Attack TPR, 2.12-3.02% Normal FPR, and 3.79-6.75% Attack FPR for ascending attribute sizes (from 10 to 42) of the KDDCUP’99 dataset [44]. Similarly, it has 98.20-90.44% Normal TPR, 96.21-89.33% Attack TPR, 1.79-9.56% Normal FPR, and 3.79-10.67% Attack FPR for ascending attribute sizes (from 10 to 115) of the Kitsune dataset [45]. Its performance also decreases proportionately as the dimension of the dataset increases.
The proposed algorithm (IFRSCAD) has 98.342-96.99% Normal TPR, 98.04-96.29% Attack TPR, 1.658-3.01% Normal FPR, and 1.96-3.71% Attack FPR for ascending attribute sizes (from 10 to 42) of the KDDCUP’99 dataset [44]. Similarly, it has 98.351-91.989% Normal TPR, 98.02-91.289% Attack TPR, 1.658-8.011% Normal FPR, and 1.96-8.711% Attack FPR for ascending attribute sizes (from 10 to 115) of the Kitsune dataset [45]. Its performance also decreases as the dimension of the dataset increases. It is clear from the data that the proposed algorithm has a higher TPR and a lower FPR. The differences between Normal TPR and Attack TPR, and between Normal FPR and Attack FPR, are also smaller than for the other methods. The decrease in performance with increasing dimension is smaller as well. The proposed algorithm thus has a higher average TPR and a lower average FPR than the others.
The execution time of the proposed algorithm depends on two factors, namely the dimension and the size of the dataset. It was found that if the dimension is kept constant, the execution time grows quadratically with the data size, whereas if the data size is kept constant, it grows linearly with the dimension. The time complexity of the proposed algorithm therefore depends more on the data size than on the number of attributes. The time-complexity graphs are given in Figure 14 and Figure 15.
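One way to read this observation is through a simple operation count. The sketch below assumes that the dominant cost is comparing every unordered pair of instances once on every attribute, as a pairwise correlation-based relation would require; this cost model and the function name `pairwise_work` are assumptions made for illustration and are not a measurement of the Matlab implementation.

```python
def pairwise_work(m, n):
    """Attribute-level comparisons needed to compare all unordered pairs
    of m instances over n attributes (assumed cost model)."""
    return (m * (m - 1) // 2) * n


# Doubling the data size roughly quadruples the work (quadratic in m).
size_ratio = pairwise_work(2000, 10) / pairwise_work(1000, 10)

# Doubling the number of attributes doubles the work (linear in n).
dim_ratio = pairwise_work(1000, 20) / pairwise_work(1000, 10)

print(size_ratio, dim_ratio)
```

Under this model, the cost is of order m²n, which is consistent with the quadratic dependence on data size and the linear dependence on dimension reported above.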
Author Contributions
Conceptualization, F.A.M.; Methodology, F.A.M.; Software, F.A.M., M.S.; Validation, F.A.M., M.S.; Formal Analysis, F.A.M.; Investigation, F.A.M., M.S.; Resources, F.A.M., M.S.; Data Curation, F.A.M., M.S.; Writing (original draft preparation), F.A.M., M.S.; Writing (review and editing), F.A.M., M.S.; Visualization, F.A.M.; Supervision, F.A.M.; Project Administration, F.A.M., M.S.; Funding Acquisition, M.S. All authors have read and agreed to the published version of the manuscript.