1. Introduction
Phishing, a deceptive method through social and technical engineering, poses a severe threat to online security, aiming to obtain illicit user identities, personal account details, and bank credentials [
1]. It′s a primary concern within criminal activity, with phishers pursuing objectives such as selling stolen identities, extracting cash, exploiting vulnerabilities, or deriving financial gains [
2,
3]. The nuanced landscape of phishing techniques showcasing symmetry and asymmetry includes algorithms, domain spoofing, HTTPS phishing, SMS phishing, link handling, email phishing, and pop-ups. Attributes such as prefixes, suffixes, subdomain, IP address, URL length, ′@′ symbol, spear phishing, dual-slash attributes, port, HTTPS token, request URL, URL-anchor, tag-links, and domain age contribute to the multifaceted nature of phishing attacks [
4]. Phishing perpetrators adeptly mimic legitimate websites, particularly those related to online banking and e-commerce. This creates a symmetrical illusion that induces users to unwittingly divulge sensitive information, leading to various fraudulent actions [
5,
6].
A phishing attacker′s role involves three specific duties: influencing target selection, sociological aspects, and technological infiltration [
7]. As of March 2006, the Anti-Phishing Working Group reported 18,480 significant phishing attacks and 9666 distinct phishing domains, resulting in substantial financial repercussions for businesses and affecting billions of site visitors [
8]. Microsoft estimates the potential cost of computerized offenses on the global network to be a staggering 500 billion USD, underscoring the symmetrical impact of cyber threats on the financial ecosystem [
9]. A single data breach could incur an average cost of approximately 3.8 million USD for organizations in 2018, highlighting the symmetrical consequences of security lapses. Data from the Anti-Phishing Working Group (APWG) reveals a notable increase in attack networks, with 180,768 identified during the first quarter of 2019, up from 138,328 in the fourth quarter of 2018 and 151,014 in the third quarter of 2018 [
10]. The visual symmetry between benign and deceptive websites challenges human perception, making it difficult to distinguish between them. When visitors access these mimicked sites, critical information is stolen through scripting, underscoring the symmetrical vulnerability in human-computer interaction. The exponential growth in e-commerce consumers contributes to the escalating frequency of phishing attacks, carried out through various means such as malware, online platforms, and email, creating a symmetrical escalation in cyber threats [
11].
Researchers propose varied solutions to enhance symmetry in phishing detection. Some use a blacklist for identifying phishing sites [
12]. However, this method fails to detect non-blacklisted phishing websites, introducing asymmetry, such as zero-day attacks. Heuristic-based detection analyzes website content and third-party service features, but potential service restrictions create asymmetry. Simultaneously, exploring online content and third-party features introduces temporal asymmetry due to its time-consuming nature [
13]. Similarly, a hierarchical clustering method groups DOM vectors based on distance, limiting detection efficiency and suggesting a need for symmetrical analysis of URL features to enhance throughput [
14].
URLs play a pivotal role in phishing attacks, transmitted to users through various channels like emails and social media, presenting a facade of symmetry by appearing as genuine URLs [
15]. Among the available approaches for evaluating URLs, machine learning-based techniques emerge as symmetrical solutions. By training classification algorithms on malicious URLs, these techniques effectively differentiate between phishing and benign URLs, introducing a symmetrical balance in the categorization process [
16]. URL-based studies leverage the PhishTank database, a comprehensive collection of phishing URLs reported by various online security companies. While this database offers organized data categorization patterns, asymmetries arise when using categorization algorithms or machine learning for URL data, necessitating additional symmetrical URL management techniques [
17]. Standard techniques like blacklisting, regular expression, and signature matching, although employed to identify phishing attempts, exhibit asymmetry by falling short in detecting unfamiliar URLs [
4]. Continuous updating of database signatures to detect unexpected patterns in malicious URLs underscores the need for applying symmetrical machine learning-based research, particularly with deep learning models, for robust and symmetrical identification of malicious URLs [
18].
Machine learning and deep neural networks have been pivotal in various research endeavors, showcasing substantial performance improvements [
19,
20,
21,
22]. In the context of phishing detection, authors [
19] proposed a multidimensional feature engineering approach, harnessing a deep learning model (CNN-LSTM) and machine learning algorithms. This method integrated predictions using the XGBoost (eXtreme Gradient Boosting) algorithm, offering a way to extract features from diverse dimensions for swift and effective attack detection. However, the reported results indicated a decline in the false positive rate to 59%, signaling a reduction in the level of attack prediction. Another study [
20] introduced an end-to-end deep learning architecture grounded in natural language processing techniques to combat malicious URL phishing. The model aimed to classify benign and malicious URLs using character-level and word-level embedding in CNN networks. However, the model exhibited a lack of generalization on test data, indicating a need for improved accuracy and malicious URL detection ability. Wang et al. [
21] presented the PDRCNN approach, designed to enhance phishing detection efficiency by eliminating reliance on feature crawling from third-party services. Based on the LSTM network, this approach selects optimal features from the URL, employs CNN to distinguish characters influencing phishing, and ensembles predictions with machine learning classifiers. While reporting efficient performance, the mechanism′s dependency on existing knowledge of phishing detection raises concerns about its susceptibility to errors in identifying the latest vulnerabilities. In contrast to traditional machine learning methods that implicitly extract hand-crafted features, deep learning approaches prove advantageous when faced with the challenge of professional phishers exploiting the multilayer features of URLs. To address this, stacking, an ensemble learning methodology integrating various machine learning algorithms and deep learning models, employs a metamodel to amalgamate predictions, enhancing overall performance. Initially employed for malware identification on mobile devices, the stacking approach demonstrated improved accuracy and F_measure [
23]. We have extended this stacking mechanism by designing two distinct phases, leveraging the symmetrical integration of other methods to enhance detection impact.
This paper leverages a deep-learning neural network, long short-term memory (LSTM), introducing a novel stack generalization model named AntiPhishStack. The proposed model employs five optimizers in two phases to detect phishing URLs effectively. In the first phase, machine learning classifiers, coupled with K-fold cross-validation to mitigate overfitting, generate a mean prediction. The second phase utilizes a two-layered LSTM-based stack generalized model optimized for premier prediction in phishing site detection. Merging the mean prediction from Phase I with the premier prediction from Phase II, a meta-classifier, specifically XGBoost, delivers the final prediction. This stacking model significantly enhances phishing detection accuracy by learning URL and character-level TF-IDF features, showing symmetrical capabilities. The AntiPhishStack model intelligently identifies new phishing URLs previously unidentified as fraudulent. Experimental evaluations on two benchmark datasets ([24] and [25]) for benign and phishing sites demonstrate robust performance, assessed through various metrics, including AUC-ROC curve, Precision, Recall, F1, mean absolute error (MAE), mean square error (MSE), and accuracy. Comparative analysis with baseline models and traditional machine learning algorithms, such as support vector machine, decision tree, naïve Bayes, logistic regression, K-nearest neighbor, and sequential minimal optimization, highlights the AntiPhishStack model's superior phishing detection efficiency. Notably, this model offers the following significant advantages in achieving symmetrical advancements in cybersecurity:
Prior feature knowledge independence: The approach taken in this work embraces the concept of symmetry by treating URL strings as character sequences, serving as natural features that require no prior feature knowledge for our proposed model to learn effectively.
Strong generalization ability: The URL character-based features are utilized for more robust generalization and test-side accuracy, and the multi-level or low-level features are combined in the hidden layers of the neural network to attain effective generalization.
Independence of cybersecurity experts and third-party services: Our proposed stack generalization model autonomously extracts necessary URL features, eliminating the reliance on cybersecurity experts. Additionally, the AntiPhishStack model, reliant on URL and character-level TF-IDF features, demonstrates independence from third-party features such as page rank or domain age.
The significant contributions of this paper are:
Presentation of a two-phase stacked-based generalization model (AntiPhishStack) that breaks free from the necessity of prior feature knowledge for phishing site detection. The model achieves this by learning URLs and character-level TF-IDF features.
In Phase I, features are trained on the base machine learning classifier to generate the mean prediction, while Phase II employs two-layered stacked-based LSTM networks and five adaptive optimizers for premier prediction detection.
The final prediction is established by developing a meta-classifier (XGBoost) classifying URLs into benign and phishing categories. Experimental results showcase the AntiPhishStack model′s noteworthy performance on baseline models, utilizing symmetrically structured Alexa and PhishTank datasets.
The rest of the article is structured as follows:
Section 2 discusses the background research on phishing detection and its methods,
Section 3 introduces the proposed AntiPhishStack model,
Section 4 describes the experiments,
Section 5 presents the results and their evaluation, and
Section 6 provides the conclusion and future work.
3. AntiPhishStack Proposed Model
The main purpose of this model is to determine the best output through evaluation by applying the stacking technique and deep neural network to the processed data set and to propose an optimized model based on that output. The notations and meanings used in this paper are described in
Table 2. The AntiPhishStack model of stack generalization has been illustrated in
Figure 1.
Our model's flow follows a multi-level approach. The key steps are as follows:
- i. Collection of datasets and feature distribution into URL features and character-level features.
- ii. Dataset division into training and testing by 70:30 ratio, respectively.
- iii. Construct the stack generalization model's first phase (Phase I) based on the machine learning base model and calculate the mean prediction with the test dataset.
- iv. Construct the second phase (Phase II) of the stack generalization model with the LSTM model based on adaptive optimizers and compute the performance evaluation with the test set.
- v. Merge predictions and evaluations from both Phase I and Phase II for the ultimate prediction, enhancing symmetrically the recognition and determination of phishing web pages.
Figure 1.
AntiPhishStack: proposed LSTM-based stack generalization model's flow.
Table 2.
Notations and their meanings.
Notation | Meaning
$W_i$ | Weight factor of URLs
$h_{t-1}$ | Hidden state of the instant $t-1$
$b$ | Bias of each gate
$i_t$, $f_t$, $o_t$, and $c_t$ | Input gate, forget gate, output gate, and unit status, respectively
$W_f$, $W_i$, and $W_o$ | Weight matrix of forget gate, input gate, and output gate, respectively
$x_t$ | Current input
$l$ | Training loss function
$\gamma$ | Complexity of each leaf
$T$ | Number of leaf nodes
3.1. Datasets
The URLs are collected from a variety of sources (Alexa and PhishTank) [24,25]. URLs that were duplicated or no longer reachable were removed before the dataset was created. Typical URL elements such as "http://", "https://", and "www." are stripped, because inconsistent URL forms can easily impair the model's quality during training if the prefixes are not trimmed. The database management system (pgAdmin) was used in conjunction with Python to import the preprocessed data, and the dataset was then divided into two parts: 70% for training and 30% for testing. The distribution of legitimate and phishing URLs is as follows:
Dataset 2 (DS2): Benign sites from common-crawl, the Alexa database, and phishing sites from PhishTank [
25].
The datasets were selected for their diverse and current mix of benign and phishing URLs, ensuring robust model training. DS1 and DS2 offer a balanced representation of typical internet environments and specialized sources, respectively. This variety enhances the model's applicability and accuracy in real-world phishing detection. In line with standard machine learning practice, the feature dataset is divided into 70% for training our AntiPhishStack model and 30% for testing on unseen data.
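The preprocessing described in this subsection (prefix trimming, duplicate removal, and the 70:30 split) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function names, the fixed random seed, and the choice to shuffle before splitting are assumptions.

```python
import random

# Prefixes named in the text as "typical URL elements" to be trimmed.
PREFIXES = ("http://", "https://", "www.")


def strip_prefixes(url):
    """Remove leading scheme and 'www.' so equivalent URLs share one form."""
    url = url.strip()
    changed = True
    while changed:
        changed = False
        for prefix in PREFIXES:
            if url.lower().startswith(prefix):
                url = url[len(prefix):]
                changed = True
    return url


def prepare_dataset(labelled_urls, train_fraction=0.7, seed=42):
    """Deduplicate (url, label) pairs, shuffle, and split into train/test.

    The seed and the shuffle step are assumptions made for reproducibility.
    """
    seen, cleaned = set(), []
    for url, label in labelled_urls:
        key = strip_prefixes(url)
        if key and key not in seen:  # keep the first occurrence only
            seen.add(key)
            cleaned.append((key, label))
    random.Random(seed).shuffle(cleaned)
    cut = round(len(cleaned) * train_fraction)
    return cleaned[:cut], cleaned[cut:]
```

Checking for URLs that are no longer reachable would require network access and is left out of this sketch.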
3.2. Feature Distribution
Features and the capacity to use these features must be examined prior to examining the features selection section [
47]. There are four major features and a total of 30 sub-features. Based on the details, each characteristic provides information on whether the website is phishing, legitimate, or suspect. This section contains the plans for highlighting the characteristics.
3.2.1. URL Features
A Uniform Resource Locator (URL) gives the location of online resources such as pictures, files, hypertext, and video. In general, attackers attempt to build phishing URLs that look like reputable websites to users. Attackers use URL jamming tactics to mislead users into disclosing personal information that can be exploited against them. This research aims to detect phishing websites quickly using lightweight characteristics, i.e., a weight factor URL token system, inspired by [
48]. For example, the segmentation of URL (
Figure 2) provides the different tokens and their final weight
for
-th distinct words can be calculated as:
where,
indicates the length of
-th distinct word,
denotes the total steps available for tokens,
shows the number of URLs from webpages,
total number of
-th word occurrences in step
with respect
level.
Calculating this weight delivers the weight value of each URL assigned to neural network gates for phishing prediction. This is accomplished by extracting only characteristics from the URL rather than accessing the website′s content.
Figure 2 shows an example of URL characteristics for the weight.
Figure 2.
Tokenization of URL characteristics and components for the weight calculation.
The first component of the URL is the protocol (https, http, ftp, etc.), a set of rules that regulates how data is transported. The second component is the hostname, which identifies the host IP address or the location of the resource. The hostname consists of two parts: the principal domain and the top-level domain (TLD). The hostname is followed by an optional port number. The third component is the path, which identifies the specific resource inside the domain accessed by a user. An optional field, such as a query, follows the path. The protocol, hostname, and path together form the base URL. The combination of the second-level domain and top-level domain names, known as the host domain, makes the URL unique. As a result, cybersecurity firms work hard to identify by name the fraudulent websites used for phishing offenses. If a hostname is designated as phishing, its IP address can be blocked to prevent access to the web pages hosted on it.
It has the following sub-features, according to the dataset:
IP Address: If an IP address is used instead of a domain name in the URL, the client can be almost certain that someone is attempting to steal their credentials. In this dataset, 570 URLs with an IP address were found, accounting for 22.8 percent of the dataset, and the following rule was suggested: if an IP address is in the URL, it is termed Phishing; otherwise, it is Legitimate.
Use of the "@" symbol: Web browsers usually ignore the section preceding the "@" sign, because the real address follows it. According to the dataset, 90 URLs contain the "@" sign, just 3.6 percent of the total.
Use of the "//" symbol: In valid URLs, the "//" sign appears only after HTTP or HTTPS. If "//" appears again after the initial protocol declaration, the URL is treated as a phishing URL, since the "//" sign can be used to redirect to other websites.
Domain name prefixes and suffixes separated by the "-" sign: A URL with the "-" sign in its domain name is treated as a phishing URL. In general, verified URLs do not include the "-" sign.
Use of the "." sign in the domain: Adding a subdomain to the domain name requires a dot. A URL with more than one subdomain is considered suspicious, and anything beyond that indicates phishing.
HTTPS (secure socket layer): The majority of legitimate sites use the HTTPS protocol. When HTTPS is used, the age of the certificate is quite important, and a trustworthy certificate is required.
Favicon: A favicon is a graphic image associated with a website. A favicon loaded from an external domain might redirect clients to dubious sites.
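Several of the sub-features above can be read directly from the URL string without fetching the page. The sketch below is an illustration under assumptions: the feature names and the IPv4-only pattern are hypothetical choices, and the certificate-age and favicon checks are omitted because they need access to the live site.

```python
import re
from urllib.parse import urlsplit

# Dotted-quad IPv4 pattern (IPv6 and hex-encoded hosts are not covered here).
IPV4 = re.compile(r"^(?:\d{1,3}\.){3}\d{1,3}$")
SCHEME = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*://")


def url_features(url):
    """Extract lexical URL sub-features; all key names are illustrative."""
    has_scheme = bool(SCHEME.match(url))
    parts = urlsplit(url if has_scheme else "http://" + url)
    host = parts.hostname or ""
    # Everything after the initial protocol declaration, if one is present.
    after_scheme = url.split("://", 1)[1] if has_scheme else url
    return {
        "has_ip_host": bool(IPV4.match(host)),
        "has_at_symbol": "@" in url,
        "has_extra_double_slash": "//" in after_scheme,
        "has_hyphen_in_domain": "-" in host,
        "dot_count_in_domain": host.count("."),
        "uses_https": has_scheme and parts.scheme == "https",
    }
```

For example, `url_features("http://paypal.com@evil.example/")` flags the "@" symbol while reporting `evil.example` as the effective host, which matches the browser behavior described in the list.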
3.2.2. Character Level Features
Term Frequency-Inverse Document Frequency is abbreviated as TF-IDF. The TF-IDF score indicates a term′s relative significance in the document and throughout the whole corpus. The TF-IDF score is made up of two terms: the first computes the normalized Term Frequency (TF), and the second computes the Inverse Document Frequency (IDF), which is calculated as the logarithm of the number of documents in the corpus divided by the number of documents in which the specific term appears [
49,
50].
TF-IDF Vectors may be produced at many levels of input tokens (words, characters, n-grams):
Word level TF-IDF: A matrix indicating the TF-IDF scores of each term in distinct texts.
Character level TF-IDF: A matrix indicating the TF-IDF scores of character-level n-grams in the corpus.
N-gram level TF-IDF: N-grams are the collection of N terms. This matrix indicates the TF-IDF scores of N-grams.
It should be mentioned that TF-IDF has been used in numerous studies to identify website phishing by examining URLs [25], and to obtain indirectly related links, target websites, and the validity of suspicious websites [49]. TF-IDF retrieves prominent keywords from textual content. However, it has certain limitations, one of which is that it fails on misspelled terms. Because a URL may contain nonsensical words, we used a character-level TF-IDF method with a maximum feature count of 5000.
Furthermore, we treat URL strings as character sequences, following the idea in [51]. The advantage is that the proposed model can train on URL character sequences as natural features that require no prior feature knowledge. Our proposed AntiPhishStack model uses the stack generalization model to extract local URL features from the URL character sequences, and the URL is finally classified by a meta-classifier designed for the final prediction.
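A character-level TF-IDF as defined in this subsection (normalized term frequency multiplied by the logarithm of the corpus size over the document frequency) can be sketched in plain Python. The n-gram range of 1 to 3 characters and the function names are assumptions; only the 5000-feature cap comes from the text. Library implementations often apply smoothing, so their scores may differ from this unsmoothed form.

```python
import math
from collections import Counter


def char_ngrams(text, n_min=1, n_max=3):
    """All character n-grams of length n_min..n_max (range is an assumption)."""
    return [text[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(text) - n + 1)]


def fit_char_tfidf(urls, n_min=1, n_max=3, max_features=5000):
    """Build the vocabulary (most frequent n-grams) and the IDF table."""
    total, doc_freq = Counter(), Counter()
    for url in urls:
        counts = Counter(char_ngrams(url, n_min, n_max))
        total.update(counts)
        doc_freq.update(counts.keys())  # each document counted once per term
    vocab = [term for term, _ in total.most_common(max_features)]
    idf = {term: math.log(len(urls) / doc_freq[term]) for term in vocab}
    return vocab, idf


def transform(url, vocab, idf, n_min=1, n_max=3):
    """TF-IDF vector of one URL; TF is normalized by the n-gram count."""
    counts = Counter(char_ngrams(url, n_min, n_max))
    n_terms = sum(counts.values()) or 1
    return [counts[term] / n_terms * idf[term] for term in vocab]
```

An n-gram that occurs in every URL of the corpus receives an IDF of zero, so ubiquitous characters such as "." contribute nothing, while rare substitutions (for example a digit replacing a letter) score highly.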
3.3. Stack Generalization Model
The stack generalization model is divided into 2 phases, as illustrated in the flow model (
Figure 1).
3.3.1. Phase I
Based on the characteristics mentioned above, existing machine learning models are utilized directly to distinguish phishing and legitimate web pages. This paper proposes a stacking model (illustrated in
Figure 3) for this purpose by merging various machine learning models, including support vector machine (SVM), naïve Bayes (NB), decision tree (DT), logistic regression (LR), K-nearest neighbors (KNN), sequential minimal optimization (SMO), and XGBoost.
The training set is split into $k$ copies (folds), with $k-1$ copies used for training and one copy used for testing. Training does not terminate until each base model has predicted every sample. The proposed system employs k-fold cross-validation on the training set to avoid overfitting, so that each fold is predicted out-of-fold by models trained on the remaining folds.
The proposed model uses a value of k from three to ten for cross-validation and then produces output on the test set. Once the temporary predictions (TP) have been obtained, the mean prediction is computed and validated against the test dataset. The procedure is exhaustive: predictions are estimated on every fold.
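The out-of-fold procedure of Phase I can be sketched as below. This is an illustration under assumptions, not the authors' code: `ThresholdClassifier` is a hypothetical toy base learner standing in for SVM, NB, DT, and the other classifiers, and the "mean prediction" is interpreted here as the average of the test-set predictions produced by the k fold-specific models.

```python
from statistics import mean


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


class ThresholdClassifier:
    """Toy base learner: thresholds one feature at the midpoint of class means."""

    def __init__(self, feature_index):
        self.feature_index = feature_index
        self.threshold = 0.0

    def fit(self, X, y):
        pos = [x[self.feature_index] for x, label in zip(X, y) if label == 1]
        neg = [x[self.feature_index] for x, label in zip(X, y) if label == 0]
        self.threshold = (mean(pos) + mean(neg)) / 2
        return self

    def predict_proba(self, X):
        return [1.0 if x[self.feature_index] > self.threshold else 0.0 for x in X]


def phase_one(model_factories, X_train, y_train, X_test, k=5):
    """Return out-of-fold train predictions and fold-averaged test predictions."""
    n_models = len(model_factories)
    oof = [[0.0] * n_models for _ in X_train]
    test_mean = [[0.0] * n_models for _ in X_test]
    for held_out in k_fold_indices(len(X_train), k):
        held = set(held_out)
        X_fit = [x for i, x in enumerate(X_train) if i not in held]
        y_fit = [y for i, y in enumerate(y_train) if i not in held]
        X_held = [X_train[i] for i in held_out]
        for m, make_model in enumerate(model_factories):
            model = make_model().fit(X_fit, y_fit)
            # Each training sample is predicted by a model that never saw it.
            for i, p in zip(held_out, model.predict_proba(X_held)):
                oof[i][m] = p
            # Test predictions are averaged over the k fold-specific models.
            for j, p in enumerate(model.predict_proba(X_test)):
                test_mean[j][m] += p / k
    return oof, test_mean
```

The out-of-fold matrix can then serve as training input for the meta-classifier, which keeps the meta-level from seeing predictions made on data the base models were fitted to.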
3.3.2. Phase II
Once the features from the training dataset have been loaded, the training portion is fed into a two-layer LSTM neural network architecture. Because sequential phishing webpage data has dependencies on immediately preceding entries, LSTM is well suited to model phishing detection in this investigation. It is also explicitly designed to avoid the long-term dependency problem by storing feature information in its memory cell. It can remove or add information to these cell states, regulated by structures called gates. These gates and the corresponding operations/functions are presented in [
52], while Phase II of the integrated stack generalized model is illustrated in
Figure 4.
In the first gate (the forget gate), the information from the current input $x_t$ and the previous hidden state $h_{t-1}$ is passed through the sigmoid activation function. An output value closer to 0 means forget, and a value closer to 1 means retain. The second gate, the input gate, decides what relevant feature (phishing or benign) can be added from the current step. The third gate, the control gate, decides which values will be updated (either 0 or 1), for which a $\tanh$ layer creates a vector of candidate values $\tilde{c}_t$. The last gate, the output gate, determines the value of the next hidden state [53].
At time $t$, the LSTM cell's components are modified as follows:
$$
\begin{aligned}
f_t &= \sigma\left(W_f [h_{t-1}, x_t] + b_f\right) \\
i_t &= \sigma\left(W_i [h_{t-1}, x_t] + b_i\right) \\
\tilde{c}_t &= \tanh\left(W_c [h_{t-1}, x_t] + b_c\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
o_t &= \sigma\left(W_o [h_{t-1}, x_t] + b_o\right) \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
$$
Here $\sigma$ is the sigmoid function; $h_{t-1}$ represents the hidden state of the instant $t-1$; $b$ represents the bias of each gate; and $i_t$, $f_t$, $o_t$, and $c_t$ are the input gate, forget gate, output gate, and unit status, respectively. For the connections, $W_f$, $W_i$, and $W_o$ (and $W_c$ for the candidate state) are weight matrices. The three gates of LSTM cells govern the flow of information and hence define the cell's state. The vanishing gradient problem may be efficiently handled with LSTM.
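The gate mechanics described above can be traced in a few lines of plain Python. This is a sketch of the standard LSTM cell step, assumed here to be the formulation the text refers to; the parameter layout (a dictionary keyed by gate name, with each weight matrix acting on the concatenation of the previous hidden state and the current input) is a hypothetical choice made for readability.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, v):
    """Return W @ v + b for a list-of-rows matrix W and bias vector b."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step.

    params maps a gate name ("f", "i", "c", "o") to a (W, b) pair, where W
    acts on the concatenated vector [h_prev, x_t].
    """
    z = h_prev + x_t  # list concatenation, not element-wise addition
    f_t = [sigmoid(a) for a in affine(*params["f"], z)]    # forget gate
    i_t = [sigmoid(a) for a in affine(*params["i"], z)]    # input gate
    g_t = [math.tanh(a) for a in affine(*params["c"], z)]  # candidate values
    o_t = [sigmoid(a) for a in affine(*params["o"], z)]    # output gate
    # New cell state: keep part of the old state, add part of the candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t
```

With a large positive forget-gate bias and a large negative input-gate bias, the cell state passes through a step almost unchanged, which is the mechanism that lets the memory cell carry information across long sequences.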
Figure 4.
Phase II of the proposed stack generalization model.
The proposed model comprises two LSTM layers. The first LSTM layer outputs a sequence that serves as the input to the second LSTM layer. As explained previously, the internal design of both LSTM layers is the same. We chose the LSTM cell over the GRU cell because the network with the LSTM cell outperformed the network with the GRU cell. This study constructs an LSTM network with a hidden vector of 128 elements. After the first LSTM layer, a dropout layer is added. Dropout is designed to reduce overfitting and enhance the model's generalization [54]. The last LSTM layer generates a vector $h_i$, which is supplied as the input to a fully connected multilayer network. Each layer has an activation function: the rectified linear unit (ReLU) is used for the hidden layers, and, because the classification task is binary, the sigmoid function is used for the output layer.
After the training process, the parameters are tuned to identify wrong predictions and make the predictions as correct as possible through optimization. An optimizer shapes the model toward the most accurate prediction by adjusting its parameters (weights). The amount by which the weights are updated during training is called the learning rate, a configurable hyperparameter for training deep neural networks that takes a small value in the range 0.0 to 1.0. However, a model can overfit [52]: it predicts accurately on the given dataset but is not appropriate for new or real-world data. We used a regularization technique to overcome overfitting errors by fitting the functions appropriately on the training sets, which helps to attain optimal solutions. The optimizers modify the neural network's attributes, i.e., weights and learning rates, to improve accuracy.
Thus, we have utilized the following five adaptive optimizers to generalize the LSTM networks to overcome the overall loss and improve the accuracy. The selection of these optimizers is also given below:
AdaDelta: This optimizer adapts the learning rate per dimension rather than using a single learning rate for all parameters. It addresses the continual decay of learning rates during training and the need for a manually selected learning rate.
Adam: It uses estimates of the first and second moments of the gradients to adapt the learning rate for the neural network. It applies the momentum concept by adding a fraction of previous gradients to the current one. It is a fast optimizer and requires few parameters to be tuned.
RMSprop: The root mean square propagation optimizer damps oscillations in the vertical direction and can increase the learning rate with feasible steps in the horizontal direction.
AdaGrad: It assigns different learning rates to different weights, which suits individual features in sparse datasets. It avoids manual tuning of the learning rate for individual features.
SGD (Stochastic Gradient Descent): The plain gradient descent optimizer has a drawback for large datasets. SGD, a variant of gradient descent, makes neural networks learn faster on large-scale datasets.
These optimizers are implemented based on the packages and function calls in the Pytorch framework. For instance, we utilized , where indicates the name of the optimizer, i.e., Adam or SGD, etc. The model will then be compiled using these adaptive optimizers. The model is trained to avoid overfitting by utilizing several epochs and early stopping strategies. By assessing the model using the test set, the output is now accessible. The stack generalization technique was used in the dataset after the strategy was implemented.
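To make the differences among the five optimizers concrete, the sketch below applies the textbook update rule of each one to a single scalar weight on a toy quadratic loss. It illustrates the update rules only and is not the framework implementation used in the paper; the toy loss, the hyperparameter values, and the function names are all assumptions.

```python
import math


def grad(w):
    """Gradient of the toy loss (w - 3)^2, whose minimum is at w = 3."""
    return 2.0 * (w - 3.0)


def run(optimizer, steps=200, w=0.0):
    """Apply an update rule repeatedly; `state` holds its running statistics."""
    state = {}
    for t in range(1, steps + 1):
        w = optimizer(w, grad(w), state, t)
    return w


def sgd(w, g, state, t, lr=0.1):
    return w - lr * g


def adagrad(w, g, state, t, lr=0.5, eps=1e-8):
    state["G"] = state.get("G", 0.0) + g * g  # sum of all squared gradients
    return w - lr * g / (math.sqrt(state["G"]) + eps)


def rmsprop(w, g, state, t, lr=0.01, rho=0.9, eps=1e-8):
    state["E"] = rho * state.get("E", 0.0) + (1 - rho) * g * g  # decaying mean
    return w - lr * g / (math.sqrt(state["E"]) + eps)


def adadelta(w, g, state, t, rho=0.95, eps=1e-6):
    state["Eg"] = rho * state.get("Eg", 0.0) + (1 - rho) * g * g
    # Step size is the ratio of past update magnitudes to gradient magnitudes,
    # so no learning rate has to be chosen by hand.
    dx = -math.sqrt(state.get("Edx", 0.0) + eps) / math.sqrt(state["Eg"] + eps) * g
    state["Edx"] = rho * state.get("Edx", 0.0) + (1 - rho) * dx * dx
    return w + dx


def adam(w, g, state, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    state["m"] = b1 * state.get("m", 0.0) + (1 - b1) * g      # first moment
    state["v"] = b2 * state.get("v", 0.0) + (1 - b2) * g * g  # second moment
    m_hat = state["m"] / (1 - b1 ** t)  # bias correction for early steps
    v_hat = state["v"] / (1 - b2 ** t)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps)
```

Calling `run(adam)` or `run(sgd)` moves the weight from 0 toward 3. On this toy problem the optimizers progress at very different speeds for the chosen settings, which is one reason the paper compares several of them rather than fixing one in advance.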
3.4. Final Prediction
Two outputs are generated using the aforementioned multilayer stacked methods, and a model is chosen based on the value of the initial predictions. The mean prediction is combined with the outcomes of the premier prediction. Finally, the outputs of the mean and premier predictions of the stacking models are combined into the final prediction using a meta-estimator classifier.
The meta-estimator involves constructing a robust classifier by applying the boosting method. Boosting combines multiple weak yet precise classifiers to create a powerful and resilient classifier for identifying phishing crimes. Additionally, boosting aids in integrating multiple features, resulting in improved classification performance. One notable boosting classifier is the XGBoost classifier, which transforms weak learners into potent contributors. It is well-suited for our proposed stack generalization model for identifying phishing sites, introducing a sense of symmetry to the classification process. Implemented on integrated feature sets of URLs and character-level features, it acts as a robust classifier within our proposed AntiPhishStack model for phishing identification, emphasizing the importance of symmetry in enhancing detection capabilities.
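Suppose there are $n$ URLs in a set $D = \{(x_i, y_i)\}_{i=1}^{n}$, where $x_i$ denotes the selected features corresponding to the $n$ URLs while $y_i$ is a class label, e.g., $y_i = 1$ if the URL is considered a malicious or phishing website. The final outcome of the XGBoost model is computed using the following equation [55]:
$$\mathrm{Obj}^{(t)} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i^{(t)}\right) + \Omega(f_t), \qquad \hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i)$$
where $\hat{y}_i^{(t)}$ is the model's prediction at step $t$, and $l$ represents the training loss function. The regularization term $\Omega(f_t)$ is defined as
$$\Omega(f_t) = \gamma T + \frac{1}{2} \lambda \sum_{j=1}^{T} w_j^2$$
where $T$ is the number of leaf nodes in the base learner $f_t$, $\gamma$ is the complexity of each leaf, $w_j$ is the output value at each final leaf node, and $\lambda$ is the L2 regularization coefficient on the leaf outputs.
At step $t$, considering the base learners from previous steps ($f_1, \dots, f_{t-1}$) as fixed, the loss function can be expanded using Taylor's series [55,56]:
$$\mathrm{Obj}^{(t)} \approx \sum_{i=1}^{n} \left[ l\left(y_i, \hat{y}_i^{(t-1)}\right) + g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \right] + \Omega(f_t)$$
where $g_i$ and $h_i$ are the first and second derivatives of the loss function $l$ with respect to $\hat{y}_i^{(t-1)}$, computed as:
$$g_i = \frac{\partial\, l\left(y_i, \hat{y}_i^{(t-1)}\right)}{\partial\, \hat{y}_i^{(t-1)}}, \qquad h_i = \frac{\partial^2 l\left(y_i, \hat{y}_i^{(t-1)}\right)}{\partial \left(\hat{y}_i^{(t-1)}\right)^2}$$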
This formulation defines the model's optimization process at each step, incorporating both the loss function and the regularization term to balance model complexity and fit. The integrated features are then categorized into phishing and benign by the meta-estimator, based on the learned weights, for the final prediction. Furthermore, XGBoost offers several advantages, including (i) the ability to handle missing values within the training set, (ii) the ability to work with extensive data that does not fit into memory, and (iii) faster computation through the use of multiple CPU cores.
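The role of the first and second derivatives of the loss in boosting can be illustrated with a from-scratch sketch that boosts depth-one trees (stumps) under the logistic loss. This is neither the XGBoost library nor the authors' meta-classifier: it is a simplified stand-in, and the two-column input (Phase I mean prediction, Phase II premier prediction), the hyperparameter defaults, and all function names are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def best_stump(X, g, h, lam, gamma):
    """Find the split with the highest regularized gain, or None."""
    G, H = sum(g), sum(h)
    best = None
    for f in range(len(X[0])):
        for thr in sorted({x[f] for x in X}):
            left = [i for i, x in enumerate(X) if x[f] <= thr]
            if not left or len(left) == len(X):
                continue
            GL = sum(g[i] for i in left)
            HL = sum(h[i] for i in left)
            GR, HR = G - GL, H - HL
            gain = 0.5 * (GL ** 2 / (HL + lam) + GR ** 2 / (HR + lam)
                          - G ** 2 / (H + lam)) - gamma
            if best is None or gain > best[0]:
                # Optimal leaf outputs: w = -G / (H + lambda) on each side.
                best = (gain, f, thr, -GL / (HL + lam), -GR / (HR + lam))
    return best


def fit_boosted_stumps(X, y, rounds=20, eta=0.3, lam=1.0, gamma=0.0):
    """Second-order boosting with the logistic loss on raw scores."""
    scores, stumps = [0.0] * len(X), []
    for _ in range(rounds):
        p = [sigmoid(s) for s in scores]
        g = [pi - yi for pi, yi in zip(p, y)]  # first derivative of the loss
        h = [pi * (1.0 - pi) for pi in p]      # second derivative of the loss
        found = best_stump(X, g, h, lam, gamma)
        if found is None or found[0] <= 0:
            break  # no split improves the regularized objective
        _, f, thr, w_left, w_right = found
        stumps.append((f, thr, eta * w_left, eta * w_right))
        scores = [s + (eta * w_left if x[f] <= thr else eta * w_right)
                  for s, x in zip(scores, X)]
    return stumps


def predict_proba(stumps, x):
    """Probability that x is phishing under the fitted ensemble."""
    return sigmoid(sum(wl if x[f] <= thr else wr for f, thr, wl, wr in stumps))
```

In practice the XGBoost library would be used for this step; the sketch only shows how the gradient and Hessian sums determine the leaf outputs and how the per-leaf penalty discourages splits with little gain.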
Deep learning involves many datasets and a significant time for model training. These models′ efficiency depends on the system resource specifications and the complexity of datasets. In order to identify phishing assaults, the time complexity is a crucial factor [
37]. The proposed method′s computational cost is based on how the characteristics are generated and extracted. URL and character level features extracted by our proposed method require logarithmic time complexity
. The extraction of such features during the model training and time complexity depends on the number of samples
and dimensions
. Accordingly, the time complexity of our proposed work is
.
4. Experiments
This paper utilized Python 2.7 to develop the suggested model and TensorFlow GPU v1.8.0 as a machine learning framework. The operating system is Windows 10 Pro Education, and the architecture was built using Python. This project′s Python packages and libraries to detect phishing URLs include Keras (built-in with TensorFlow), SciPy, Pandas, NumPy, Matplotlib, and Seaborn.
Support Vector Machine (SVM), Decision Tree (DT), Naive Bayes (GNB), Logistic Regression (LR), Sequential Minimal Optimization (SMO), and K-Nearest Neighbors (KNN) algorithms were evaluated for stacking in this work. In the first stage, LSTM is employed as the basic classifier for stacked generalization, and 10-fold cross-validation is utilized. In the second phase, the XGBoost classifier is utilized as a meta-estimator for the final prediction.
For the model′s effectiveness, the following statistical metrics are used to assess the proposed work for different purposes [
57].
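Precision-Recall Curve: A graph of the trade-off between precision and recall, used to assess the predictive model [57]. Precision and recall are computed for both the positive (phishing) and negative (benign) classes:
For Positive Precision: $$\text{Precision}_{+} = \frac{TP}{TP + FP}$$
For Negative Precision: $$\text{Precision}_{-} = \frac{TN}{TN + FN}$$
For Positive Recall: $$\text{Recall}_{+} = \frac{TP}{TP + FN}$$
For Negative Recall: $$\text{Recall}_{-} = \frac{TN}{TN + FP}$$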
where TP indicates the true positive, which means the number of URLs is correctly classified as phishing; in contrast, the parameter TN indicates the true negative, which means the number of URLs is correctly determined as benign. FP is a false positive, which means the number of benign URLs is wrongly classified as phishing, and FN is a false negative, which shows the number of phishing URLs classified as benign.
Mean Absolute Error (MAE): The average value for all absolute errors [
58].
Mean Square Error (MSE): The average value for all squared errors [
58].
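The evaluation metrics listed above can be computed from the confusion counts as in the following sketch. The 0.5 decision threshold, the function names, and the choice to compute MAE and MSE on predicted probabilities against the 0/1 labels are assumptions.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) with phishing encoded as 1 and benign as 0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def safe_div(num, den):
    """Avoid division by zero when a class is never predicted or present."""
    return num / den if den else 0.0


def evaluate(y_true, y_prob, threshold=0.5):
    """Precision and recall for both classes, plus accuracy, MAE, and MSE."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    n = len(y_true)
    return {
        "positive_precision": safe_div(tp, tp + fp),
        "negative_precision": safe_div(tn, tn + fn),
        "positive_recall": safe_div(tp, tp + fn),
        "negative_recall": safe_div(tn, tn + fp),
        "accuracy": safe_div(tp + tn, n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_prob)) / n,
        "mse": sum((t - p) ** 2 for t, p in zip(y_true, y_prob)) / n,
    }
```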