1. Introduction
Nowadays, medical image processing is an essential component of many medical research fields. Among the different types of medical images, computed tomography (CT) is widely used in medical diagnosis and prognosis, as it is inexpensive and widely available while providing valuable supplementary information. CT is an imaging method that uses X-rays to create cross-sectional pictures (slices) of a body organ and its structure. CT imaging has the advantage of eliminating overlapping structures, making the internal anatomy more apparent. Therefore, it can be used to identify disease or injury within various regions of the body. For example, brain CT images can be used to locate injuries, tumors, hemorrhage, and other conditions in the head, while lung CT images reveal the presence of tumors, pulmonary embolisms (blood clots), excess fluid, and other conditions such as emphysema or pneumonia.
In this paper, we focus on lung CT images, with the aim of classifying different diseases in them. This is done via one of the most powerful tools in deep learning, the convolutional neural network (CNN). Given the developments in CT and many other medical imaging techniques, using convolutional neural networks for medical image classification is a very active research topic.
On the other hand, the 2019 novel coronavirus, which was first recognized in Wuhan city of Hubei province of China and spread quickly and exponentially beyond the Chinese borders [1], has common symptoms such as cough, shortness of breath, chest pain, and diarrhea [2]. Given the critical circumstances of the coronavirus pandemic, it is essential to differentiate early between patients with and without the disease [3]. Currently, reverse-transcription polymerase chain reaction (RT-PCR) is the standard test for diagnosis of COVID-19 infection [4]. In practice, however, it may take more than 24 hours to obtain a test result [5].
CT is widely suggested as a clinical test alongside the RT-PCR test [6] for diagnosing the existence and severity of coronavirus symptoms [7]. Although X-ray images can also serve as a second imaging tool in COVID-19 diagnosis (as in the studies in [8,9,10,11,12]), using CT alongside the RT-PCR test has many advantages. For example, peripheral ground-glass areas, which are a hallmark of early COVID-19, can be detected in CT images but are easily missed in chest X-rays [13].
After CT scanning, the chest CT image must be interpreted by a radiologist to detect the symptoms. This can become tedious when many patients must be examined and may lead to unintentional mistakes. Here, machine learning systems trained on these images can be a great help in easing medical decisions and saving time. Currently, there are many successful deep models that perform classification and segmentation tasks on CT images and can diagnose abnormalities without radiologist intervention.
As an example, [14] designed an optimized convolutional neural network to classify patients into infected and non-infected categories based on CT images. In another attempt, [15] proposed a novel deep convolutional algorithm based on multi-objective differential evolution (MODE) to separate COVID-19 infected patients from non-infected ones. Also, the authors of [16] proposed an architecture based on the DenseNet model to classify lung CT images into COVID-19 and healthy classes, where misclassified cases in the test dataset were assessed by a radiologist. Despite the excellent performance of these models in classifying lung CT images and detecting COVID-19 symptoms, they give no information about other abnormalities that could be diagnosed from these images.
In a different study, [17] proposed a fully automatic deep learning system with a transfer learning approach for the diagnosis and prognosis of COVID-19. This deep model was first trained using CT images of patients with lung cancer; then, COVID-19 patients were enrolled to re-train the network. The trained network was externally validated on four large multi-regional COVID-19 datasets. Although this research achieved interesting results, its initial training was not done using COVID-19 CT images.
Some successful studies have tried to distinguish between COVID-19 and other lung diseases. The work in [18] used patients with positive RT-PCR tests as samples of the COVID-19 class and patients with community-acquired pneumonia (CAP) as samples of the other class, and proposed a 3D convolutional model to separate these two classes. Also, Xu et al. [19] proposed a deep learning system to screen CT images in three steps. First, abnormal regions were segmented using a 3D deep learning model; then, the segmented images were categorized into three classes (COVID-19, Influenza-A viral pneumonia, and healthy) by a ResNet model. Finally, the infection type and total confidence score of each CT image were calculated with a Noisy-or Bayesian function.
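The Noisy-or combination used in the last step of [19] aggregates per-region infection probabilities into one image-level confidence: the image is flagged as infected unless every region independently fails to indicate infection. A minimal sketch of this rule (the function and variable names are illustrative, not from the original system):

```python
def noisy_or(region_probs):
    """Combine per-region infection probabilities with the Noisy-or rule.

    The image-level confidence is 1 minus the probability that no
    region indicates infection, assuming regions are independent.
    """
    p_none = 1.0
    for p in region_probs:
        p_none *= (1.0 - p)
    return 1.0 - p_none

# A single highly suspicious region dominates the overall score:
print(noisy_or([0.1, 0.2, 0.9]))  # -> 0.928 (approximately)
```

Note how the rule is asymmetric by design: many mildly suspicious regions raise the score slowly, while one near-certain region pushes it close to 1.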
In a more general and interesting study, Abbasian et al. [20] used ten different deep CNNs to classify chest CT slices of patients into three classes. Their research aimed to compare the capabilities of different deep networks in confronting pulmonary infections. Their study showed that the ResNet-101 and Xception networks achieved the best performance in terms of AUC and sensitivity, exceeding the radiologists, whose performance was moderate. The authors concluded that deep CNNs can be considered a high-performance technique for diagnosing lung infections and can be used as a collaborator in hospitals. This was an expected conclusion, as deep networks have proved their proficiency in many learning areas, and our proposed network is a step in this direction.
Here, it is worth mentioning that although the diagnosis of COVID-19 is critical, the importance of identifying other lung diseases such as lung cancer should not be ignored. According to WHO reports, lung cancer has been a leading cause of death worldwide in recent years [21]. Even though the risk of other infections in patients with lung cancer is still to be determined [22], due to their inherent characteristics and care needs, these patients are more vulnerable to other infections than others. In addition, ground-glass opacities or patchy consolidations, which are the most frequent indications of COVID-19 in chest CT images [23], may also appear in the early stages of radiation pneumonitis (RP) in patients with lung cancer and are not easily distinguishable from each other.
The study in [24] was the first to use a multi-class deep learning model for diagnosing COVID-19, pneumonia, and lung cancer. In that research, the authors had no access to a single dataset that included images from all these classes, so CT images of the different classes were collected from different public sources. They considered five deep architectures to classify these public chest X-ray and CT datasets, and based on their experiments, the VGG19+CNN model achieved the best result. Although their work was a pioneering and valuable one, it is doubtful whether images collected from different sources and imaging devices can together form a compatible and reliable dataset. In other words, images collected from each medical imaging device carry some device-specific characteristics that might interfere with learning algorithms. According to the experiments in [25], even a single-source dataset seems to have a solid built-in bias. Also, based on the research in [26], if there is any bias in the collected images, such as corner labels and typical characteristics of a medical device, the classification model may easily learn to recognize these biases in the different classes rather than focusing on the main features separating them.
Therefore, based on all of the above, in this paper we aim to tackle the worldwide challenge of classifying chest CT images using a lightweight and fast deep CNN-based architecture trained on an unbiased dataset. The proposed CNN architecture is an adapted version of the one developed in [27] for classifying brain CT images. To the best of our knowledge, our proposed method (shown in Figure 1) is the second supervised deep learning model in the area of multi-class chest CT image classification (after the work done in [24]) that includes COVID-19, lung cancer, and normal images (see samples in Figure 2). An important point about our work is that all 10,715 chest CT images were collected from one imaging device with the same device configuration, to limit the risk of bias in the images and achieve a reliable dataset.
In short, the main contributions of this work are as follows:
A novel lightweight CNN-based model with Adaptive Feature EXtraction layers (AFEX-Net) is proposed for the classification of chest CT images into three classes.
The proposed network has an adaptive pooling strategy with adaptive activation functions, increasing model robustness.
The proposed network has few parameters compared to other CNN models used in this area (e.g., ResNet50 and VGG16), with faster training while preserving accuracy. The low computational cost of the proposed model makes it highly attractive in the clinic.
The proposed model has been evaluated on chest CT images collected from a single origin to limit the risk of learning bias.
Hence, three significant properties of the proposed network (a low number of parameters, adaptivity, and robustness) make it an easy-to-train and efficient network for use with clinical data.
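To illustrate what an adaptive (learnable) activation function can look like, the sketch below implements a PReLU-style unit with a trainable slope for negative inputs. This is only a generic example of the concept; the exact adaptive activation and pooling layers of AFEX-Net are defined in Section 2.2, and the parameter name `alpha` is ours, not the paper's.

```python
import numpy as np

def adaptive_relu(x, alpha):
    """PReLU-style adaptive activation: identity for x >= 0 and a
    learnable linear slope `alpha` for x < 0. During training,
    `alpha` is updated by backpropagation like any other weight."""
    return np.where(x >= 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(adaptive_relu(x, alpha=0.1))  # -> [-0.2, -0.05, 0., 1., 3.]
```

With a fixed `alpha = 0` this reduces to the ordinary ReLU; making the slope trainable lets each layer adapt its nonlinearity to the data.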
The rest of this article is organized as follows. Section 2 explains (a) the preprocessing steps used to build an unbiased dataset (Section 2.1) and (b) the proposed deep CNN in detail (Section 2.2). Section 3 provides the results obtained with the proposed network and discusses its performance compared to other efficient networks. Finally, the conclusion is presented in Section 4.
Figure 1.
Overview of the framework of the proposed method. It has been investigated on two different datasets: the first (collected by the authors) has three classes (COVID-19, Cancer, and Normal), while the second has only two classes (COVID-19 and Normal).
Figure 2.
Examples of chest CT images from subjects of the COVID-19 (cov), Cancer (can), and Normal (no) classes.
Figure 3.
Preprocessing steps done on CT images.
Figure 4.
Framework of AFEX-Net for classification between COVID-19, Cancer, and Normal CT lung images (BN: Batch Normalization, Conv: Convolution layer, A-Pool: Adaptive Pooling layer, AAF: Adaptive Activation Function, D: Dropout, F: Flatten, FC: Fully Connected layer, S: Softmax classifier).
Figure 5.
The training accuracy (left) and training loss (right) comparison of the four models on the CCs dataset. The horizontal axis shows the number of epochs, while the vertical axis shows the model accuracy and loss in the left and right plots, respectively.
Figure 6.
Visual comparison among the average performance of four models on the test set of CCs dataset.
Figure 7.
The training accuracy (left) and training loss (right) comparison of AFEX-Net and AFEX-Net* on the COVID-CTset dataset. The horizontal axis shows the number of epochs while the vertical axis shows the model accuracy and loss in left and right plots, respectively.
Figure 8.
Visual comparison among the average performance of six models on the test set of COVID-CTset dataset.
Figure 9.
Confusion matrices for the four trained networks on the CCs dataset.
Figure 10.
Confusion matrices for the two trained networks on the COVID-CTset dataset.
Table 1.
CNN Architecture Used in AFEX-Net (BN: Batch Normalization, Conv: Convolution layer, A-Pool: Adaptive Pooling layer, AAF: Adaptive Activation Function, FC: Fully Connected layer).
Layer | Type | Filter-size | Strides | Filters | Output Shape | Parameters
1 | BN | - | - | - | (200, 200, 1) | 4
2 | Conv | | (4,4) | 96 | (48, 48, 96) | 11712
3 | AAF | - | - | - | (48, 48, 96) | 221184
4 | A-pool | | (3,3) | - | (16, 16, 96) | 24576
5 | BN | - | - | - | (16, 16, 96) | 384
6 | Conv | | (2,2) | 256 | (8, 8, 256) | 614656
7 | AAF | - | - | - | (8, 8, 256) | 16384
8 | A-pool | | (3,3) | - | (8, 8, 256) | 2304
9 | BN | - | - | - | (4, 4, 256) | 1024
10 | Conv | | (3,3) | 384 | (4, 4, 384) | 885120
11 | Conv | | (2,2) | 256 | (2, 2, 256) | 884992
12 | AAF | - | - | - | (2, 2, 256) | 1024
13 | A-pool | | (3,3) | - | (1, 1, 256) | 256
14 | BN | - | - | - | (1, 1, 256) | 1024
15 | Conv | | (2,2) | 192 | (1, 1, 192) | 1769664
16 | Dropout | - | - | - | (1, 1, 192) | 0
17 | Conv | | (2,2) | 96 | (1, 1, 96) | 18528
18 | Dropout | - | - | - | (1, 1, 96) | 0
19 | Flatten | - | - | - | (96) | 0
20 | FC | - | - | - | (3) | 291
21 | Softmax | - | - | - | (3) | 0
Total params: 4,453,127; Trainable params: 4,451,909; Non-trainable params: 1,218.
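The convolution rows in Table 1 follow the standard parameter-count formula for a 2D convolution with bias, (kernel_h * kernel_w * in_channels + 1) * filters, which can be used to sanity-check the table. The kernel sizes below are not legible in the extracted table; they are inferred values that happen to reproduce the listed counts, so treat them as assumptions rather than the paper's specification.

```python
def conv_params(kernel, in_channels, filters):
    """Parameter count of a 2D convolution with bias:
    (kernel_h * kernel_w * in_channels + 1) * filters."""
    kh, kw = kernel
    return (kh * kw * in_channels + 1) * filters

# Inferred kernel sizes consistent with Table 1 (assumptions):
print(conv_params((11, 11), 1, 96))    # layer 2  -> 11712
print(conv_params((5, 5), 96, 256))    # layer 6  -> 614656
print(conv_params((3, 3), 256, 384))   # layer 10 -> 885120
print(conv_params((3, 3), 384, 256))   # layer 11 -> 884992
print(conv_params((1, 1), 192, 96))    # layer 17 -> 18528
```

Matching every listed count this way is a quick way to verify that a reimplementation of the architecture agrees with the published table.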
Table 2.
Comparing the three AFEX-Net, ResNet50 and VGG16 convolutional networks based on the number of parameters and training time.
Model | Trainable | Non-trainable | Total | Training Time | Epochs
AFEX-Net | 4,451,909 | 1,218 | 4,453,127 | 45m | 100
ResNet50 | 23,534,467 | 53,120 | 23,587,587 | 1h:47m | 100
VGG16 | 107,008,707 | 0 | 107,008,707 | 2h:16m | 100
Table 3.
Models’ training parameters for CC dataset.
Model | Optimizer | Learning rate | Epochs
AFEX-Net | Adam | | 100
ResNet50 | Adam | | 100
VGG16 | Adam | | 100
Table 4.
Average of the obtained results from the four models on test data of CCs dataset.
Model | Class | Sensitivity | Specificity | Precision | Overall loss | Overall Accuracy
AFEX-Net | COVID-19 | 99.89 | 99.80 | 99.79 | 1.04 ± 0.18 | 99.71 ± 0.05
 | Cancer | 99.53 | 99.93 | 99.81 | |
 | Normal | 100 | 99.96 | 99.96 | |
 | Cancer | 98.78 | 99.76 | 99.34 | |
 | Normal | 99.89 | 100 | 100 | |
ResNet50 | COVID-19 | 88.65 | 97.45 | 97.59 | 24.31 ± 2.62 | 92.81 ± 0.75
 | Cancer | 94.2 | 91.39 | 74.6 | |
 | Normal | 96 | 99.49 | 98.46 | |
VGG16 | COVID-19 | 99.74 | 99.65 | 99.64 | 2.6 ± 0.18 | 99.50 ± 0.05
 | Cancer | 99.25 | 100 | 100 | |
 | Normal | 99.89 | 99.76 | 99.28 | |
Table 5.
Average of the obtained results from the six models on test data of COVID-CTset dataset.
Model | Overall Accuracy | Sensitivity | Specificity | Precision
AFEX-Net | 99.25 | 97.69 | 99.79 | 98.83
AFEX-Net* | 98.95 | 96.50 | 99.43 | 96.78
ResNet50V2 | 97.52 | 69.44 | 99.87 | 97.98
Xception | 96.55 | 61.71 | 99.88 | 98.02
[42] | 98.49 | 80.91 | 99.69 | 94.77
[47] | 85.4 | 86.49 | 84.36 | 83.9
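The per-class metrics reported in Tables 4 and 5 follow the usual one-vs-rest definitions. The sketch below shows how they are computed from a confusion matrix; the 2x2 matrix used in the example is invented for illustration, not data from the paper.

```python
def one_vs_rest_metrics(cm, cls):
    """Sensitivity, specificity, and precision for class `cls` of a
    confusion matrix `cm`, where cm[i][j] counts samples of true
    class i predicted as class j."""
    n = len(cm)
    tp = cm[cls][cls]
    fn = sum(cm[cls][j] for j in range(n)) - tp
    fp = sum(cm[i][cls] for i in range(n)) - tp
    tn = sum(sum(row) for row in cm) - tp - fn - fp
    return {
        "sensitivity": tp / (tp + fn),   # recall / true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "precision":   tp / (tp + fp),
    }

# Invented 2-class example: 95 of 100 positives and 90 of 100
# negatives classified correctly.
cm = [[95, 5], [10, 90]]
print(one_vs_rest_metrics(cm, 0))
# sensitivity 0.95, specificity 0.90, precision ~0.905
```

For the three-class tables, the same function is applied once per class, treating that class as "positive" and the other two as "negative".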