This version is not peer-reviewed.
Submitted: 01 November 2024
Posted: 04 November 2024
Research Gap | Description | Corrective Measures |
---|---|---|
1. Flexibility, interpretability, and generalizability [25]. | The cited model adapts poorly to different feature or data types; its dense layers make it harder to interpret the influence of individual features, and its reliance on a single deep network makes it prone to overfitting. | Our model uses a multi-branch design that integrates diverse architectures for richer feature extraction. Classifiers such as Logistic Regression, combined with PCA-based feature selection, give clearer insight into how decisions are made. Using multiple classifiers reduces overfitting and improves generalization across diverse lesion types. |
2. Feature extraction and attention mechanism [26]. | The cited architecture uses only ResNet50 for feature extraction. Although ResNet50 is powerful, its features are limited to those learned by a single model. The model also lacks an attention mechanism, which is important in medical imaging for focusing on the most critical regions (lesions). | Our multi-branch approach leverages both EfficientNetB0 and MobileNetV2, two different architectures pre-trained on large-scale datasets, yielding a richer and more diverse feature representation that better captures patterns across lesion types. Attention layers in both branches let the model prioritize key areas of the image, improving the detection of subtle lesions. |
3. Limited model diversity and class imbalance handling [27]. | The cited algorithm uses two pre-trained CNN models (DenseNet and ResNet), both powerful image feature extractors. However, because both are deep convolutional architectures, they may produce similar kinds of features. The algorithm also does not mention handling class imbalance, a common problem in medical datasets. | Our model captures a wider diversity of features, potentially generalizing better on medical images with subtle lesion patterns. It explicitly handles class imbalance by applying class weights during training, so that all lesion classes (CIN1, CIN2, CIN3) are properly represented and the model does not become biased toward the majority class. |
4. Handling lesion cases and dropout [28]. | The cited model focuses on general feature extraction and classification and does not mention specific mechanisms for class imbalance or lesion-specific processing. It uses dropout (probability 0.1) to prevent overfitting. Although dropout is effective, it works by randomly disabling neurons, which can discard important feature representations. | The attention mechanism in our model helps it focus on the critical lesion-related regions, leading to more accurate predictions across lesion classes. It also offers a more targeted way to limit overfitting by emphasizing only the important features. |
5. Absence of class-specific fine-tuning [29]. | The cited algorithm does not mention any fine-tuning or class-specific optimization of the CNN models. It relies on the pre-trained models to extract general features, which may not be tailored to the dataset. | Our approach fine-tunes EfficientNetB0 and MobileNetV2 and also applies attention mechanisms to focus on the most relevant lesion-specific regions. This helps the network adapt to the cervical cancer dataset and produce more accurate predictions for CIN1, CIN2, and CIN3 [2]. |
Reference (Year) | Dataset | Method | Accuracy | Remarks |
---|---|---|---|---|
[30] (2024) | Colposcopy | Deep Learning | 94.55% | Uses a hybrid deep neural network for segmentation. |
[31] (2024) | All cancer images | Deep Learning | 99% | Uses a hybrid of pre-trained CNN, CNN-LSTM, machine learning, and deep neural classifiers. |
[32] (2024) | Pap Smear | Deep Learning | 93% | Presents CerviSegNet-DistillPlus as a powerful, efficient, and accessible tool for early cervical cancer diagnosis. |
[33] (2023) | Pap Smear | Deep Learning | N/A | Employs MLNet, a lightweight deep learning network based on metaheuristics. |
[34] (2023) | Pap Smear | Deep Learning | 99.22% | Uses deep learning integrated with MixUp, CutOut, and CutMix. |
[25] (2023) | Colposcopy | Deep Learning | 92% | Uses a predictive deep learning model. |
[35] (2022) | Colposcopy | CNN | 87% | Uses a CNN with a weighted loss function. |
[36] (2022) | – | ANN | 98.87% | Applies artificial jellyfish search to an ANN. |
[37] (2022) | Colposcopy | CNN, SVM | 80% | Uses an ensemble of U-Net and SVM. |
[38] (2022) | Colposcopy and histopathological images | AI | N/A | A review of AI applications in cervical cancer screening. |
[39] (2021) | Colposcopy | Deep Learning | 92% | Uses deep neural techniques for cervical cancer classification. |
[8] (2021) | Colposcopy | Deep Learning | 90% | Uses attention maps generated by a deep neural network for segmentation. |
[40] (2021) | Colposcopy | Residual Learning | 90%, 99% | Employs a residual network with Leaky ReLU and PReLU for classification. |
[41] (2021) | MR-CT Images | GAN | N/A | Uses a conditional generative adversarial network (GAN). |
[42] (2021) | Pap Smear | Biosensors | N/A | Uses biosensors for higher accuracy. |
[43] (2021) | Colposcopy | CNN | 99% | Uses Faster Small-Object Detection Neural Networks. |
[44] (2021) | Pap Smear | Deep Convolutional Neural Network | 95.628% | Constructs a CNN called DeepCELL with multiple kernels of varying sizes. |
[45] (2020) | MRI Data of Cervix | Statistical Model | N/A | Uses a statistical model called LM for outlier detection in lognormal distributions. |
[46] (2020) | Colposcopy | CNN | 81.95% | Employs a graph convolutional network with edge features (E-GCN). |
[47] (2020) | Colposcopy | CNN | N/A | Uses a Squeeze-Excitation Convolutional Neural Network (SE-CNN) to capture depth features across the entire image, with the SE module providing targeted feature recalibration. A Region Proposal Network (RPN) generates proposal boxes to locate regions of interest (ROI). |
[48] (2020) | Colposcopy | Pre-trained DenseNet | 96.13% | Fine-tunes all layers of DenseNet convolutional neural networks pre-trained on two datasets (ImageNet and Kaggle). |
[49] (2020) | Colposcopy | CNN | 96.13% | Uses a recurrent convolutional neural network to classify cervigrams. |
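Metric | Formula | Description |
---|---|---|
Recall | $\frac{TP}{TP + FN}$ | The proportion of actual positives that are correctly detected |
Precision | $\frac{TP}{TP + FP}$ | The proportion of predicted positives that are truly positive |
F1 Score | $\frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$ | Harmonic mean of precision and recall |
Accuracy | $\frac{TP + TN}{TP + TN + FP + FN}$ | The proportion of correct predictions over all instances |
Here TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives, respectively.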
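As an illustration of how these four metrics can be computed for a three-class problem, the following minimal sketch derives per-class precision, recall, and F1 (one-vs-rest) and overall accuracy from a confusion matrix. The function names and the confusion matrix values are hypothetical and are not taken from the paper's experiments.

```python
from typing import Dict, List


def per_class_metrics(cm: List[List[int]], labels: List[str]) -> Dict[str, Dict[str, float]]:
    """One-vs-rest precision, recall, and F1 for each class.

    cm[i][j] is the number of samples whose true class is i and
    whose predicted class is j.
    """
    results: Dict[str, Dict[str, float]] = {}
    for k, name in enumerate(labels):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                  # true class k, predicted as another class
        fp = sum(row[k] for row in cm) - tp   # another class, predicted as k
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results[name] = {"precision": precision, "recall": recall, "f1": f1}
    return results


def overall_accuracy(cm: List[List[int]]) -> float:
    """Correct predictions (the diagonal) divided by all predictions."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total if total else 0.0


# Hypothetical confusion matrix (rows: true class, columns: predicted class).
example_cm = [
    [95, 3, 2],
    [4, 94, 2],
    [1, 2, 97],
]
for grade, scores in per_class_metrics(example_cm, ["CIN1", "CIN2", "CIN3"]).items():
    print(grade, {name: round(value, 4) for name, value in scores.items()})
print("accuracy", round(overall_accuracy(example_cm), 4))
```

In the multi-class setting, each grade is treated as the positive class in turn, so TN for a grade is every sample that neither belongs to it nor is predicted as it.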
CIN Grade | Lugol’s Iodine (Original) | Lugol’s Iodine (Augmented) | Acetic Acid (Original) | Acetic Acid (Augmented) | Normal Saline (Original) | Normal Saline (Augmented) |
---|---|---|---|---|---|---|
CIN1 | 114 | 898 | 98 | 691 | 83 | 347 |
CIN2 | 119 | 945 | 101 | 700 | 84 | 365 |
CIN3 | 121 | 1005 | 107 | 705 | 86 | 353 |
CIN1 (Original) | CIN1 (Augmented) | CIN2 (Original) | CIN2 (Augmented) | CIN3 (Original) | CIN3 (Augmented) |
---|---|---|---|---|---|
900 | 1112 | 930 | 2009 | 960 | 2879 |
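The augmented counts above are unevenly distributed across the three grades, which is the situation class weighting is meant to address. The sketch below shows one common heuristic, inverse-frequency ("balanced") weights, applied to those counts. The paper states that class weights are used during training but does not give the formula, so the weighting scheme and the function name here are assumptions.

```python
from typing import Dict


def balanced_class_weights(counts: Dict[str, int]) -> Dict[str, float]:
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumed heuristic, not confirmed by the paper. Rarer classes
    receive proportionally larger weights.
    """
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Augmented image counts per grade, taken from the table above.
augmented_counts = {"CIN1": 1112, "CIN2": 2009, "CIN3": 2879}

weights = balanced_class_weights(augmented_counts)
for label, weight in weights.items():
    print(f"{label}: {weight:.3f}")
```

With these weights, each class contributes the same total weight to the loss (count times weight equals 2000 for every grade), so the majority grade CIN3 no longer dominates training.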
CIN Grades | Precision | Recall | F1-Score |
---|---|---|---|
CIN 1 | 0.9733 | 0.9723 | 0.9728 |
CIN 2 | 0.9740 | 0.9750 | 0.9745 |
CIN 3 | 0.9769 | 0.9798 | 0.9783 |
Classifier | Fold | Accuracy | Precision | Recall | F1 Score |
---|---|---|---|---|---|
Logistic Regression | Fold 1 | 0.9801 | 0.9800 | 0.9789 | 0.9798 |
Logistic Regression | Fold 2 | 0.9804 | 0.9801 | 0.9799 | 0.9798 |
Logistic Regression | Fold 3 | 0.9810 | 0.9804 | 0.9801 | 0.9800 |
Logistic Regression | Fold 4 | 0.9806 | 0.9800 | 0.9798 | 0.9799 |
Logistic Regression | Fold 5 | 0.9803 | 0.9801 | 0.9801 | 0.9805 |
XGBoost | Fold 1 | 0.9903 | 0.9843 | 0.9844 | 0.9843 |
XGBoost | Fold 2 | 0.9910 | 0.9903 | 0.9900 | 0.9906 |
XGBoost | Fold 3 | 0.9904 | 0.9900 | 0.9901 | 0.9902 |
XGBoost | Fold 4 | 0.9901 | 0.9902 | 0.9900 | 0.9905 |
XGBoost | Fold 5 | 0.9906 | 0.9900 | 0.9901 | 0.9903 |
CatBoost | Fold 1 | 0.9904 | 0.9901 | 0.9900 | 0.9902 |
CatBoost | Fold 2 | 0.9901 | 0.9904 | 0.9901 | 0.9902 |
CatBoost | Fold 3 | 0.9908 | 0.9905 | 0.9903 | 0.9903 |
CatBoost | Fold 4 | 0.9906 | 0.9902 | 0.9901 | 0.9902 |
CatBoost | Fold 5 | 0.9905 | 0.9903 | 0.9901 | 0.9902 |
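The per-fold results above come from a five-fold evaluation. As a rough sketch of how such folds can be built while preserving the CIN1/CIN2/CIN3 proportions in each fold, the following pure-Python function produces stratified train/test index splits. The paper does not say whether its folds were stratified or how they were seeded, so the stratification, the seed, and the function name are all assumptions.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def stratified_kfold(
    labels: Sequence[str], k: int = 5, seed: int = 0
) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) pairs with per-class balance.

    Indices of each class are shuffled and dealt round-robin into k folds,
    so every fold receives a near-equal share of every class.
    """
    rng = random.Random(seed)
    by_class: Dict[str, List[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds: List[List[int]] = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    splits = []
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        splits.append((train_idx, test_idx))
    return splits


# Toy label list (hypothetical): 10 samples per grade.
toy_labels = ["CIN1"] * 10 + ["CIN2"] * 10 + ["CIN3"] * 10
for fold_no, (train_idx, test_idx) in enumerate(stratified_kfold(toy_labels), start=1):
    print(f"Fold {fold_no}: {len(train_idx)} train, {len(test_idx)} test")
```

Each classifier would then be fitted on the training indices of a fold and scored on its test indices, giving one row of the table per fold.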
Classifier | Validation Accuracy | CIN1 Precision | CIN1 Recall | CIN1 F1-Score | CIN2 Precision | CIN2 Recall | CIN2 F1-Score | CIN3 Precision | CIN3 Recall | CIN3 F1-Score |
---|---|---|---|---|---|---|---|---|---|---|
Logistic Regression | 0.9823 | 0.9815 | 0.9811 | 0.9814 | 0.9902 | 0.9900 | 0.9901 | 0.9904 | 0.9903 | 0.9902 |
XGBoost | 0.9905 | 0.9899 | 0.9901 | 0.9901 | 0.9904 | 0.9903 | 0.9904 | 0.9907 | 0.9904 | 0.9904 |
CatBoost | 0.9909 | 0.9901 | 0.9900 | 0.9901 | 0.9905 | 0.9903 | 0.9904 | 0.9908 | 0.9906 | 0.9905 |
Ensemble | 0.9985 | 0.9995 | 0.9994 | 0.9993 | 0.9996 | 0.9994 | 0.9998 | 0.9998 | 0.9997 | 0.9996 |
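The table does not state how the ensemble combines the three base classifiers. One plausible scheme is soft voting, in which the class probabilities predicted by each classifier are averaged and the highest-scoring grade is chosen. The sketch below illustrates that idea only; soft voting, the optional weights, the function names, and the probability values are all assumptions rather than the authors' confirmed method.

```python
from typing import List, Optional, Sequence


def soft_vote(
    prob_sets: Sequence[Sequence[float]], weights: Optional[Sequence[float]] = None
) -> List[float]:
    """Weighted average of per-classifier class-probability vectors."""
    if weights is None:
        weights = [1.0] * len(prob_sets)
    total_weight = sum(weights)
    n_classes = len(prob_sets[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, prob_sets)) / total_weight
        for c in range(n_classes)
    ]


def predict_label(avg_probs: Sequence[float], labels: Sequence[str]) -> str:
    """Pick the label with the highest averaged probability."""
    best = max(range(len(avg_probs)), key=lambda c: avg_probs[c])
    return labels[best]


grades = ["CIN1", "CIN2", "CIN3"]
# Hypothetical probabilities for a single image from three base classifiers.
lr_probs = [0.50, 0.30, 0.20]
xgb_probs = [0.20, 0.50, 0.30]
cat_probs = [0.30, 0.45, 0.25]

averaged = soft_vote([lr_probs, xgb_probs, cat_probs])
print([round(p, 4) for p in averaged], predict_label(averaged, grades))
```

In this toy case Logistic Regression alone would predict CIN1, but the averaged probabilities favour CIN2, which shows how an ensemble can overrule a single dissenting classifier.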
CIN Grade | Lugol’s Iodine Prec. | Lugol’s Iodine Recall | Lugol’s Iodine F1 | Acetic Acid Prec. | Acetic Acid Recall | Acetic Acid F1 | Normal Saline Prec. | Normal Saline Recall | Normal Saline F1 | Malhari Prec. | Malhari Recall | Malhari F1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
CIN1 | 0.97 | 0.97 | 0.97 | 0.96 | 0.95 | 0.96 | 0.95 | 0.95 | 0.95 | 0.98 | 0.98 | 0.98 |
CIN2 | 0.97 | 0.97 | 0.97 | 0.96 | 0.97 | 0.97 | 0.95 | 0.94 | 0.95 | 0.98 | 0.98 | 0.98 |
CIN3 | 0.98 | 0.98 | 0.97 | 0.97 | 0.96 | 0.97 | 0.96 | 0.95 | 0.96 | 0.98 | 0.98 | 0.98 |
Model | Val. Acc. | CIN1 Prec. | CIN1 Recall | CIN1 F1 | CIN2 Prec. | CIN2 Recall | CIN2 F1 | CIN3 Prec. | CIN3 Recall | CIN3 F1 |
---|---|---|---|---|---|---|---|---|---|---|
[25] | 0.97 | 0.95 | 0.94 | 0.95 | 0.96 | 0.96 | 0.95 | 0.97 | 0.96 | 0.96 |
[26] | 0.96 | 0.94 | 0.92 | 0.93 | 0.94 | 0.93 | 0.93 | 0.95 | 0.94 | 0.94 |
[27] | 0.95 | 0.91 | 0.90 | 0.91 | 0.92 | 0.91 | 0.91 | 0.93 | 0.92 | 0.92 |
ResNet50 (baseline) | 0.92 | 0.90 | 0.89 | 0.90 | 0.91 | 0.90 | 0.91 | 0.91 | 0.90 | 0.91 |
DenseNet-121 (baseline) | 0.93 | 0.91 | 0.90 | 0.91 | 0.91 | 0.91 | 0.91 | 0.92 | 0.91 | 0.92 |
EfficientNetB0 | 0.92 | 0.90 | 0.89 | 0.90 | 0.91 | 0.89 | 0.90 | 0.91 | 0.90 | 0.91 |
MobileNetV2 | 0.91 | 0.90 | 0.88 | 0.89 | 0.90 | 0.89 | 0.90 | 0.90 | 0.90 | 0.90 |
EfficientNetB0 + MobileNetV2 | 0.94 | 0.91 | 0.90 | 0.91 | 0.92 | 0.91 | 0.92 | 0.93 | 0.91 | 0.92 |
EfficientNetB0 + MobileNetV2 (with attention) | 0.95 | 0.93 | 0.92 | 0.92 | 0.93 | 0.92 | 0.92 | 0.94 | 0.93 | 0.94 |
Proposed approach | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.