Version 1: Received: 1 November 2024 / Approved: 3 November 2024 / Online: 4 November 2024 (10:55:32 CET)
How to cite:
Chatterjee, P.; Siddiqui, S. Multi-Modal Deep Learning Architecture for Improved Colposcopy Image Classification. Preprints 2024, 2024110150. https://doi.org/10.20944/preprints202411.0150.v1
APA Style
Chatterjee, P., & Siddiqui, S. (2024). Multi-Modal Deep Learning Architecture for Improved Colposcopy Image Classification. Preprints. https://doi.org/10.20944/preprints202411.0150.v1
Chicago/Turabian Style
Chatterjee, P., and Shadab Siddiqui. 2024. "Multi-Modal Deep Learning Architecture for Improved Colposcopy Image Classification." Preprints. https://doi.org/10.20944/preprints202411.0150.v1
Abstract
Colposcopy image classification is vital for early cervical cancer detection, yet it remains challenging due to the significant variation in lesion appearance. Although deep learning models have advanced medical image classification, few studies have explored combining different model architectures to enhance diagnostic accuracy in colposcopy. This study addresses this gap by proposing a lesion-specific, multi-branch architecture that integrates attention mechanisms, deep feature extraction, and ensemble learning. Multi-task learning is employed to manage multiple lesion-specific classification tasks, while an ensemble of classifiers (Logistic Regression, XGBoost, and CatBoost) enhances decision-making accuracy. The architecture includes deep learning branches built on EfficientNetB0 and MobileNetV2 for rich feature extraction from colposcopy images, with their outputs combined through a soft-voting ensemble. Hyperparameter tuning, k-fold cross-validation, PCA visualization, and multiclass AUC plots were used to optimize and assess model effectiveness. Training and validation accuracy were tracked at two stages: after the deep-learning training phase, training accuracy reached 97.85% and validation accuracy 97.33%; after the final ensemble classification, training accuracy improved to 99.95% and validation accuracy to 99.85%, surpassing the individual models and demonstrating enhanced generalization. This model shows substantial promise for improving colposcopy classification accuracy, providing a valuable tool for clinical decision support in cervical cancer diagnosis.
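As a rough illustration of the pipeline described above (two CNN feature-extraction branches fused and passed to a soft-voting ensemble of Logistic Regression, XGBoost, and CatBoost), the sketch below uses standard Keras, scikit-learn, XGBoost, and CatBoost components. The 224x224 input size, concatenation-based fusion, placeholder data, and classifier settings are illustrative assumptions rather than the authors' exact configuration, and the attention and multi-task components mentioned in the abstract are omitted.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications import EfficientNetB0, MobileNetV2
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import VotingClassifier
from xgboost import XGBClassifier
from catboost import CatBoostClassifier


def build_feature_extractor(input_shape=(224, 224, 3)):
    """Two CNN branches (EfficientNetB0, MobileNetV2) whose pooled features are concatenated."""
    inputs = tf.keras.Input(shape=input_shape)  # raw images in [0, 255]
    # EfficientNetB0 in tf.keras rescales its inputs internally.
    eff_feat = EfficientNetB0(include_top=False, weights="imagenet", pooling="avg")(inputs)
    # MobileNetV2 expects inputs in [-1, 1], so rescale that branch explicitly.
    mob_in = tf.keras.layers.Rescaling(scale=1.0 / 127.5, offset=-1.0)(inputs)
    mob_feat = MobileNetV2(include_top=False, weights="imagenet", pooling="avg")(mob_in)
    fused = tf.keras.layers.Concatenate()([eff_feat, mob_feat])  # 1280 + 1280 features
    return tf.keras.Model(inputs, fused)


# Placeholder data standing in for preprocessed colposcopy images and lesion labels.
X_images = np.random.randint(0, 256, size=(16, 224, 224, 3)).astype("float32")
y_labels = np.arange(16) % 3  # e.g., three lesion classes

# Extract deep features with the two-branch CNN.
extractor = build_feature_extractor()
X_feat = extractor.predict(X_images, verbose=0)  # shape: (n_samples, 2560)

# Soft-voting ensemble of Logistic Regression, XGBoost, and CatBoost over the features.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("xgb", XGBClassifier(eval_metric="mlogloss")),
        ("cat", CatBoostClassifier(iterations=100, verbose=0)),
    ],
    voting="soft",
)
ensemble.fit(X_feat, y_labels)
print(ensemble.predict(X_feat[:4]))
```

The k-fold cross-validation and hyperparameter tuning reported in the abstract could be layered onto such a pipeline, for example by applying scikit-learn's cross_val_score and GridSearchCV to the extracted features.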
Keywords
Cervical cancer; Deep learning; Ensemble approach; Classifiers; Hyperparameter fine-tuning; K-fold cross-validation
Subject
Engineering, Other
Copyright:
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.