Preprint, Article, Version 1. This version is not peer-reviewed.

Multi-Modal Deep Learning Architecture for Improved Colposcopy Image Classification

Version 1: Received: 1 November 2024 / Approved: 3 November 2024 / Online: 4 November 2024 (10:55:32 CET)

How to cite: Chatterjee, P.; Siddiqui, S. Multi-Modal Deep Learning Architecture for Improved Colposcopy Image Classification. Preprints 2024, 2024110150. https://doi.org/10.20944/preprints202411.0150.v1

Abstract

Colposcopy image classification is vital for early cervical cancer detection, yet it remains challenging because lesion appearance varies widely. Although deep learning models have advanced medical image classification, few studies have explored combining different model architectures to improve diagnostic accuracy in colposcopy. This study addresses that gap by proposing a lesion-specific, multi-branch architecture that integrates attention mechanisms, deep feature extraction, and ensemble learning. Multi-task learning handles the multiple lesion-specific classification tasks, while an ensemble of classifiers (Logistic Regression, XGBoost, and CatBoost) improves decision accuracy. The deep learning branches use EfficientNetB0 and MobileNetV2 to extract rich features from colposcopy images, and their outputs are combined through a soft voting ensemble. Hyperparameter tuning, k-fold cross-validation, PCA visualization, and multiclass AUC plots were used to optimize and assess the model. Training and validation accuracy were tracked at two stages: after the training phase, training accuracy reached 97.85% and validation accuracy 97.33%; after the final ensemble classification, training accuracy rose to 99.95% and validation accuracy to 99.85%, surpassing the individual models and indicating improved generalization. The model shows substantial promise for improving colposcopy classification accuracy and could serve as a clinical decision-support tool in cervical cancer diagnosis.
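The soft voting step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `soft_vote` and `hard_vote`, the equal default weights, the three generic class indices, and the probability values are all assumptions made up for illustration. The sketch only shows the general idea that soft voting averages per-class probabilities across classifiers, which can give a different answer from a simple majority vote.

```python
from typing import List, Optional, Sequence, Tuple


def soft_vote(
    prob_sets: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> Tuple[List[float], int]:
    """Average per-class probabilities from several classifiers.

    Returns the averaged probability vector and the index of the
    class with the highest averaged probability.
    """
    if not prob_sets:
        raise ValueError("need at least one classifier output")
    n_classes = len(prob_sets[0])
    if any(len(p) != n_classes for p in prob_sets):
        raise ValueError("all classifiers must report the same classes")
    if weights is None:
        # Assumption: equal weights; the source does not state the weighting.
        weights = [1.0] * len(prob_sets)
    if len(weights) != len(prob_sets):
        raise ValueError("need exactly one weight per classifier")

    total = float(sum(weights))
    averaged = [
        sum(w * p[c] for w, p in zip(weights, prob_sets)) / total
        for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=lambda c: averaged[c])
    return averaged, predicted


def hard_vote(prob_sets: Sequence[Sequence[float]]) -> int:
    """Majority vote over each classifier's top class, for comparison."""
    votes = [max(range(len(p)), key=lambda c: p[c]) for p in prob_sets]
    return max(set(votes), key=votes.count)


# Hypothetical probabilities for one image over three generic classes.
# These numbers are invented for illustration, not taken from the paper.
p_logreg = [0.90, 0.10, 0.00]
p_xgboost = [0.40, 0.60, 0.00]
p_catboost = [0.45, 0.55, 0.00]

outputs = [p_logreg, p_xgboost, p_catboost]
avg_probs, soft_class = soft_vote(outputs)
hard_class = hard_vote(outputs)

print("averaged probabilities:", [round(p, 3) for p in avg_probs])
print("soft vote picks class", soft_class, "| hard vote picks class", hard_class)
```

In this invented example two of the three classifiers narrowly prefer class 1, so a majority vote selects class 1, while the one confident classifier pulls the averaged probability toward class 0. That sensitivity to confidence is the usual reason for choosing soft over hard voting.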

Keywords

Cervical cancer; Deep learning; Ensemble approach; Classifiers; Hyperparameter fine-tuning; K-fold cross-validation

Subject

Engineering, Other
