1. Introduction
Parkinson's disease (PD) is a neurological disorder that is usually characterized by symptoms including memory loss, cognitive decline, muscle weakness, nervousness, and trembling [1, 2]. The exact etiology of various cognitive rigidity syndromes, including those associated with trauma, inflammation, tumors, and drug use, remains uncertain, and the pathogenesis of PD is likewise not known with certainty [3]. In addition, Parkinson's symptoms can be caused by exposure to certain chemicals, which indicates that the surrounding environment also influences the development of the disease [4]. This study makes use of the fact that tremor and muscle rigidity, the two most prevalent signs of PD, directly affect the visual appearance of hand-drawn spirals and waves [5, 6].
Spiral and wave hand drawing have been proposed as non-invasive tests that can measure motor dysfunction in PD [7, 8]. Identifying PD from hand-drawn tasks is important and straightforward for diagnosis because both sensory and motor symptoms might be present. Because of their slow movements and poor brain-hand coordination, people with PD frequently draw spirals and waves that deviate from the intended shapes [9]. Pen speed and pressure during spiral drawing have been found to be lower among people with a more severe form of the disease [10]. Furthermore, spiral drawing has been used to evaluate the effect of therapy on motor function [11]. Wave handwriting analysis has been proposed as a complementary approach to standard clinical evaluations, offering the potential to support earlier diagnosis of PD by identifying subtle signs and manifestations of the disease [12]. Additionally, the diagnosis of PD is strictly clinical and does not require laboratory tests; it is often made by observing, listening to, and physically examining the patient [13]. However, advances in technology, especially artificial intelligence (AI), have facilitated the development of computer-based systems that use handwriting patterns as potential biomarkers for modeling PD.
AI provides considerable assistance in detecting diseases based on age-related activities. Hand-drawing analysis with deep learning has gained significant interest in recent years for the classification of PD. Convolutional neural networks (CNNs) have shown great potential in several medical applications, including the diagnosis of PD [14-17], which illustrates the broad scope of deep learning techniques for identifying and classifying the disease. CNNs have been used extensively to classify hand gestures and motions, and they have shown high accuracy in discriminating between drawings by individuals with PD and drawings by healthy individuals. One study developed a CNN-based spiral image classifier that detected early-stage PD with 85% accuracy [18]. However, few studies compare AI-based classification of the spiral and wave categories, even though such a comparison is important for supporting clinical diagnosis in each category.
Furthermore, comparing CNN models is crucial for identifying the most suitable model for diagnosing PD from hand-drawn tasks [19]. This comparison matters because PD requires diagnostic tools that are accurate and reliable. Model performance may also differ depending on specific features of the input data and the difficulty of the task [7]. These methods can classify PD accurately by using deep learning to extract relevant features from hand-drawn data.
Several recent papers have highlighted the importance of comparing CNN models for medical diagnosis. One study underlined the significance of modifying existing models to reduce training time, as well as the use of pre-trained CNNs through transfer learning and fine-tuning [20]. Similarly, the clinical value of CNN models was verified by comparing them with established guidelines in plantar pressure detection of foot problems [21]. In addition, another study emphasized the importance of broad performance evaluations of deep-learned, hand-crafted, and fused features with deep and traditional models in medical settings [22].
This study compares four pre-trained CNN models, MobileNet, ResNet50, EfficientNet-B1, and InceptionV3, for automatic classification of PD hand-drawn images. MobileNet's efficiency in mobile and embedded vision applications makes it suitable for settings with limited computational resources, such as diagnosis on mobile devices [23]. ResNet50 is recognized for its high classification accuracy, which makes it a strong candidate for tasks where precision is crucial, such as medical image classification [24]. EfficientNet-B1's demonstrated accuracy and efficiency make it a promising option for this task, especially considering its performance compared to other architectures [25]. InceptionV3's efficient use of model parameters makes it a valuable contender in scenarios where computational resources must be used optimally, including the classification of hand-drawn images [26]. Comparing these models allows for the evaluation of model performance, generalizability, and suitability for specific clinical applications, ultimately contributing to the development of accurate and reliable diagnostic tools for PD.
4. Discussion
Based on our results, a comparison of MobileNet, ResNet50, EfficientNet-B1, and InceptionV3 in classifying PD from the provided image dataset shows that MobileNet outperforms the other architectures, achieving an accuracy of 0.92. MobileNet was also the strongest model across the four classes, with F1-scores ranging from 0.86 to 0.97. EfficientNet-B1 matches MobileNet in two classes, namely Spiral Normal and Spiral Parkinson, with F1-scores of 0.86-0.87. Furthermore, this research shows the importance of transfer learning from ImageNet as a supporting strategy for training and fine-tuning CNN models; the pre-trained weights may contribute to MobileNet's high accuracy in PD classification [41]. This finding is consistent with the high classification results achieved by pre-trained CNNs in earlier work, which indicate their effectiveness in disease classification tasks [29].
Moreover, the augmentation techniques used in this study, namely rotation by 15°, a zoom range of 0.2, a width shift range of 0.2, and a height shift range of 0.2, likely contributed to the performance of the classification models. Augmentation expands the diversity and variability of the training dataset, which enables the models to learn more robust and generalizable features. By enriching the training data in this way, augmentation helps the models adapt to variations in the hand-drawn images, which may have contributed to the higher accuracy observed, particularly for MobileNet [42].
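The augmentation settings reported above can be illustrated with a minimal sketch. It assumes Keras-style semantics for the four ranges (rotation drawn uniformly from ±15°, a zoom factor from [0.8, 1.2], and shifts expressed as a fraction of image width and height within ±0.2). The function name `sample_augmentation` and the 224-pixel image size are hypothetical and are not taken from the study.

```python
import random

# Ranges reported in the text. Treating them as symmetric, uniformly sampled
# ranges is an assumption borrowed from common Keras-style usage.
ROTATION_RANGE_DEG = 15
ZOOM_RANGE = 0.2
WIDTH_SHIFT_RANGE = 0.2
HEIGHT_SHIFT_RANGE = 0.2


def sample_augmentation(width, height, rng=random):
    """Draw one random set of augmentation parameters for an image."""
    return {
        "rotation_deg": rng.uniform(-ROTATION_RANGE_DEG, ROTATION_RANGE_DEG),
        "zoom": rng.uniform(1 - ZOOM_RANGE, 1 + ZOOM_RANGE),
        "shift_x_px": rng.uniform(-WIDTH_SHIFT_RANGE, WIDTH_SHIFT_RANGE) * width,
        "shift_y_px": rng.uniform(-HEIGHT_SHIFT_RANGE, HEIGHT_SHIFT_RANGE) * height,
    }


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    for _ in range(3):
        print(sample_augmentation(224, 224, rng))
```

Each training image would receive a fresh draw of these parameters per epoch, so the model rarely sees exactly the same drawing twice.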
The comparison of MobileNet, ResNet50, EfficientNet-B1, and InceptionV3 in classifying PD from the provided image dataset highlights MobileNet as a promising architecture for PD diagnosis. MobileNet achieved the highest accuracy of 0.92, while ResNet50 achieved an accuracy of only 0.80 (Figure 5). MobileNet is explicitly designed for mobile and embedded vision applications, emphasizing efficiency without compromising performance [43]. ResNet50, on the other hand, is a deep residual network built on residual function learning, which makes it suitable for complex image recognition tasks [33]. The performance gap between MobileNet and ResNet50 in PD classification is consistent with the findings of Thu et al. (2023), who showed that a pre-trained MobileNet outperformed ResNet50 in pedestrian classification [44].
Based on Table 2, MobileNet's precision, recall, and F1-score are all above 0.80 in classifying PD from the hand-drawn image dataset, which can be attributed to several factors. MobileNet's performance here is in line with the success of pre-trained deep learning models in various medical and image classification tasks. Kaur et al. (2021) explored a CNN model based on magnetic resonance imaging of PD and obtained 89.23% accuracy [45], indicating the potential of deep learning approaches for identifying PD from such image data. Additionally, Fan & Sun (2022) explored the use of a CNN for early detection of PD from drawing movements and obtained 85% accuracy, which further highlights the applicability of deep learning techniques in this domain [18].
MobileNet's high performance can also be attributed to its architecture and feature extraction capabilities [46, 47]. MobileNet can effectively extract relevant features from hand-drawn images and distinguish patterns associated with PD, which contributes to its high precision and recall. In addition, transfer learning approaches, as discussed by Baghdadi et al. (2022) [48], may play an important role in improving MobileNet's performance for PD classification. Transfer learning allows a model to leverage knowledge gained from a source task to improve learning in a related target task, which can be especially beneficial when working with limited datasets, such as hand-drawn images [49].
Using MobileNet to diagnose PD from handwriting, particularly spiral and wave patterns, holds significant promise for future applications. MobileNets, characterized by a lightweight and efficient architecture, are widely recognized as suitable for embedded vision applications, which makes them well suited to processing handwriting data obtained from mobile devices [50]. Their efficiency, achieved through depthwise separable convolutions, enables models that can analyze and classify handwriting patterns associated with PD, thereby contributing to early detection and monitoring of the disease [51, 52]. Furthermore, combining MobileNet-based models with transfer learning offers the potential to improve the computational efficiency and accuracy of PD diagnosis from handwriting data, thereby facilitating the integration of this approach into clinical practice [53].
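To make the efficiency argument for depthwise separable convolutions concrete, the following sketch compares the number of weights in a standard convolution with the number in a depthwise separable one. The layer sizes (3×3 kernel, 32 input channels, 64 output channels) are illustrative assumptions rather than values from the study, and bias terms are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = kernel * kernel * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out            # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


if __name__ == "__main__":
    k, c_in, c_out = 3, 32, 64  # hypothetical layer sizes
    std = standard_conv_params(k, c_in, c_out)
    sep = depthwise_separable_params(k, c_in, c_out)
    print(std, sep, round(sep / std, 3))  # 18432 2336 0.127
```

The ratio equals 1/c_out + 1/k², so with 3×3 kernels the separable form needs roughly eight to nine times fewer weights once the number of output channels is large.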
Moreover, applying MobileNet to PD diagnosis aligns with the growing interest in using deep learning and AI to develop non-invasive and accessible diagnostic tools for neurodegenerative diseases [34]. With MobileNet, researchers can explore features of handwriting, including dynamic characteristics and spatial patterns, to identify distinctive markers associated with PD [54]. Additionally, integrating MobileNet models with other modalities, such as speech signals, presents an opportunity to create diagnostic frameworks that combine multiple data sources, thereby improving the accuracy and reliability of PD diagnosis [55]. Future use of MobileNet models for PD diagnosis from handwriting offers a pathway towards technology-driven approaches that could improve the early detection and management of neurodegenerative conditions, and ultimately patient outcomes and quality of care.
According to Figure 8, the similar F1-scores achieved by MobileNet and EfficientNet-B1 in predicting the Spiral Normal and Spiral Parkinson classes can be attributed to the effectiveness of the CNN architectures used in these models. Sarvamangala and Kulkarni (2022) highlighted the role of basic CNN design variants in achieving state-of-the-art results in image-based classification tasks [56]. In addition, Elfatimi et al. (2022) demonstrated the high classification performance of the MobileNet architecture in a similar image classification task [57]. Furthermore, Filatov and Yar (2022) showed that the EfficientNet-B1 architecture also performs well when the classes are not very different from one another, which supports its ability to achieve high accuracy in multi-class classification [58].
Based on Table 2 and Figure 8, the variation in MobileNet's F1-scores across the four classes (Normal Spiral, Parkinson's Spiral, Normal Wave, Parkinson's Wave) can be attributed to inherent differences in the characteristics and complexity of the classes. The F1-score, the harmonic mean of precision and recall, provides a balanced measure of model performance for each class; here it ranges from 0.86 to 0.97. In the context of skin cancer classification, a weighted-average F1-score of 0.83 was reported, which highlights the importance of considering the F1-score in multi-class classification tasks [59]. Differences in F1-score between classes can be influenced by the specific features and patterns associated with each class. In the case of PD, a spiral image looks largely the same after rotational augmentation, whereas a wave image changes noticeably under the same augmentation. In this study, we used several augmentation techniques to compensate for the small dataset, which affects CNN performance and leads to different F1-scores [29]. Nevertheless, the results of this research suggest that MobileNet is suitable for classification tasks with small amounts of data.
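As a reminder of how the per-class scores discussed above are computed, the sketch below derives precision, recall, and F1-score from a confusion matrix. The 4×4 matrix is made-up illustrative data, not the study's results, and the helper name `per_class_f1` and the label strings are hypothetical.

```python
def per_class_f1(confusion):
    """Return a (precision, recall, f1) tuple for every class.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    n = len(confusion)
    scores = []
    for c in range(n):
        tp = confusion[c][c]
        predicted = sum(confusion[r][c] for r in range(n))  # column sum
        actual = sum(confusion[c])                          # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores.append((precision, recall, f1))
    return scores


if __name__ == "__main__":
    # Illustrative counts only; rows are true classes, columns are predictions.
    labels = ["Spiral Normal", "Spiral Parkinson", "Wave Normal", "Wave Parkinson"]
    confusion = [
        [13, 2, 0, 0],
        [2, 13, 0, 0],
        [0, 0, 14, 1],
        [0, 0, 0, 15],
    ]
    for label, (p, r, f1) in zip(labels, per_class_f1(confusion)):
        print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Because the harmonic mean is pulled towards the smaller of its two inputs, a class scores a high F1 only when both precision and recall are high, which is why the metric is informative for comparing classes of differing difficulty.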
The choice of the most appropriate architecture for PD classification may depend on factors such as the nature of the image dataset, the specific features relevant to PD diagnosis, and the computational resources available for model implementation. Therefore, although MobileNet demonstrated superior performance on the provided image dataset, further research and experiments are needed to validate its effectiveness across various PD image datasets and clinical settings. Another limitation of this study is that the diversity and heterogeneity of PD manifestations and progression across individuals may pose a challenge in developing universally applicable deep learning models. Variability in symptom presentation, disease subtypes, and comorbidities could reduce the generalizability of deep learning-based diagnostic systems and limit how accurately they capture the full spectrum of PD manifestations.