1. Introduction
The automatic detection of pathogens in plants, as early as possible and without damaging the plant, is an approach enjoying increasing success in the agri-food sector. The basic assumption is that a diseased plant looks different from a healthy one. For example, leaves can exhibit subtle color differences, often invisible to the human eye, that can nevertheless be captured with techniques such as spectral imaging. Pest detection is complicated by the fact that pests and their eggs are usually found under the plant canopy and are therefore hard to spot; they are often very small and show a very local distribution. Moreover, a crop can be affected by several pests at the same time. Therefore, not only high-resolution detection but also local and organism-specific detection is needed. High-resolution imaging, combined with deep learning techniques, especially Convolutional Neural Networks (CNNs), has great potential for precision agriculture in both open-field and greenhouse crops. In both cases, large quantities of labeled images from different situations (locations, seasons, crop varieties) are needed to train deep learning algorithms sufficiently, and data augmentation and smarter training techniques are necessary to overcome the lack of real, labeled images. Transfer learning has also proved useful for the detection and diagnosis of diseases in agricultural crops, among its many other applications [1].
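To make the augmentation idea concrete, the following minimal Keras sketch (our own illustration, not taken from the cited works; the transformation ranges are placeholder values) shows how random geometric and photometric perturbations can be chained to enlarge a small set of labeled leaf images.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Illustrative augmentation pipeline: random flips, rotations, zooms and
# contrast changes produce new training variants of each labeled leaf image.
# The ranges below are placeholder values, not tuned settings.
augmentation = tf.keras.Sequential([
    layers.RandomFlip("horizontal_and_vertical"),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
    layers.RandomContrast(0.1),
])

# Example: applying the pipeline to a batch of images during training.
# images = augmentation(images, training=True)
```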
In particular, potato (Solanum tuberosum L.) crops are constantly affected by parasites that reduce their yield every year. Since potato is grown worldwide, controlling its production requires attention, and automatic disease recognition from leaf images via CNNs has been the subject of much recent literature, such as [2,3,4,5,6,7], to cite only some contributions. For instance, in [8], potato tuber diseases were diagnosed using the VGG architecture with additional dropout layers to avoid overfitting, and 96% of the test images were classified correctly. After comparing the MobileNet, VGG16, InceptionResNetV2, InceptionV3, ResNet50, VGG19 and Xception architectures, the authors of [9] found that VGG16 achieved the highest accuracy (99.43%) on test data for the diagnosis of late blight and early blight, the most common diseases in potato crops. Finally, in [10], a novel hybrid deep learning model called PLDPNet was proposed for the automatic segmentation and classification of potato leaf diseases. PLDPNet uses auto-segmentation and deep feature ensemble fusion modules to enhance disease prediction accuracy, achieving an end-to-end performance of 98.66% on the PlantVillage dataset ([11], https://www.kaggle.com/datasets/emmarex/plantdisease).
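As an illustration of how the PlantVillage images can be organized for training, the sketch below (our assumption, not the authors' exact preprocessing; the local path and split ratio are hypothetical) loads the class-per-folder Kaggle archive with standard Keras utilities.

```python
import tensorflow as tf

# Hypothetical local path after downloading and extracting the Kaggle archive;
# PlantVillage stores one subfolder per class (e.g. Potato___Early_blight).
DATA_DIR = "PlantVillage"
IMG_SIZE = (224, 224)  # matches MobileNetv2's default input resolution
BATCH = 32

train_ds = tf.keras.utils.image_dataset_from_directory(
    DATA_DIR, validation_split=0.2, subset="training",
    seed=42, image_size=IMG_SIZE, batch_size=BATCH)
val_ds = tf.keras.utils.image_dataset_from_directory(
    DATA_DIR, validation_split=0.2, subset="validation",
    seed=42, image_size=IMG_SIZE, batch_size=BATCH)

print(train_ds.class_names)  # folder names become class labels
```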
The versatility of CNNs allows them to be deployed on different platforms, including mobile devices. Mobile applications have quickly become popular because, in addition to being practical and lightweight, they simplify access to information and favor widespread adoption. Their ecosystem is made up of several factors: infrastructure, operating system (OS), information distribution channels, etc. Nowadays, almost everyone owns a smartphone, whether running Android, iOS or another operating system. Despite this diffusion, in Cuba a large part of the population can only afford low-performance phones (Android versions just above 4.0, a 2G data network with 85% population coverage, 1 GB of internal memory, etc.). The price-quality ratio is an obstacle to technological updating; therefore, a practical tool for potato pest detection that is undemanding in terms of computational resources is a necessary strategy for closing surveillance gaps in crop campaigns threatened by more than one disease.
The objective of this paper is to release an offline mobile application embedding the most effective machine learning architecture for the diagnosis of fungal blight in potatoes. The application developed is compatible with Android versions higher than 4.1, occupies 77.57 MB of storage, and requires neither an Internet connection nor mobile coverage. Similar proposals can be found in the literature. In [12], a mobile app based on the MobileNetv2 architecture was developed to classify five pest categories (general early blight, severe early blight, severe late blight, severe late blight fungus and general late blight fungus), achieving an accuracy of 97.73%. Nonetheless, that study, like [13,14,15], requires high-resolution images and/or advanced technological infrastructure, incompatible with the characteristics of Cuban mobile phones, which are mostly approaching obsolescence and unable to take even medium-quality pictures. In [16], a mobile application called VegeCare was devised for the diagnosis of potato diseases, yielding 96% accuracy; however, it is proprietary software that is difficult for the Cuban community to access. Moreover, most of these studies propose an architecture that requires a connection to an external server for image processing, as in [14]. While this works without hindrance in many countries, thanks to the availability of resources and access to free online platforms, in Cuba there are still planting regions with no mobile coverage or a very weak signal, which hampers access to the available international solutions.
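The packaging pipeline used for the app is not detailed in this section; as one common way (assumed here, not confirmed by the source) to obtain a classifier that runs fully offline on low-end Android devices, a trained Keras model can be converted to TensorFlow Lite and bundled as a local app asset, as in the sketch below (the file names are hypothetical).

```python
import tensorflow as tf

# Load the trained classifier (hypothetical file name) and convert it to a
# TensorFlow Lite flatbuffer that an Android app can ship as a local asset
# and run without any network connection.
model = tf.keras.models.load_model("mobilenetv2_potato.h5")

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # post-training quantization shrinks the file
tflite_model = converter.convert()

with open("potato_pest_model.tflite", "wb") as f:
    f.write(tflite_model)
```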
Many mobile apps for smart agriculture based on deep learning have recently been devised [17], sometimes built on proprietary software. However, besides not being free of charge, these apps can only be installed on devices with recent Android versions and normally rely on a client-server architecture in which the information is stored in external databases; they therefore require mobile networks and an external server with a MySQL manager for queries [14]. In Cuba, the company GeoCuba has focused its efforts on image processing in the agricultural sector, mainly for monitoring sugar cane and rice cultivation. Using satellite photos, drones and AI techniques, damage to these crops can be identified; however, this requires advanced tools to capture images in real time and platforms with high computational performance. Furthermore, the distance at which the images are taken may reduce the efficiency of the diagnosis.
All the above motivations push towards a simple mobile app which, in addition to being free, offline and suited to low-performance devices, can also serve as a decision assistant. Real-time diagnosis of the main pests helps reduce the risk of crop losses, identify the type of parasite early, limit the use of pesticides and, therefore, support ecological sustainability. It also has an important strategic component as an informative tool that helps non-expert personnel learn about the different diseases present in the crop, caused by insects, viruses, bacteria and nematodes, each of which can damage potato cultivation on a medium or large scale.
The rest of the paper is organized as follows. In the following section, the PlantVillage dataset and the experimental setting are described. In Section 3, our experimental results are reported, showing that the MobileNetv2 architecture, as a good compromise between computational lightness and performance, is the most suitable for inclusion in a mobile app for potato pest detection. The PPC (Potato Pest Control) app is briefly described in Section 4. Finally, Section 5 draws some conclusions and outlines future perspectives.
3. Experimental Results
The rising cost of energy and raw materials in Cuba is driving a rethinking of agricultural production techniques, and IT tools, mainly based on artificial intelligence, pave the way for developments capable of revolutionizing agricultural work. The goal of smart agriculture is to increase profits while reducing the risks of capital loss and destruction of natural resources. Mobile applications for disease detection are a smart strategy, and their use is currently essential to strengthen food sustainability, especially given the lack of investment in infrastructure plaguing the Cuban agricultural system. To obtain a CNN model that significantly reduces computational costs, adapts to the performance of mobile devices and processes images effectively, several factors must be taken into account (e.g., the number of model parameters and the processing time), mainly related to the limited computational resources available. In fact, traditional deep learning models cannot be applied directly on mobile devices.
Therefore, after investigating lightweight neural network architectures and using transfer learning to limit the computational load of training, the MobileNetv2 architecture was found to have the best adaptability to the data, with the highest accuracy, the lowest number of parameters and the lowest number of epochs (see Table 1).
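A minimal sketch of such a transfer-learning setup is given below, assuming a frozen ImageNet-pretrained MobileNetV2 backbone with a small new classification head; the head layers, dropout rate and number of classes are illustrative and do not reproduce the exact configuration described in Step 2.

```python
import tensorflow as tf
from tensorflow.keras import layers

NUM_CLASSES = 3  # e.g. healthy, early blight, late blight (illustrative)

# Pretrained backbone with ImageNet weights, kept frozen so that only the
# new head is trained (transfer learning keeps the computational load low).
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False

inputs = tf.keras.Input(shape=(224, 224, 3))
x = tf.keras.applications.mobilenet_v2.preprocess_input(inputs)
x = base(x, training=False)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dropout(0.2)(x)  # illustrative regularization against overfitting
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```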
Indeed, when MobileNetv2 was trained for only ten epochs, overfitting of the training data was observed. After adding the layers described in Step 2, however, the accuracy on the validation set remains closely aligned with that on the training data (see Figure 2). In other words, on the test set the predicted classes are close to the observed ones (Figure 3).
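The comparison of predicted and observed classes reported in Figure 3 can be reproduced with a few lines of scikit-learn, as sketched below; here `model` and `test_ds` are assumed to be the trained classifier and a held-out test split analogous to those in the previous sketches.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, classification_report

# Collect observed and predicted labels batch by batch, so dataset shuffling
# cannot misalign the two lists.
y_true, y_pred = [], []
for images, labels in test_ds:
    probs = model.predict(images, verbose=0)
    y_true.extend(labels.numpy())
    y_pred.extend(np.argmax(probs, axis=1))

print(confusion_matrix(y_true, y_pred))
print(classification_report(y_true, y_pred))
```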
These results differ from those obtained in [6], where ten deep learning models, including DenseNet201, DenseNet121, NasNetLarge, Xception, ResNet152v2, EfficientNetB5, EfficientNetB7, VGG19 and MobileNetv2, were applied for classification together with the hybrid model EfficientNetB7-ResNet152v2, and DenseNet201 achieved the highest accuracy, 98.67%, with a validation error of 0.04. However, that model covers not only potato pests but also tomato and bell pepper pests, for a total of 15 disease classes. In [5], instead, the VGG16 model was selected, achieving 100% accuracy on the test data, after also evaluating VGG19, MobileNetv2, Inceptionv3 and ResNet50v2. However, neither network size nor processing speed was taken into account in that study, although both are essential for a model to be embedded in a mobile app, which is the ultimate goal of the present research.