Version 1
Received: 31 October 2024 / Approved: 31 October 2024 / Online: 31 October 2024 (12:20:43 CET)
How to cite:
Chinchilla Caravaca, J. Comparison of Machine Learning Techniques to Classify Dry Beans Using Computer Vision. Preprints 2024, 2024102554. https://doi.org/10.20944/preprints202410.2554.v1
APA Style
Chinchilla Caravaca, J. (2024). Comparison of Machine Learning Techniques to Classify Dry Beans Using Computer Vision. Preprints. https://doi.org/10.20944/preprints202410.2554.v1
Chicago/Turabian Style
Chinchilla Caravaca, J. 2024. "Comparison of Machine Learning Techniques to Classify Dry Beans Using Computer Vision." Preprints. https://doi.org/10.20944/preprints202410.2554.v1
Abstract
This study explores the classification of seven registered dry bean species using a dataset of 13,611 grain samples characterized by 16 features, including both dimensional and shape measurements. The primary objective was to develop robust Machine Learning models that accurately identify species while addressing class imbalance and feature redundancy. Several classification algorithms, including Decision Trees (DT), Random Forests (RF), k-Nearest Neighbors (KNN), Support Vector Machines (SVM), and Multi-Layer Perceptrons (MLP), were evaluated under various data conditions. The MLP model achieved the highest classification accuracy, which the study attributes to its capacity to capture complex patterns in high-dimensional data. Random Forest models showed resilience to class imbalance, though misclassifications between similar species highlighted the need for improved feature selection. The results indicate that effective feature engineering and careful model tuning are crucial for enhancing classification accuracy in agricultural applications. The study concludes that future work should expand the dataset and explore advanced data augmentation techniques to further improve model robustness and applicability in food security contexts. All code needed to reproduce the analysis, along with documentation of pipeline usage, is available at https://github.com/jesusch10/beans_classification.
Keywords
Dry beans; Machine Learning (ML); feature selection; class imbalance; Multi-Layer Perceptron; Random Forest; agricultural applications; food security
Subject
Biology and Life Sciences, Biology and Biotechnology
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.