Preprint Article, Version 1. This version is not peer-reviewed.

Comparison of Machine Learning Techniques to Classify Dry Beans Using Computer Vision

Version 1 : Received: 31 October 2024 / Approved: 31 October 2024 / Online: 31 October 2024 (12:20:43 CET)

How to cite: Chinchilla Caravaca, J. Comparison of Machine Learning Techniques to Classify Dry Beans Using Computer Vision. Preprints 2024, 2024102554. https://doi.org/10.20944/preprints202410.2554.v1

Abstract

This study explores the classification of seven registered dry bean species using a dataset of 13,611 grain samples characterized by 16 features, including both dimensional and shape measurements. The primary objective was to develop robust Machine Learning models that accurately identify species while addressing class imbalance and feature redundancy. A series of classification algorithms, including Decision Trees (DT), Random Forests (RF), k-Nearest Neighbors (KNN), Support Vector Machines (SVM), and Multi-Layer Perceptrons (MLP), were evaluated under various data conditions. The MLP model achieved the highest classification accuracy, which is attributed to its capacity to capture complex patterns in high-dimensional data. Random Forest models were resilient to class imbalance, though misclassifications between similar species highlighted the need for improved feature selection. The results indicate that effective feature engineering and careful model tuning are crucial for enhancing classification accuracy in agricultural applications. Future work should expand the dataset and explore advanced data augmentation techniques to further improve model robustness and applicability in food security contexts. All code needed to reproduce the analysis, along with documentation of pipeline usage, is available at https://github.com/jesusch10/beans_classification.
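Because the abstract stresses class imbalance, it may help to see why plain accuracy can mislead in that setting. The following is a minimal sketch using only the Python standard library. It is not taken from the linked repository; the function names (`balanced_class_weights`, `accuracy`, `macro_recall`) and the toy labels are assumptions introduced here for illustration. It computes inverse-frequency class weights, a common heuristic, and contrasts accuracy with macro-averaged recall for a classifier that always predicts the majority class.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * m) for c, m in counts.items()}


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_recall(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Toy labels invented for illustration; these are NOT the study's data.
y_true = ["A"] * 8 + ["B"] * 2
y_pred = ["A"] * 10  # hypothetical classifier that always predicts "A"

weights = balanced_class_weights(y_true)
print(weights)                       # rare class "B" gets the larger weight
print(accuracy(y_true, y_pred))      # 0.8
print(macro_recall(y_true, y_pred))  # 0.5
```

In this toy case accuracy looks acceptable at 0.8 even though the minority class is never recognized, while macro-averaged recall drops to 0.5. Reporting a per-class or macro-averaged metric alongside accuracy is one way to make such failures visible when comparing models.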

Keywords

Dry beans; Machine Learning (ML); feature selection; class imbalance; Multi-Layer Perceptron; Random Forest; agricultural applications; food security

Subject

Biology and Life Sciences, Biology and Biotechnology
