Comparative Analysis of Machine Learning Algorithms Using Accuracy, Precision, Recall, and F1-Score Metrics on the Dry Bean Dataset

Siti Helmiyah; Rico Pramestiawan

doi:10.35960/ikomti.v6i3.2031

Authors

Siti Helmiyah, Sekolah Tinggi Keguruan dan Ilmu Pendidikan Rosalia Lampung
Rico Pramestiawan, Sekolah Tinggi Keguruan dan Ilmu Pendidikan Rosalia Lampung

Keywords:

accuracy, dry bean dataset, f1-score, machine learning algorithms, precision, recall

Abstract

This study compares the performance of five machine learning algorithms in classifying dry bean varieties, with the goal of supporting quality detection systems for agricultural products. Recurring problems of authenticity and food safety, such as rice adulteration, show the need for fast and accurate methods of variety identification. The study uses the Dry Bean Dataset from the UCI Machine Learning Repository, which contains 13,611 samples with 16 numerical features and 7 bean variety classes. The five algorithms tested are K-Nearest Neighbors (KNN), Decision Tree (DT), Support Vector Machine (SVM), Random Forest (RF), and Logistic Regression (LR). The data were split into 80% for training and 20% for testing, and the models were evaluated using accuracy, precision, recall, and F1-Score. SVM achieved the best performance, with an accuracy of 92.43% and an F1-Score of 93.61%, followed by Logistic Regression and Random Forest. The confusion matrix analysis shows that most varieties were classified correctly, although some misclassifications occurred among classes with similar morphological characteristics, such as Dermason, Seker, and Sira. These findings indicate that algorithm selection is crucial when applying machine learning to agricultural product classification, and that evaluation with multiple metrics gives a more comprehensive picture of performance than accuracy alone. The approach has the potential to support more efficient automation in identifying agricultural product varieties.
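The four metrics named in the abstract can all be derived from a multiclass confusion matrix. The following is a minimal sketch of that computation, not the authors' code. It assumes macro averaging across classes, since the abstract does not state which averaging scheme was used. The function names `per_class_metrics` and `summarize` and the 3-class matrix `example_cm` are hypothetical, and the counts are illustrative only, not results from the study.

```python
from typing import Dict, List


def per_class_metrics(cm: List[List[int]]) -> List[Dict[str, float]]:
    """Precision, recall and F1 for each class of a square confusion matrix.

    cm[i][j] counts samples whose true class is i and predicted class is j.
    """
    n = len(cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        # Predicted as class k but actually another class (column minus diagonal).
        fp = sum(cm[i][k] for i in range(n)) - tp
        # Actually class k but predicted as another class (row minus diagonal).
        fn = sum(cm[k]) - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append({"precision": precision, "recall": recall, "f1": f1})
    return results


def summarize(cm: List[List[int]]) -> Dict[str, float]:
    """Overall accuracy plus macro-averaged precision, recall and F1.

    Macro averaging is an assumption here: each class contributes equally,
    regardless of how many samples it has.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[k][k] for k in range(n)) / total
    per_class = per_class_metrics(cm)
    macro = {
        name: sum(m[name] for m in per_class) / n
        for name in ("precision", "recall", "f1")
    }
    return {"accuracy": accuracy, **macro}


# Hypothetical 3-class confusion matrix (illustrative counts only,
# not taken from the study).
example_cm = [
    [50, 2, 3],
    [4, 40, 1],
    [6, 0, 44],
]

if __name__ == "__main__":
    for name, value in summarize(example_cm).items():
        print(f"{name}: {value:.4f}")
```

Because macro averaging weights every class equally, the averaged F1-Score can differ from accuracy in either direction when class sizes are unequal, which is one reason to report several metrics side by side.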
