Data mining approach for dry bean seeds classification

Cited by: 8
Authors
Macuacua, Jaime Carlos [1]
Centeno, Jorge Antonio Silva [1]
Amisse, Caisse [2]
Affiliations
[1] Univ Fed Parana, Geomat Dept, Postgrad Program Geodet Sci, Curitiba, Brazil
[2] Rovuma Univ, Nampula, Mozambique
Source
Keywords
Data mining; Machine learning; Hyperparameter optimization; SMOTE technique; Dry bean seeds
DOI
10.1016/j.atech.2023.100240
Chinese Library Classification number
S2 [Agricultural Engineering]
Discipline classification code
0828
Abstract
Product quality certification is an important process in agricultural production and productivity. Traditional methods for seed quality classification have limitations such as complex steps, low precision, and slow inspection of large production volumes. Automatic classification techniques based on machine learning and computer vision offer fast, high-throughput solutions. Despite major advances in state-of-the-art automatic classification models, these models can still be improved by incorporating other techniques. In this article, we develop a computer vision system for the automatic classification of different seed varieties. The system combines machine learning models with data mining techniques and uses a set of features related to the geometry of bean seeds, extracted from binary images. Three machine learning classifiers were compared: Random Forest (RF), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN). The comparison included Principal Component Analysis (PCA), hyperparameter tuning of the machine learning algorithms, and dataset balancing based on the Synthetic Minority Oversampling Technique (SMOTE). The results show that these data mining processes help to improve the quality of the classification results. The KNN classifier performed best, with around 95% accuracy and 96% precision and recall. The best results were obtained by applying hyperparameter tuning and the SMOTE technique in the preprocessing step, which gave an increase of around 2.6%. The results indicate that combining data mining in the preprocessing step with machine learning classification methods can increase classification accuracy and support automatic bean seed selection based on digital images. This can help small farmers and agricultural managers make seed selection decisions that increase production.
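To make the balancing and classification steps in the abstract concrete, here is a minimal sketch of SMOTE-style oversampling followed by a KNN majority vote, written with the Python standard library only. It is not the authors' implementation: the function names, the two-feature toy data, and the values of `k` are assumptions chosen for illustration, and the PCA and hyperparameter tuning steps are not shown.

```python
import math
import random
from collections import Counter


def k_nearest(point, candidates, k):
    """Return the k candidates closest to point (Euclidean distance)."""
    return sorted(candidates, key=lambda c: math.dist(point, c))[:k]


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic samples, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [m for m in minority if m is not base]
        neighbour = rng.choice(k_nearest(base, others, k))
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


def knn_predict(train_X, train_y, point, k=3):
    """Predict the label of point by majority vote among its k nearest
    training samples."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(point, train_X[i]))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Made-up toy data: two geometric features per seed (not from the paper).
majority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2), (0.8, 0.8), (1.3, 1.0)]
minority = [(3.0, 3.0), (3.2, 2.9), (2.9, 3.1)]

# Oversample the minority class until both classes have the same size.
new_samples = smote(minority, n_new=len(majority) - len(minority), k=2)
X = majority + minority + new_samples
y = ["A"] * len(majority) + ["B"] * (len(minority) + len(new_samples))

class_counts = Counter(y)
prediction = knn_predict(X, y, (3.1, 3.0), k=3)
print(class_counts, prediction)
```

In practice, oversampling of this kind is normally applied to the training split only, so that synthetic samples do not leak into the evaluation data.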
Number of pages: 12