Data mining approach for dry bean seeds classification

Cited by: 8
Authors
Macuacua, Jaime Carlos [1]
Centeno, Jorge Antonio Silva [1]
Amisse, Caisse [2]
Affiliations
[1] Univ Fed Parana, Geomat Dept, Postgrad Program Geodet Sci, Curitiba, Brazil
[2] Rovuma Univ, Nampula, Mozambique
Keywords
Data mining; Machine learning; Hyperparameter optimization; SMOTE technique; Dry bean seeds; SMOTE
DOI
10.1016/j.atech.2023.100240
Chinese Library Classification number
S2 [Agricultural Engineering]
Discipline classification code
0828
Abstract
Product quality certification is an important process in agricultural production. Traditional methods for seed quality classification have limitations such as complex steps, low precision, and slow inspection of large production volumes. Automatic classification techniques based on machine learning and computer vision offer fast, high-throughput solutions. Despite major advances in state-of-the-art automatic classification models, these models can still be improved by incorporating other techniques. In this article, we developed a computer vision system for the automatic classification of different seed varieties. The system combines machine learning models with data mining techniques and uses a set of features related to the geometry of bean seeds, extracted from binary images. Three machine learning classifiers were compared: Random Forest (RF), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN). The comparison included Principal Component Analysis (PCA), hyperparameter tuning of the machine learning algorithms, and dataset balancing with the Synthetic Minority Oversampling Technique (SMOTE). The results showed that these data mining steps help to improve the quality of the classification results. The KNN classifier performed best, with around 95% accuracy and 96% precision and recall. The best results were obtained by applying hyperparameter tuning and the SMOTE technique in the preprocessing step, which gave an increase of around 2.6%. The results show that combining data mining in the preprocessing step with machine learning classification methods can effectively and efficiently increase classification accuracy and support automatic bean seed selection based on digital images. This can help small farmers and agricultural managers make seed selection decisions that increase production.
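The abstract names two of its building blocks, SMOTE balancing and KNN classification, without showing how they fit together. The following is a minimal sketch of the general idea in plain Python, not the authors' implementation: the helper names (`smote`, `balance`, `knn_predict`), the two-feature toy data, the class labels "A" and "B", and the choice of k are all assumptions made for illustration. PCA and hyperparameter tuning are omitted.

```python
import math
import random
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbors(point, candidates, k):
    """Return the k candidates closest to point, excluding point itself."""
    others = [c for c in candidates if c is not point]
    return sorted(others, key=lambda c: euclidean(point, c))[:k]


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority neighbors (the core SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbor = rng.choice(nearest_neighbors(base, minority, k))
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


def balance(X, y, k=3, seed=0):
    """Oversample each smaller class up to the size of the largest class.
    Classes with a single sample are skipped (no neighbor to interpolate with)."""
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        if 1 < count < target:
            minority = [x for x, lab in zip(X, y) if lab == label]
            new = smote(minority, target - count, min(k, count - 1), seed)
            X_out.extend(new)
            y_out.extend([label] * len(new))
    return X_out, y_out


def knn_predict(X_train, y_train, point, k=3):
    """Majority vote among the k training samples closest to point."""
    order = sorted(range(len(X_train)),
                   key=lambda i: euclidean(X_train[i], point))
    votes = Counter(y_train[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two made-up geometric features per seed,
# six samples of class "A" and only two of class "B" (imbalanced).
X = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1], [1.1, 1.2], [0.8, 0.9], [1.0, 1.3],
     [3.0, 3.0], [3.2, 3.1]]
y = ["A"] * 6 + ["B"] * 2

X_bal, y_bal = balance(X, y)                      # both classes now have 6 samples
label = knn_predict(X_bal, y_bal, [3.1, 3.0], k=3)
```

In practice a library such as imbalanced-learn or scikit-learn would normally replace these hand-written helpers. Oversampling should be applied only to the training split, so that synthetic samples do not leak into the evaluation data.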
Pages: 12
Related papers
50 in total
  • [31] A novel switching function approach for data mining classification problems
    Ibrahim, Mohammed Hussein
    Hacibeyoglu, Mehmet
    SOFT COMPUTING, 2020, 24 (07) : 4941 - 4957
  • [32] Efficient Mining of Data Streams Using Associative Classification Approach
    Kompalli, Prasanna Lakshmi
    Cherku, Ramesh Kumar
    INTERNATIONAL JOURNAL OF SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING, 2015, 25 (03) : 605 - 631
  • [33] Rare Disease Registries Classification and Characterization: A Data Mining Approach
    Santoro, Michele
    Coi, Alessio
    Di Paola, Michele Lipucci
    Bianucci, Anna Maria
    Gainotti, Sabina
    Mollo, Emanuela
    Taruscio, Domenica
    Vittozzi, Luciano
    Bianchi, Fabrizio
    PUBLIC HEALTH GENOMICS, 2015, 18 (02) : 113 - 122
  • [34] Text Associative Classification Approach for Mining Arabic Data Set
    Ghareb, Abdullah S.
    Hamdan, Abdul Razak
    Abu Bakar, Azuraliza
    2012 4TH CONFERENCE ON DATA MINING AND OPTIMIZATION (DMO), 2012, : 114 - 120
  • [35] A hybrid approach for improving the accuracy of classification algorithms in data mining
    Akgobek, Omer
    ENERGY EDUCATION SCIENCE AND TECHNOLOGY PART A-ENERGY SCIENCE AND RESEARCH, 2012, 29 (02) : 1039 - 1054
  • [36] A Novel Data Mining Approach for Multi Variant Text Classification
    Dsouza, Kevin Joy
    Ansari, Zaheed Ahmed
    2015 IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING IN EMERGING MARKETS (CCEM), 2016, : 68 - 73
  • [37] A Novel Visualization Approach for Data-Mining-Related Classification
    Seifert, Christin
    Lex, Elisabeth
    INFORMATION VISUALIZATION, IV 2009, PROCEEDINGS, 2009, : 490 - 495
  • [38] A Data Mining Approach to Rainfall Intensity Classification Using TRMM/TMI Data
    Chen, Shan-Tai
    Dou, Shung-Lin
    Chen, Wann-Jin
    JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2008, 12 (06) : 516 - 522
  • [39] Dry Bean and Anthracnose Development From Seeds With Varying Symptom Severity
    Halvorson, Jessica M.
    Lamppa, Robin S.
    Simons, Kristin
    Conner, Robert L.
    Pasche, Julie S.
    PLANT DISEASE, 2021, 105 (02) : 392 - 399
  • [40] Identification of Dry Bean Seeds Using PSO Feature Selection Technique
    Yasar, Ali
    2024 59TH INTERNATIONAL SCIENTIFIC CONFERENCE ON INFORMATION, COMMUNICATION AND ENERGY SYSTEMS AND TECHNOLOGIES, ICEST 2024, 2024,