Data mining approach for dry bean seeds classification

Cited by: 8
Authors
Macuacua, Jaime Carlos [1]
Centeno, Jorge Antonio Silva [1]
Amisse, Caisse [2]
Affiliations
[1] Univ Fed Parana, Geomat Dept, Postgrad Program Geodet Sci, Curitiba, Brazil
[2] Rovuma Univ, Nampula, Mozambique
Source
Keywords
Data mining; Machine learning; Hyperparameter optimization; SMOTE technique; Dry bean seeds
DOI
10.1016/j.atech.2023.100240
Chinese Library Classification (CLC) number
S2 [Agricultural Engineering]
Discipline classification code
0828
Abstract
Product quality certification is an important process in agricultural production and productivity. Traditional methods for seed quality classification have limitations such as complex steps, low precision, and slow inspection for large production volumes. Automatic classification techniques based on machine learning and computer vision offer fast, high-throughput solutions. Despite major advances in state-of-the-art automatic classification models, there is still a need to improve these models by incorporating other techniques. In this article, we developed a computer vision system for the automatic classification of different seed varieties. The system is based on machine learning models combined with data mining techniques, and uses a set of features related to the geometry of bean seeds, extracted from binary images. Three machine learning classifiers were compared: Random Forest (RF), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN). The comparison included Principal Component Analysis (PCA), hyperparameter tuning of the machine learning algorithms, and dataset balancing based on the Synthetic Minority Oversampling Technique (SMOTE). The results showed that data mining processes such as PCA, hyperparameter tuning, and the SMOTE technique help to improve the quality of classification results. The KNN classifier performed best, with around 95% accuracy and 96% precision and recall. The best results were obtained by applying hyperparameter tuning and the SMOTE technique in the preprocessing step, which yielded an increase of around 2.6%. The results show that combining data mining in the preprocessing step with machine learning classification methods can effectively and efficiently increase classification accuracy and support automatic bean seed selection based on digital images. This can help small farmers and agricultural managers make decisions about seed selection to increase production.
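The abstract's combination of SMOTE balancing and KNN classification can be illustrated with a minimal sketch. This is not the authors' implementation: the data are made-up two-feature points rather than the paper's bean geometry features, PCA and hyperparameter tuning are left out, and the helper names (`euclidean`, `smote`, `knn_predict`) are hypothetical, written in plain Python only to show the idea.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of the chosen sample, excluding itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: euclidean(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to query."""
    ranked = sorted(zip(train_x, train_y), key=lambda t: euclidean(t[0], query))
    votes = [label for _, label in ranked[:k]]
    return max(set(votes), key=votes.count)


# Toy, made-up data: two features per seed (an assumption, not the paper's dataset).
majority = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1], [1.1, 1.2], [0.8, 0.8], [1.3, 1.0]]
minority = [[3.0, 3.0], [3.2, 2.9], [2.9, 3.1]]

# Oversample the minority class until both classes have the same size.
synthetic = smote(minority, len(majority) - len(minority), k=2)
minority_balanced = minority + synthetic

train_x = majority + minority_balanced
train_y = ["A"] * len(majority) + ["B"] * len(minority_balanced)

print(len(minority_balanced))                      # 6
print(knn_predict(train_x, train_y, [3.0, 2.95]))  # B
```

Because each synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the region the original samples occupy. In practice a maintained library implementation of SMOTE and KNN, together with cross-validated tuning of k, would be preferable to this hand-written version.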
Pages: 12
Related papers
50 in total
  • [21] Storage of dry bean seeds coated with polymers and treated with fungicides
    Pires, LL
    Bragantini, C
    Costa, JLD
    PESQUISA AGROPECUARIA BRASILEIRA, 2004, 39 (07) : 709 - 715
  • [22] AN ENHANCED OPTIMIZATION APPROACH FOR IMPROVING CLASSIFICATION ACCURACY IN DATA MINING
    Krubakaran, Chidambaranathan
    Venkatachalapathy, Kaliyappan
    IIOAB JOURNAL, 2020, 11 (02) : 78 - 84
  • [23] Computational intelligence approach for gene expression data mining and classification
    Wang, ZY
    Kung, SY
    Zhang, JY
    Khan, J
    Xuan, JH
    Wang, Y
    2003 INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOL III, PROCEEDINGS, 2003, : 449 - 452
  • [24] Sentiment-Based Data Mining Approach for Classification and Analysis
    Vashi, Viral
    Babu, L. D. Dhinesh
    PROCEEDINGS OF INTERNATIONAL CONFERENCE ON ICT FOR SUSTAINABLE DEVELOPMENT, ICT4SD 2015, VOL 1, 2016, 408 : 581 - 595
  • [25] A structural data mining approach for the classification of secondary RNA structure
    Lam, Winnie W. M.
    Chan, Keith C. C.
    2005 27TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, VOLS 1-7, 2005, : 4759 - 4762
  • [26] A Data Mining Approach for Sleep Wave and Sleep Stage Classification
    Swetapadma, Aleena
    Swain, Brijesh Raj
    2016 INTERNATIONAL CONFERENCE ON INVENTIVE COMPUTATION TECHNOLOGIES (ICICT), VOL 3, 2015, : 916 - 921
  • [27] Static Cycling Postures Classification Analysis: A Data Mining Approach
    Zakarria, Noor Syuhadah
    Ping, Loh Wei
    4TH INNOVATION AND ANALYTICS CONFERENCE & EXHIBITION (IACE 2019), 2019, 2138
  • [28] A novel switching function approach for data mining classification problems
    Mohammed Hussein Ibrahim
    Mehmet Hacibeyoglu
    Soft Computing, 2020, 24 : 4941 - 4957
  • [29] Mining the data from a hyperheuristic approach using associative classification
    Thabtah, Fadi
    Cowling, Peter
    EXPERT SYSTEMS WITH APPLICATIONS, 2008, 34 (02) : 1093 - 1101
  • [30] An Efficient Approach to Book Review Mining Using Data Classification
    Harvinder
    Soni, Devpriya
    Madan, Shipra
    EMERGING ICT FOR BRIDGING THE FUTURE, VOL 2, 2015, 338 : 629 - 636