Using Machine Learning and Feature Selection for Alfalfa Yield Prediction

Cited by: 31
Authors
Whitmire, Christopher D. [1]
Vance, Jonathan M. [2]
Rasheed, Hend K.
Missaoui, Ali [3]
Rasheed, Khaled M. [1, 2]
Maier, Frederick W. [1]
Affiliations
[1] Univ Georgia, Inst Artificial Intelligence, 515 Boyd Grad Studies,200 DW Brooks Dr, Athens, GA 30602 USA
[2] Univ Georgia, Dept Comp Sci, 415 Boyd Grad Studies,200 D W Brooks Dr, Athens, GA 30602 USA
[3] Univ Georgia, Inst Plant Breeding Genet & Genom, Dept Crop & Soil Sci, 4317 Miller Plant Sci, Athens, GA 30602 USA
Keywords
alfalfa; cross validation; feature selection; machine learning; regression; yield prediction
DOI
10.3390/ai2010006
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Predicting alfalfa biomass and crop yield for livestock feed is important to the daily lives of virtually everyone, and many features of data from this domain, combined with corresponding weather data, can be used to train machine learning models for yield prediction. In this work, we used yield data for different alfalfa varieties from multiple years in Kentucky and Georgia, and we compared the impact of different feature selection methods on machine learning (ML) models trained to predict alfalfa yield. Linear regression, regression tree, support vector machine, neural network, Bayesian regression, and nearest neighbor models were all developed with cross validation. The features used included weather data, historical yield data, and the sown date. The feature selection methods compared were a correlation-based method, the ReliefF method, and a wrapper method. We found that the best method was the correlation-based method, and the feature set it found consisted of the Julian day of the harvest, the number of days between the sown and harvest dates, cumulative solar radiation since the previous harvest, and cumulative rainfall since the previous harvest. Using these features, the k-nearest neighbor and random forest methods achieved an average R value over 0.95 and an average mean absolute error of less than 200 lbs./acre. Our top R² of 0.90 beats a previous work's best R² of 0.87. Our primary contribution is the demonstration that ML, with feature selection, shows promise in predicting crop yields even on simple datasets with a handful of features, and that reporting accuracies in R and R² offers an intuitive way to compare results among various crops.
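The pipeline the abstract describes (filter features by correlation with yield, then evaluate a k-nearest-neighbor regressor with cross validation, reporting R and mean absolute error) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic, the feature names are hypothetical stand-ins for the ones named in the abstract, and a simple ranking by absolute Pearson correlation is swapped in for whatever correlation-based selector the paper actually used.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient R between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def select_by_correlation(rows, target, names, top_n):
    """Rank features by |R| with the target and keep the top_n names."""
    scores = {name: abs(pearson([row[i] for row in rows], target))
              for i, name in enumerate(names)}
    return sorted(names, key=lambda name: scores[name], reverse=True)[:top_n]

def knn_predict(train_x, train_y, query, k):
    """Mean target of the k nearest training rows (Euclidean distance)."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)

def cross_validate(rows, target, k_neighbors=3, folds=5):
    """Return (R, MAE) computed over out-of-fold k-NN predictions."""
    preds = [0.0] * len(rows)
    for f in range(folds):
        train_idx = [i for i in range(len(rows)) if i % folds != f]
        test_idx = [i for i in range(len(rows)) if i % folds == f]
        train_x = [rows[i] for i in train_idx]
        train_y = [target[i] for i in train_idx]
        for i in test_idx:
            preds[i] = knn_predict(train_x, train_y, rows[i], k_neighbors)
    mae = sum(abs(p - t) for p, t in zip(preds, target)) / len(target)
    return pearson(preds, target), mae

# Synthetic data (an assumption for illustration; not the paper's dataset).
# All features are drawn from [0, 1], so no rescaling is needed before k-NN.
random.seed(0)
names = ["julian_day", "days_since_sown", "solar_rad", "rainfall", "noise"]
rows = [[random.random() for _ in names] for _ in range(200)]
# In this toy setup, yield depends only on solar_rad and rainfall.
target = [2000 + 1500 * row[2] + 800 * row[3] + random.gauss(0, 20)
          for row in rows]

selected = select_by_correlation(rows, target, names, top_n=2)
columns = [names.index(name) for name in selected]
rows_selected = [[row[c] for c in columns] for row in rows]
r, mae = cross_validate(rows_selected, target)
print("selected features:", selected)
print("cross-validated R:", round(r, 3), "MAE:", round(mae, 1))
```

Scoring out-of-fold predictions with R and MAE mirrors the metrics the abstract reports, but the numbers printed here come from toy data and say nothing about alfalfa yields.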
Pages: 71-88
Number of pages: 18
Related papers
50 items in total
  • [31] Rice Yield Estimation Using Machine Learning and Feature Selection in Hilly and Mountainous Chongqing, China
    Fan, Li
    Fang, Shibo
    Fan, Jinlong
    Wang, Yan
    Zhan, Linqing
    He, Yongkun
    AGRICULTURE-BASEL, 2024, 14 (09):
  • [32] Performance Evaluation of Best Feature Subsets for Crop Yield Prediction Using Machine Learning Algorithms
    Gopal, Maya P. S.
    Bhargavi, R.
    APPLIED ARTIFICIAL INTELLIGENCE, 2019, 33 (07) : 621 - 642
  • [33] Feature Selection for Wheat Yield Prediction
    Russ, Georg
    Kruse, Rudolf
    RESEARCH AND DEVELOPMENT IN INTELLIGENT SYSTEMS XXVI: INCORPORATING APPLICATIONS AND INNOVATIONS IN INTELLIGENT SYSTEMS XVII, 2010, : 465 - 478
  • [34] Early Prediction of Chronic Kidney Disease Using Machine Learning Algorithms with Feature Selection Techniques
    Habiba, Sultana Umme
    Tasnim, Farzana
    Chowdhury, Mohammad Saeed Hasan
    Islam, Md Khairul
    Nahar, Lutfun
    Mahmud, Tanjim
    Kaiser, M. Shamim
    Hossain, Mohammad Shahadat
    Andersson, Karl
    APPLIED INTELLIGENCE AND INFORMATICS, AII 2023, 2024, 2065 : 224 - 242
  • [35] Efficient prediction of coronary artery disease using machine learning algorithms with feature selection techniques
    Hassan, Md. Mehedi
    Zaman, Sadika
    Rahman, Md. Mushfiqur
    Bairagi, Anupam Kumar
    El-Shafai, Walid
    Rathore, Rajkumar Singh
    Gupta, Deepak
    COMPUTERS & ELECTRICAL ENGINEERING, 2024, 115
  • [36] Feature-selection based data prioritization in mobile traffic prediction using machine learning
    Yamada, Yoshinobu
    Shinkuma, Ryoichi
    Sato, Takehiro
    Oki, Eiji
    2018 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2018,
  • [37] Heart Diseases Prediction for Optimization based Feature Selection and Classification using Machine Learning Methods
    Rajinikanth, N.
    Pavithra, L.
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2021, 12 (02) : 636 - 643
  • [38] Sarcopenia feature selection and risk prediction using machine learning: A cross-sectional study
    Kang, Yang-Jae
    Yoo, Jun-Il
    Ha, Yong-chan
    MEDICINE, 2019, 98 (43)
  • [39] Obsolescence Prediction based on Joint Feature Selection and Machine Learning Techniques
    Trabelsi, Imen
    Zeddini, Besma
    Zolghadri, Marc
    Barkallah, Maher
    Haddar, Mohamed
    ICAART: PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE - VOL 2, 2021, : 787 - 794
  • [40] Feature selection in machine learning prediction systems for renewable energy applications
    Salcedo-Sanz, S.
    Cornejo-Bueno, L.
    Prieto, L.
    Paredes, D.
    Garcia-Herrera, R.
    RENEWABLE & SUSTAINABLE ENERGY REVIEWS, 2018, 90 : 728 - 741