Enhancing crop yield prediction in Senegal using advanced machine learning techniques and synthetic data

Cited by: 1
Authors
Razavi, Mohammad Amin [1]
Nejadhashemi, A. Pouyan [2]
Majidi, Babak [3]
Razavi, Hoda S. [2]
Kpodo, Josue [2, 4]
Eeswaran, Rasu [2, 5, 6]
Ciampitti, Ignacio [6]
Prasad, P. V. Vara [7]
Affiliations
[1] Univ Tehran, Sch Elect & Comp Engn, Tehran, Iran
[2] Michigan State Univ, Dept Biosyst & Agr Engn, E Lansing, MI 48824 USA
[3] Khatam Univ, Dept Comp Engn, Tehran, Iran
[4] Michigan State Univ, Dept Comp Sci & Engn, E Lansing, MI USA
[5] Univ Jaffna, Fac Agr, Dept Agron, Kilinochchi, Sri Lanka
[6] Kansas State Univ, Dept Agron, Manhattan, KS USA
[7] Kansas State Univ, Feed Future Sustainable Intensificat Innovat Lab, Manhattan, KS USA
Funding
National Institute of Food and Agriculture (USA)
Keywords
Crop yield prediction; Variational auto encoder; Pattern recognition on spatiotemporal and physiographical variables; Synthetic tabular data generation; Ensemble learning; INTERPOLATION METHODS; CLIMATE-CHANGE; AGRICULTURE; MANAGEMENT; SYSTEMS
DOI
10.1016/j.aiia.2024.11.005
Chinese Library Classification
S [Agricultural Sciences]
Discipline classification code
09
Abstract
In this study, we employ advanced data-driven techniques to investigate the complex relationships between the yields of five major crops and various geographical and spatiotemporal features in Senegal. We analyze how these features influence crop yields by utilizing remotely sensed data. Our methodology incorporates clustering algorithms and correlation matrix analysis to identify significant patterns and dependencies, offering a comprehensive understanding of the factors affecting agricultural productivity in Senegal. To optimize the model's performance and identify the optimal hyperparameters, we implemented a comprehensive grid search across four distinct machine learning regressors: Random Forest, Extreme Gradient Boosting (XGBoost), Categorical Boosting (CatBoost), and Light Gradient-Boosting Machine (LightGBM). Each regressor offers unique functionalities, enhancing our exploration of potential model configurations. The top-performing models were selected based on evaluating multiple performance metrics, ensuring robust and accurate predictive capabilities. The results demonstrated that XGBoost and CatBoost perform better than the other two. We introduce synthetic crop data generated using a Variational Auto Encoder to address the challenges posed by limited agricultural datasets. By achieving high similarity scores with real-world data, our synthetic samples enhance model robustness, mitigate overfitting, and provide a viable solution for small dataset issues in agriculture. Our approach distinguishes itself by creating a flexible model applicable to various crops together. By integrating five crop datasets and generating high-quality synthetic data, we improve model performance, reduce overfitting, and enhance realism. Our findings provide crucial insights for productivity drivers in key cropping systems, enabling robust recommendations and strengthening the decision-making capabilities of policymakers and farmers in data-scarce regions.
(c) 2024 The Authors. Publishing services by Elsevier B.V. on behalf of KeAi Communications Co., Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
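The grid search over hyperparameters mentioned in the abstract can be illustrated with a minimal, dependency-free sketch. Everything in it is an assumption made for illustration: the toy data is invented, the function names are hypothetical, and a simple k-nearest-neighbours regressor stands in for the tree-based regressors named in the abstract (Random Forest, XGBoost, CatBoost, LightGBM) so that the example runs with the Python standard library alone. It shows only the general pattern of scoring every hyperparameter combination by cross-validation on more than one metric, not the authors' actual pipeline.

```python
import itertools
import math
import random


def knn_predict(train_X, train_y, x, k, weighted):
    """Predict one target from the k nearest training points (stand-in model)."""
    nearest = sorted(
        (math.dist(row, x), target) for row, target in zip(train_X, train_y)
    )[:k]
    # Optionally weight neighbours by inverse distance; otherwise weight equally.
    weights = [1.0 / (d + 1e-9) if weighted else 1.0 for d, _ in nearest]
    return sum(w * t for w, (_, t) in zip(weights, nearest)) / sum(weights)


def k_fold_indices(n, n_folds, seed=0):
    """Shuffle row indices once and deal them into n_folds disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def cross_validate(X, y, params, n_folds=5):
    """Return mean RMSE and MAE over the held-out folds for one configuration."""
    rmse_scores, mae_scores = [], []
    for fold in k_fold_indices(len(X), n_folds):
        held_out = set(fold)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        errors = [knn_predict(train_X, train_y, X[i], **params) - y[i] for i in fold]
        rmse_scores.append(math.sqrt(sum(e * e for e in errors) / len(errors)))
        mae_scores.append(sum(abs(e) for e in errors) / len(errors))
    return {
        "rmse": sum(rmse_scores) / n_folds,
        "mae": sum(mae_scores) / n_folds,
    }


def grid_search(X, y, grid, n_folds=5):
    """Score every combination in the grid and pick the one with lowest RMSE."""
    names = sorted(grid)
    results = []
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        results.append((params, cross_validate(X, y, params, n_folds)))
    best_params, best_scores = min(results, key=lambda r: r[1]["rmse"])
    return best_params, best_scores, results


# Invented toy data: two made-up features and a noisy linear "yield" target.
rng = random.Random(42)
X = [[rng.uniform(0, 1), rng.uniform(0, 1)] for _ in range(60)]
y = [3.0 * a + 2.0 * b + rng.gauss(0, 0.1) for a, b in X]

# Hypothetical hyperparameter grid for the stand-in model.
grid = {"k": [1, 3, 5, 9], "weighted": [False, True]}
best_params, best_scores, results = grid_search(X, y, grid)
print(best_params, best_scores)
```

In practice, the same loop structure would apply with any regressor substituted for `knn_predict`; the choice of RMSE as the selection metric here is arbitrary, and the abstract does not state which metrics the authors used.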
Pages: 99-114
Number of pages: 16