Assessing temporal data partitioning scenarios for estimating reference evapotranspiration with machine learning techniques in arid regions

Cited by: 27
Authors
Kazemi, Mohammad Hossein [1]
Shiri, Jalal [1, 2]
Marti, Pau [3]
Majnooni-Heris, Abolfazl [1]
Affiliations
[1] Univ Tabriz, Fac Agr, Water Engn Dept, Tabriz, Iran
[2] Univ Tabriz, Fac Civil Engn, Ctr Excellence Hydroinformat, Tabriz, Iran
[3] Univ Illes Balears, Area Engn Agroforestal, Carretera Valldemossa Km 7-5, Palma De Mallorca 07022, Spain
Keywords
Evapotranspiration; Gene expression programming; Hold out; K-fold validation; MODELING REFERENCE EVAPOTRANSPIRATION; NEURAL-NETWORKS; TIME-SERIES; TEMPERATURE; ALGORITHMS; STRATEGIES; EQUATIONS; SELECTION
DOI
10.1016/j.jhydrol.2020.125252
Chinese Library Classification (CLC) number
TU [Building Science]
Discipline classification code
0813
Abstract
Data-driven machine learning techniques have recently been widely applied to model reference evapotranspiration (ETo) under various climatic conditions, using differing numbers of sites and lengths of available data records. A major issue in applying these models is the proper selection of training and testing data sets. Although some spatial generalization approaches have been recommended for this purpose, no specific local (temporal) data partitioning strategies have been recommended for machine learning based ETo estimation. The present study evaluates different hold-out and k-fold validation temporal data partitioning strategies when using the gene expression programming (GEP) technique to estimate daily ETo in arid regions. The k-fold validation strategies used annual, monthly and growing-season patterns as test data sets. Although the commonly used partitioning of the available patterns into training and testing sets gave accurate results, statistical analysis showed that the results obtained through k-fold validation were more reliable. A two-block partitioning strategy with chronological data selection for training and testing provided the most accurate results among the hold-out procedures (mean scatter index (SI) of 0.162). Fixing the extreme ETo values as the training data set in hold-out procedures provided the least accurate results, with considerable over- and underestimation of ETo (mean SI of 0.506). Results based on hold-out approaches can be biased or only partially valid, depending on which part of the time series is selected as test data. K-fold validation yielded the lowest over- and underestimation of ETo. Further, among the k-fold validation strategies, using monthly patterns as the minimum affordable test size produced higher errors, while using the complete patterns of one growing season provided more accurate results.
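The two families of temporal partitioning compared in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 0.75 training fraction, and the definition of the scatter index as RMSE divided by the mean observed value are all assumptions made for illustration, and the GEP model itself is not reproduced.

```python
import numpy as np


def scatter_index(observed, predicted):
    """Scatter index, ASSUMED here to be RMSE divided by the mean observed value."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    rmse = np.sqrt(np.mean((predicted - observed) ** 2))
    return rmse / np.mean(observed)


def chronological_holdout(n_samples, train_fraction=0.75):
    """Two-block hold-out: the earliest block trains, the latest block tests.

    The 0.75 default is an illustrative assumption, not a value from the study.
    """
    split = int(n_samples * train_fraction)
    indices = np.arange(n_samples)
    return indices[:split], indices[split:]


def leave_one_period_out(period_labels):
    """K-fold by period: each distinct label (a year, month or growing
    season) serves as the test set exactly once; the rest is training data."""
    period_labels = np.asarray(period_labels)
    for period in np.unique(period_labels):
        test_mask = period_labels == period
        yield period, np.flatnonzero(~test_mask), np.flatnonzero(test_mask)
```

With daily records labeled by year, `leave_one_period_out(years)` would yield one fold per year; passing month or growing-season labels instead gives the other k-fold variants. Averaging `scatter_index` over the folds gives a single error figure that can be set against the one obtained from `chronological_holdout`.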
Pages: 14