Application of training data affects success in broad-scale local climate zone mapping

Cited by: 13
Authors
Xu, Chunxue [1]
Hystad, Perry [2]
Chen, Rui [3]
Van Den Hoek, Jamon [1]
Hutchinson, Rebecca A. [4, 5]
Hankey, Steve [6]
Kennedy, Robert [1]
Affiliations
[1] Oregon State Univ, Coll Earth Ocean & Atmospher Sci, Corvallis, OR 97331 USA
[2] Oregon State Univ, Coll Publ Hlth & Human Sci, Corvallis, OR 97331 USA
[3] Tufts Univ, Dept Comp Sci, Medford, MA 02155 USA
[4] Oregon State Univ, Sch Elect Engn & Comp Sci, Corvallis, OR 97331 USA
[5] Oregon State Univ, Dept Fisheries Wildlife & Conservat Sci, Corvallis, OR 97331 USA
[6] VA Tech, Sch Publ & Int Affairs, Blacksburg, VA USA
Keywords
Local climate zone; Machine learning; Training areas; Crowdsourced data; Spatial autocorrelation; DIFFERENCE WATER INDEX; SENTINEL-2; IMAGES; CROSS-VALIDATION; CLASSIFICATION; FOREST; NDWI
DOI
10.1016/j.jag.2021.102482
Chinese Library Classification (CLC) number
TP7 [Remote sensing technology]
Discipline classification codes
081102; 0816; 081602; 083002; 1404
Abstract
Satellite imagery has been widely used to map urbanization processes. To address the urgent need for urban landscape mapping that goes beyond urban footprint analysis, the local climate zone (LCZ) scheme has been increasingly used to reveal the urban forms and functions important to urban heat islands and micro-climates across the globe. As with most supervised classification strategies, proper application of training data is critical for the success of LCZ classification models. However, the collection and application of LCZ training areas bring with them two challenges that may affect mapping success. First, because digitizing training areas is a time-consuming task, there is a broad effort in the LCZ mapping community to create a crowdsourced data collection among different experts. However, this strategy likely leads to inconsistencies in labels that could weaken models. Second, the LCZ labeling process typically involves the delineation of large zones from which multiple training samples are drawn, but those samples are likely spatially autocorrelated and lead to overly optimistic estimates of model accuracy. Although both effects - inconsistent labeling and spatial autocorrelation - are theoretically possible, it is unknown whether they substantially affect accuracy. We investigated both issues, specifically asking: (i) how do the discrepancies of LCZ labeling by different experts impact broad-scale LCZ mapping? (ii) to what extent does spatial autocorrelation affect model predictive power? We used two classifiers (Random Forests and ResNets) to map eight metropolitan areas in the US into LCZs, comparing training areas drawn by different or consistent interpreters, and data-splitting strategies that allow or reduce spatial autocorrelation. First, we found large discrepancies among results built from crowdsourced training areas digitized by different experts; improving the consistency of labels can lead to substantial improvements in LCZ classification accuracy. Second, we found that spatial autocorrelation can boost the apparent accuracy of the classifier by 16% to 21%, leading to erroneous interpretation of mapping results. The two effects interplay as well: spatial autocorrelation in the raw data can lead to an underestimation of the model's predictive error when modeling with crowdsourced training areas of high inconsistency. Due to the uncertainty in the labeling process and spatial autocorrelation in derived training data, broad-scale LCZ mapping results should be interpreted with caution.
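The contrast the abstract draws between splitting rules that allow or reduce spatial autocorrelation can be illustrated with a minimal Python sketch. It is not the authors' procedure, which the abstract does not specify. It assumes that each training sample carries the identifier of the digitized polygon it was drawn from; the field names `polygon_id` and `sample_id`, both function names, and the toy data are hypothetical. A random split lets samples from one polygon fall on both sides, while a polygon-level split keeps each polygon entirely in training or entirely in testing.

```python
import random
from collections import defaultdict


def random_split(samples, test_fraction=0.3, seed=0):
    """Split samples at random, ignoring which polygon they came from."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def polygon_split(samples, test_fraction=0.3, seed=0):
    """Split so that all samples from one polygon land on the same side."""
    by_polygon = defaultdict(list)
    for sample in samples:
        by_polygon[sample["polygon_id"]].append(sample)
    polygon_ids = sorted(by_polygon)
    rng = random.Random(seed)
    rng.shuffle(polygon_ids)
    # Hold out whole polygons, at least one.
    n_test = max(1, int(round(len(polygon_ids) * test_fraction)))
    test = [s for pid in polygon_ids[:n_test] for s in by_polygon[pid]]
    train = [s for pid in polygon_ids[n_test:] for s in by_polygon[pid]]
    return train, test


def shared_polygons(train, test):
    """Return polygon ids that appear in both the training and test sets."""
    return {s["polygon_id"] for s in train} & {s["polygon_id"] for s in test}


# Hypothetical toy data: 10 polygons with 20 samples each.
samples = [
    {"polygon_id": p, "sample_id": i} for p in range(10) for i in range(20)
]

rand_train, rand_test = random_split(samples)
poly_train, poly_test = polygon_split(samples)
```

Calling `shared_polygons(poly_train, poly_test)` returns an empty set by construction, whereas the random split generally leaves polygons represented on both sides. That leakage is the kind that can inflate apparent accuracy.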
Pages: 13