On the Opportunities and Challenges of Foundation Models for GeoAI (Vision Paper)

Cited by: 12
Authors
Mai, Gengchen [1]
Huang, Weiming [2]
Sun, Jin [3]
Song, Suhang [4]
Mishra, Deepak [5]
Liu, Ninghao [3]
Gao, Song [6]
Liu, Tianming [3]
Cong, Gao [2]
Hu, Yingjie [7]
Cundy, Chris [8]
Li, Ziyuan [9]
Zhu, Rui [10]
Lao, Ni [11]
Affiliations
[1] Univ Georgia, Dept Geog, 210 Field St, Athens, GA USA
[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Block N4,50 Nanyang Ave, Singapore, Singapore
[3] Univ Georgia Athens, Sch Comp, 415 Boyd Res & Educ Ctr, Athens, GA 30602 USA
[4] Univ Georgia, Coll Publ Hlth, Rhodes Hall,105 Spear Rd, Athens, GA 30602 USA
[5] Univ Georgia, Dept Geog, 210 Field St, Athens, GA 30602 USA
[6] Univ Wisconsin Madison, Dept Geog, Geospatial Data Sci Lab, Sci Hall,550 N Pk St, Madison, WI 53715 USA
[7] Univ Buffalo, Dept Geog, GeoAI Lab, Ste 105, Buffalo, NY 14261 USA
[8] Stanford Univ, Dept Comp Sci, 353 Jane Stanford Way, Stanford, CA 94305 USA
[9] Univ Connecticut, Sch Business, 2100 Hillside Rd, Storrs, CT 06269 USA
[10] Sch Geog Sci, Univ Rd, Bristol BS8 1SS, Avon, England
[11] Google, 1600 Amphitheatre Pkwy, Mountain View, CA 94043 USA
Funding
National Science Foundation (USA)
Keywords
Foundation models; geospatial artificial intelligence; multimodal learning; geographically weighted regression; urban land-use; geospatial semantics; health geography; knowledge graph; trajectories; location; context; impact; place
DOI
10.1145/3653070
Chinese Library Classification (CLC) number
TP7 [Remote sensing technology]
Discipline classification codes
081102; 0816; 081602; 083002; 1404
Abstract
Large pre-trained models, also known as foundation models (FMs), are trained in a task-agnostic manner on large-scale data and can be adapted to a wide range of downstream tasks by fine-tuning, few-shot, or even zero-shot learning. Despite their successes in language and vision tasks, we have not yet seen an attempt to develop foundation models for geospatial artificial intelligence (GeoAI). In this work, we explore the promises and challenges of developing multimodal foundation models for GeoAI. We first investigate the potential of many existing FMs by testing their performance on seven tasks across multiple geospatial domains, including Geospatial Semantics, Health Geography, Urban Geography, and Remote Sensing. Our results indicate that on several geospatial tasks that involve only the text modality, such as toponym recognition, location description recognition, and US state-level/county-level dementia time series forecasting, the task-agnostic large language models (LLMs) can outperform task-specific fully supervised models in a zero-shot or few-shot learning setting. However, on other geospatial tasks, especially tasks that involve multiple data modalities (e.g., POI-based urban function classification, street view image-based urban noise intensity classification, and remote sensing image scene classification), existing FMs still underperform task-specific models. Based on these observations, we propose that one of the major challenges of developing an FM for GeoAI is to address the multimodal nature of geospatial tasks. After discussing the distinct challenges of each geospatial data modality, we suggest the possibility of a multimodal FM that can reason over various types of geospatial data through geospatial alignments. We conclude this article by discussing the unique risks and challenges of developing such a model for GeoAI.
Pages: 46