Learning Multi-context Aware Location Representations from Large-scale Geotagged Images

Cited by: 4
Authors
Yin, Yifang [1]
Zhang, Ying [2]
Liu, Zhenguang [3]
Liang, Yuxuan [1]
Wang, Sheng [1, 4]
Shah, Rajiv Ratn [5]
Zimmermann, Roger [1]
Affiliations
[1] National University of Singapore, Singapore
[2] Northwestern Polytechnical University, Xi'an, China
[3] Zhejiang Gongshang University, Hangzhou, China
[4] Alibaba Group, Singapore
[5] IIIT Delhi, Delhi, India
Source
Proceedings of the 29th ACM International Conference on Multimedia, MM 2021 | 2021
Keywords
Location representations; pre-trained neural networks; attention-based fusion; geo-aware applications; FEATURES
DOI
10.1145/3474085.3475268
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the ubiquity of sensor-equipped smartphones, it is common for multimedia documents uploaded to the Internet to have GPS coordinates associated with them. Utilizing such geotags as an additional feature is intuitively appealing for improving the performance of location-aware applications. However, raw GPS coordinates are fine-grained location indicators without any semantic information. Existing methods for semantic geotag encoding mostly extract hand-crafted, application-specific location representations that depend heavily on large-scale supplementary data and thus cannot run efficiently on mobile devices. In this paper, we present a machine-learning-based approach, termed GPS2Vec+, which learns rich location representations by capitalizing on worldwide geotagged images. Once trained, the model no longer depends on the auxiliary data, so it encodes geotags efficiently at inference time. We extract visual and semantic knowledge from image content and user-generated tags, and transfer the information to locations by using geotagged images as a bridge. To adapt to different application domains, we further present an attention-based fusion framework that estimates the importance of the learnt location representations under different contexts for effective feature fusion. Our location representations yield significant performance improvements over the state-of-the-art geotag encoding methods on image classification and venue annotation.
Pages: 899-907
Page count: 9