Learning Multi-context Aware Location Representations from Large-scale Geotagged Images

Cited by: 4

Authors:

Yin, Yifang [1]

Zhang, Ying [2]

Liu, Zhenguang [3]

Liang, Yuxuan [1]

Wang, Sheng [1,4]

Shah, Rajiv Ratn [5]

Zimmermann, Roger [1]

Affiliations:

[1] National University of Singapore, Singapore

[2] Northwestern Polytechnical University, Xi'an, China

[3] Zhejiang Gongshang University, Hangzhou, China

[4] Alibaba Group, Singapore

[5] IIIT Delhi, Delhi, India

Source:

Proceedings of the 29th ACM International Conference on Multimedia, MM 2021 | 2021

Keywords:

Location representations; pre-trained neural networks; attention-based fusion; geo-aware applications; FEATURES

DOI:

10.1145/3474085.3475268

Chinese Library Classification (CLC):

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

With the ubiquity of sensor-equipped smartphones, it is common to have multimedia documents uploaded to the Internet that have GPS coordinates associated with them. Utilizing such geotags as an additional feature is intuitively appealing for improving the performance of location-aware applications. However, raw GPS coordinates are fine-grained location indicators without any semantic information. Existing methods on geotag semantic encoding mostly extract hand-crafted, application-specific location representations that heavily depend on large-scale supplementary data and thus cannot perform efficiently on mobile devices. In this paper, we present a machine learning based approach, termed GPS2Vec+, which learns rich location representations by capitalizing on the world-wide geotagged images. Once trained, the model has no dependence on the auxiliary data anymore so it encodes geotags highly efficiently by inference. We extract visual and semantic knowledge from image content and user-generated tags, and transfer the information into locations by using geotagged images as a bridge. To adapt to different application domains, we further present an attention-based fusion framework that estimates the importance of the learnt location representations under different contexts for effective feature fusion. Our location representations yield significant performance improvements over the state-of-the-art geotag encoding methods on image classification and venue annotation.
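
The attention-based fusion idea in the abstract can be illustrated with a minimal sketch. This is not the GPS2Vec+ architecture: the abstract does not specify how importance is estimated, so the dot-product scoring against a context `query` vector, the function names `softmax` and `attention_fuse`, and the toy three-dimensional embeddings are all assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(embeddings, query):
    """Fuse several location embeddings into one vector.

    Assumption: importance is scored by a dot product between each
    embedding and a context-dependent query vector (hypothetical choice).
    """
    scores = [sum(e * q for e, q in zip(emb, query)) for emb in embeddings]
    weights = softmax(scores)
    dim = len(embeddings[0])
    # Weighted sum of the embeddings, one dimension at a time.
    fused = [sum(w * emb[d] for w, emb in zip(weights, embeddings))
             for d in range(dim)]
    return fused, weights


# Toy embeddings standing in for location representations learnt from
# image content and from user tags (hypothetical values).
visual_ctx = [1.0, 0.0, 0.0]
semantic_ctx = [0.0, 1.0, 0.0]
query = [2.0, 0.0, 0.0]  # a context that favours the visual representation

fused, weights = attention_fuse([visual_ctx, semantic_ctx], query)
print(weights, fused)
```

In this toy setting the query aligns with the first embedding, so it receives the larger weight and dominates the fused vector. A learned model would produce the query (or the scores directly) from the application context rather than fixing it by hand.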

Pages: 899-907

Page count: 9
