Multi-modal Representation Learning for Social Post Location Inference

Cited: 0
|
Authors
Dai, RuiTing [1 ]
Luo, Jiayi [1 ]
Luo, Xucheng [1 ]
Mo, Lisi [1 ]
Ma, Wanlun [2 ]
Zhou, Fan [1 ]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Informat & Software Engn, Chengdu 610054, Peoples R China
[2] Swinburne Univ Technol, Melbourne, Vic, Australia
Keywords
Social geographic location; multi-modal social post dataset; multi-modal representation learning; multi-head attention mechanism; PREDICTION
DOI
10.1109/ICC45041.2023.10279649
CLC number
TN [Electronic Technology, Communication Technology]
Discipline code
0809
Abstract
Inferring geographic locations from social posts is essential for many practical location-based applications such as product marketing, point-of-interest recommendation, and infector tracking for COVID-19. Unlike image-based location retrieval or text-embedding-based location inference, the combined effect of multi-modal information (i.e., post images, text, and hashtags) on social post positioning has received less attention. In this work, we collect real datasets of social posts with images, texts, and hashtags from Instagram and propose a novel Multi-modal Representation Learning Framework (MRLF) capable of fusing different modalities of social posts for location inference. MRLF integrates a multi-head attention mechanism to enhance location-salient information extraction, significantly improving location inference over single-domain methods. To cope with noisy user-generated textual content, we introduce a novel attention-based character-aware module that models the relative dependencies between the characters of post texts and hashtags for flexible multi-modal information fusion. The experimental results show that MRLF makes accurate location predictions and opens a new door to understanding the multi-modal data of social posts for online inference tasks.
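To make the fusion idea in the abstract concrete, below is a minimal sketch (not the authors' released code) of how multi-head attention can fuse image, text, and hashtag representations, with a character-level self-attention encoder standing in for the paper's character-aware module. All module names, dimensions, and the pooling and classification choices are illustrative assumptions.

```python
# Minimal sketch, assuming a PyTorch setup; every name and dimension here
# is hypothetical and not taken from the paper's implementation.
import torch
import torch.nn as nn

class CharAwareEncoder(nn.Module):
    """Character-level encoder: embeds characters, then uses self-attention
    to model relative dependencies between characters of a noisy post."""
    def __init__(self, vocab_size=128, dim=256, heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, char_ids):                      # (B, L) character ids
        x = self.embed(char_ids)                      # (B, L, dim)
        x, _ = self.attn(x, x, x)                     # char-to-char attention
        return x.mean(dim=1)                          # (B, dim) pooled vector

class MultiModalFusion(nn.Module):
    """Stacks one vector per modality and applies multi-head attention
    across modalities before classifying into candidate locations."""
    def __init__(self, dim=256, heads=4, num_locations=100):
        super().__init__()
        self.img_proj = nn.Linear(2048, dim)          # e.g. CNN image features
        self.text_enc = CharAwareEncoder(dim=dim, heads=heads)
        self.tag_enc = CharAwareEncoder(dim=dim, heads=heads)
        self.fuse = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.classifier = nn.Linear(dim, num_locations)

    def forward(self, img_feat, text_ids, tag_ids):
        tokens = torch.stack([
            self.img_proj(img_feat),                  # image modality
            self.text_enc(text_ids),                  # post-text modality
            self.tag_enc(tag_ids),                    # hashtag modality
        ], dim=1)                                     # (B, 3, dim)
        fused, _ = self.fuse(tokens, tokens, tokens)  # cross-modal attention
        return self.classifier(fused.mean(dim=1))     # location logits

# Usage with random inputs: 2048-d image features, character-id sequences.
model = MultiModalFusion()
logits = model(torch.randn(4, 2048),
               torch.randint(0, 128, (4, 60)),
               torch.randint(0, 128, (4, 20)))
print(logits.shape)                                   # torch.Size([4, 100])
```

Treating each modality as a token and letting attention weigh them against one another is one plausible reading of "flexible multi-modal information fusion"; the paper itself may fuse at a finer granularity.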
Pages: 6331 - 6336
Page count: 6
Related Papers
(50 in total)
  • [41] Multi-modal Alignment using Representation Codebook
    Duan, Jiali
    Chen, Liqun
    Tran, Son
    Yang, Jinyu
    Xu, Yi
    Zeng, Belinda
    Chilimbi, Trishul
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 15630 - 15639
  • [42] Deep multi-modal learning for joint linear representation of nonlinear dynamical systems
    Qian, Shaodi
    Chou, Chun-An
    Li, Jr-Shin
    SCIENTIFIC REPORTS, 2022, 12 (01)
  • [43] Multi-modal Relation Distillation for Unified 3D Representation Learning
    Wang, Huiqun
    Bao, Yiping
    Pan, Panwang
    Li, Zeming
    Liu, Xiao
    Yang, Ruijie
    Huang, Di
    COMPUTER VISION - ECCV 2024, PT XXXIII, 2025, 15091 : 364 - 381
  • [44] Exploiting Multi-modal Fusion for Robust Face Representation Learning with Missing Modality
    Zhu, Yizhe
    Sun, Xin
    Zhou, Xi
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT II, 2023, 14255 : 283 - 294
  • [45] Molecular Joint Representation Learning via Multi-Modal Information of SMILES and Graphs
    Wu, Tianyu
    Tang, Yang
    Sun, Qiyu
    Xiong, Luolin
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2023, 20 (05) : 3044 - 3055
  • [46] MMKRL: A robust embedding approach for multi-modal knowledge graph representation learning
    Lu, Xinyu
    Wang, Lifang
    Jiang, Zejun
    He, Shichang
    Liu, Shizhong
    APPLIED INTELLIGENCE, 2022, 52 (07) : 7480 - 7497
  • [48] Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning
    Liang, Weixin
    Zhang, Yuhui
    Kwon, Yongchan
    Yeung, Serena
    Zou, James
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [49] Representation learning using step-based deep multi-modal autoencoders
    Bhatt, Gaurav
    Jha, Piyush
    Raman, Balasubramanian
    PATTERN RECOGNITION, 2019, 95 : 12 - 23
  • [50] MutualFormer: Multi-modal Representation Learning via Cross-Diffusion Attention
    Wang, Xixi
    Wang, Xiao
    Jiang, Bo
    Tang, Jin
    Luo, Bin
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132 (09) : 3867 - 3888