Multi-modal Representation Learning for Social Post Location Inference

Cited: 0
Authors
Dai, RuiTing [1 ]
Luo, Jiayi [1 ]
Luo, Xucheng [1 ]
Mo, Lisi [1 ]
Ma, Wanlun [2 ]
Zhou, Fan [1 ]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Informat & Software Engn, Chengdu 610054, Peoples R China
[2] Swinburne Univ Technol, Melbourne, Vic, Australia
Source
ICC 2023 - IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS | 2023
Keywords
Social geographic location; multi-modal social post dataset; multi-modal representation learning; multi-head attention mechanism; PREDICTION;
DOI
10.1109/ICC45041.2023.10279649
Chinese Library Classification (CLC)
TN [Electronic technology, communication technology];
Subject classification code
0809
Abstract
Inferring geographic locations from social posts is essential for many practical location-based applications such as product marketing, point-of-interest recommendation, and infection tracking for COVID-19. Unlike image-based location retrieval or social-post text embedding-based location inference, the combined effect of multi-modal information (i.e., post images, text, and hashtags) for social post positioning has received less attention. In this work, we collect real datasets of social posts with images, texts, and hashtags from Instagram and propose a novel Multi-modal Representation Learning Framework (MRLF) capable of fusing different modalities of social posts for location inference. MRLF integrates a multi-head attention mechanism to enhance location-salient information extraction while significantly improving location inference compared with single-domain-based methods. To handle noisy user-generated textual content, we introduce a novel attention-based character-aware module that considers the relative dependencies between characters of social post texts and hashtags for flexible multi-modal information fusion. The experimental results show that MRLF can make accurate location predictions and opens a new door to understanding the multi-modal data of social posts for online inference tasks.
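The abstract describes fusing image, text, and hashtag representations with a multi-head attention mechanism, but this record does not give the architecture's details. The following is a generic NumPy sketch of multi-head self-attention over one token per modality, followed by mean pooling into a post-level vector; all weights are random and every name (e.g. `multi_head_fuse`) is hypothetical, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_fuse(tokens, num_heads=4, seed=0):
    """Fuse modality embeddings (n_tokens, d_model) into one post-level
    vector via multi-head self-attention plus mean pooling.
    Weights are randomly initialized: this only illustrates the mechanism."""
    n, d = tokens.shape
    assert d % num_heads == 0, "d_model must divide evenly across heads"
    dh = d // num_heads
    rng = np.random.default_rng(seed)
    Wq, Wk, Wv, Wo = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(4))

    def split(x):  # (n, d) -> (num_heads, n, dh)
        return x.reshape(n, num_heads, dh).transpose(1, 0, 2)

    q, k, v = split(tokens @ Wq), split(tokens @ Wk), split(tokens @ Wv)
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(dh))  # (heads, n, n)
    heads = attn @ v                                         # (heads, n, dh)
    fused = heads.transpose(1, 0, 2).reshape(n, d) @ Wo      # concat + project
    return fused.mean(axis=0)                                # (d,) post vector

# Toy per-modality embeddings for a single post (image, text, hashtags).
image_emb, text_emb, hashtag_emb = np.random.default_rng(1).standard_normal((3, 16))
post_vec = multi_head_fuse(np.stack([image_emb, text_emb, hashtag_emb]))
```

In practice each modality would contribute a sequence of learned tokens (e.g. character-level text features, CNN image patches) rather than a single vector, and the projection matrices would be trained end-to-end against the location objective.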
Pages: 6331-6336
Number of pages: 6
Related papers
50 total
  • [21] Learning Multi-Modal Word Representation Grounded in Visual Context
    Zablocki, Eloi
    Piwowarski, Benjamin
    Soulier, Laure
    Gallinari, Patrick
    THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 5626 - 5633
  • [22] Multi-modal anchor adaptation learning for multi-modal summarization
    Chen, Zhongfeng
    Lu, Zhenyu
    Rong, Huan
    Zhao, Chuanjun
    Xu, Fan
    NEUROCOMPUTING, 2024, 570
  • [23] Multi-Modal Representation Learning with Text-Driven Soft Masks
    Park, Jaeyoo
    Han, Bohyung
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 2798 - 2807
  • [24] SSDMM-VAE: variational multi-modal disentangled representation learning
    Mondal, Arnab Kumar
    Sailopal, Ajay
    Singla, Parag
    Ap, Prathosh
    APPLIED INTELLIGENCE, 2023, 53 (07) : 8467 - 8481
  • [25] MMEarth: Exploring Multi-modal Pretext Tasks for Geospatial Representation Learning
    Nedungadi, Vishal
    Kariryaa, Ankit
    Oehmcke, Stefan
    Belongie, Serge
    Igel, Christian
    Lang, Nico
    COMPUTER VISION - ECCV 2024, PT LXIV, 2025, 15122 : 164 - 182
  • [26] A Discriminant Information Theoretic Learning Framework for Multi-modal Feature Representation
    Gao, Lei
    Guan, Ling
    ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2023, 14 (03)
  • [27] Affective Interaction: Attentive Representation Learning for Multi-Modal Sentiment Classification
    Zhang, Yazhou
    Tiwari, Prayag
    Rong, Lu
    Chen, Rui
    Alnajem, Nojoom A.
    Hossain, M. Shamim
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2022, 18 (03)
  • [28] Understanding and Constructing Latent Modality Structures in Multi-Modal Representation Learning
    Jiang, Qian
    Chen, Changyou
    Zhao, Han
    Chen, Liqun
    Ping, Qing
    Tran, Son Dinh
    Xu, Yi
    Zeng, Belinda
    Chilimbi, Trishul
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 7661 - 7671
  • [29] Lightweight Multi-modal Representation Learning for RGB Salient Object Detection
    Xiao, Yun
    Huang, Yameng
    Li, Chenglong
    Liu, Lei
    Zhou, Aiwu
    Tang, Jin
    COGNITIVE COMPUTATION, 2023, 15 (06) : 1868 - 1883