Multi-modal Representation Learning for Social Post Location Inference

Cited: 0
Authors
Dai, RuiTing [1 ]
Luo, Jiayi [1 ]
Luo, Xucheng [1 ]
Mo, Lisi [1 ]
Ma, Wanlun [2 ]
Zhou, Fan [1 ]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Informat & Software Engn, Chengdu 610054, Peoples R China
[2] Swinburne Univ Technol, Melbourne, Vic, Australia
Source
ICC 2023 - IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS | 2023
Keywords
Social geographic location; multi-modal social post dataset; multi-modal representation learning; multi-head attention mechanism; PREDICTION;
DOI
10.1109/ICC45041.2023.10279649
Chinese Library Classification (CLC)
TN [Electronic technology, communication technology];
Subject classification code
0809
Abstract
Inferring geographic locations from social posts is essential for many practical location-based applications such as product marketing, point-of-interest recommendation, and infection tracking for COVID-19. Unlike image-based location retrieval or social-post text embedding-based location inference, the combined effect of multi-modal information (i.e., post images, text, and hashtags) for social post positioning has received less attention. In this work, we collect real datasets of social posts with images, texts, and hashtags from Instagram and propose a novel Multi-modal Representation Learning Framework (MRLF) capable of fusing different modalities of social posts for location inference. MRLF integrates a multi-head attention mechanism to enhance location-salient information extraction while significantly improving location inference compared with single-domain-based methods. To handle noisy user-generated textual content, we introduce a novel attention-based character-aware module that considers the relative dependencies between characters of social post texts and hashtags for flexible multi-modal information fusion. The experimental results show that MRLF can make accurate location predictions and opens a new door to understanding the multi-modal data of social posts for online inference tasks.
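The abstract describes fusing image, text, and hashtag representations with a multi-head attention mechanism, but this record does not give the architecture's details. The following is a generic NumPy sketch of multi-head self-attention over one token per modality, followed by mean pooling into a post-level vector; all weights are random and every name (e.g. `multi_head_fuse`) is hypothetical, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_fuse(tokens, num_heads=4, seed=0):
    """Fuse modality embeddings (n_tokens, d_model) into one post-level
    vector via multi-head self-attention plus mean pooling.
    Weights are randomly initialized: this only illustrates the mechanism."""
    n, d = tokens.shape
    assert d % num_heads == 0, "d_model must divide evenly across heads"
    dh = d // num_heads
    rng = np.random.default_rng(seed)
    Wq, Wk, Wv, Wo = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(4))

    def split(x):  # (n, d) -> (num_heads, n, dh)
        return x.reshape(n, num_heads, dh).transpose(1, 0, 2)

    q, k, v = split(tokens @ Wq), split(tokens @ Wk), split(tokens @ Wv)
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(dh))  # (heads, n, n)
    heads = attn @ v                                         # (heads, n, dh)
    fused = heads.transpose(1, 0, 2).reshape(n, d) @ Wo      # concat + project
    return fused.mean(axis=0)                                # (d,) post vector

# Toy per-modality embeddings for a single post (image, text, hashtags).
image_emb, text_emb, hashtag_emb = np.random.default_rng(1).standard_normal((3, 16))
post_vec = multi_head_fuse(np.stack([image_emb, text_emb, hashtag_emb]))
```

In practice each modality would contribute a sequence of learned tokens (e.g. character-level text features, CNN image patches) rather than a single vector, and the projection matrices would be trained end-to-end against the location objective.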
Pages: 6331-6336
Number of pages: 6
Related papers
50 total
  • [21] Learning Multi-Modal Word Representation Grounded in Visual Context
    Zablocki, Eloi
    Piwowarski, Benjamin
    Soulier, Laure
    Gallinari, Patrick
    THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 5626 - 5633
  • [22] Multi-modal anchor adaptation learning for multi-modal summarization
    Chen, Zhongfeng
    Lu, Zhenyu
    Rong, Huan
    Zhao, Chuanjun
    Xu, Fan
    NEUROCOMPUTING, 2024, 570
  • [23] Multi-Modal Representation Learning with Text-Driven Soft Masks
    Park, Jaeyoo
    Han, Bohyung
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 2798 - 2807
  • [24] SSDMM-VAE: variational multi-modal disentangled representation learning
    Mondal, Arnab Kumar
    Sailopal, Ajay
    Singla, Parag
    Ap, Prathosh
    APPLIED INTELLIGENCE, 2023, 53 (07) : 8467 - 8481
  • [25] MMEarth: Exploring Multi-modal Pretext Tasks for Geospatial Representation Learning
    Nedungadi, Vishal
    Kariryaa, Ankit
    Oehmcke, Stefan
    Belongie, Serge
    Igel, Christian
    Lang, Nico
    COMPUTER VISION - ECCV 2024, PT LXIV, 2025, 15122 : 164 - 182
  • [26] A Discriminant Information Theoretic Learning Framework for Multi-modal Feature Representation
    Gao, Lei
    Guan, Ling
    ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2023, 14 (03)
  • [27] Affective Interaction: Attentive Representation Learning for Multi-Modal Sentiment Classification
    Zhang, Yazhou
    Tiwari, Prayag
    Rong, Lu
    Chen, Rui
    Alnajem, Nojoom A.
    Hossain, M. Shamim
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2022, 18 (03)
  • [28] Understanding and Constructing Latent Modality Structures in Multi-Modal Representation Learning
    Jiang, Qian
    Chen, Changyou
    Zhao, Han
    Chen, Liqun
    Ping, Qing
    Tran, Son Dinh
    Xu, Yi
    Zeng, Belinda
    Chilimbi, Trishul
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 7661 - 7671
  • [29] Lightweight Multi-modal Representation Learning for RGB Salient Object Detection
    Xiao, Yun
    Huang, Yameng
    Li, Chenglong
    Liu, Lei
    Zhou, Aiwu
    Tang, Jin
    COGNITIVE COMPUTATION, 2023, 15 (06) : 1868 - 1883