Multi-modal Representation Learning for Social Post Location Inference

Cited: 0
|
Authors
Dai, RuiTing [1 ]
Luo, Jiayi [1 ]
Luo, Xucheng [1 ]
Mo, Lisi [1 ]
Ma, Wanlun [2 ]
Zhou, Fan [1 ]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Informat & Software Engn, Chengdu 610054, Peoples R China
[2] Swinburne Univ Technol, Melbourne, Vic, Australia
Keywords
Social geographic location; multi-modal social post dataset; multi-modal representation learning; multi-head attention mechanism; PREDICTION
DOI
10.1109/ICC45041.2023.10279649
CLC number
TN [Electronic Technology, Communication Technology]
Discipline code
0809
Abstract
Inferring geographic locations from social posts is essential for many practical location-based applications such as product marketing, point-of-interest recommendation, and infector tracking for COVID-19. Unlike image-based location retrieval or text-embedding-based location inference, the combined effect of multi-modal information (i.e., post images, text, and hashtags) on social post positioning has received less attention. In this work, we collect real datasets of social posts with images, texts, and hashtags from Instagram and propose a novel Multi-modal Representation Learning Framework (MRLF) capable of fusing different modalities of social posts for location inference. MRLF integrates a multi-head attention mechanism to enhance location-salient information extraction, significantly improving location inference over single-domain methods. To cope with noisy user-generated textual content, we introduce a novel attention-based character-aware module that models the relative dependencies between the characters of post texts and hashtags for flexible multi-modal information fusion. The experimental results show that MRLF makes accurate location predictions and opens a new door to understanding the multi-modal data of social posts for online inference tasks.
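To make the fusion idea in the abstract concrete, below is a minimal sketch (not the authors' released code) of how multi-head attention can fuse image, text, and hashtag representations, with a character-level self-attention encoder standing in for the paper's character-aware module. All module names, dimensions, and the pooling and classification choices are illustrative assumptions.

```python
# Minimal sketch, assuming a PyTorch setup; every name and dimension here
# is hypothetical and not taken from the paper's implementation.
import torch
import torch.nn as nn

class CharAwareEncoder(nn.Module):
    """Character-level encoder: embeds characters, then uses self-attention
    to model relative dependencies between characters of a noisy post."""
    def __init__(self, vocab_size=128, dim=256, heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, char_ids):                      # (B, L) character ids
        x = self.embed(char_ids)                      # (B, L, dim)
        x, _ = self.attn(x, x, x)                     # char-to-char attention
        return x.mean(dim=1)                          # (B, dim) pooled vector

class MultiModalFusion(nn.Module):
    """Stacks one vector per modality and applies multi-head attention
    across modalities before classifying into candidate locations."""
    def __init__(self, dim=256, heads=4, num_locations=100):
        super().__init__()
        self.img_proj = nn.Linear(2048, dim)          # e.g. CNN image features
        self.text_enc = CharAwareEncoder(dim=dim, heads=heads)
        self.tag_enc = CharAwareEncoder(dim=dim, heads=heads)
        self.fuse = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.classifier = nn.Linear(dim, num_locations)

    def forward(self, img_feat, text_ids, tag_ids):
        tokens = torch.stack([
            self.img_proj(img_feat),                  # image modality
            self.text_enc(text_ids),                  # post-text modality
            self.tag_enc(tag_ids),                    # hashtag modality
        ], dim=1)                                     # (B, 3, dim)
        fused, _ = self.fuse(tokens, tokens, tokens)  # cross-modal attention
        return self.classifier(fused.mean(dim=1))     # location logits

# Usage with random inputs: 2048-d image features, character-id sequences.
model = MultiModalFusion()
logits = model(torch.randn(4, 2048),
               torch.randint(0, 128, (4, 60)),
               torch.randint(0, 128, (4, 20)))
print(logits.shape)                                   # torch.Size([4, 100])
```

Treating each modality as a token and letting attention weigh them against one another is one plausible reading of "flexible multi-modal information fusion"; the paper itself may fuse at a finer granularity.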
Pages: 6331 - 6336
Page count: 6
Related Papers
(50 in total)
  • [41] Multi-modal Alignment using Representation Codebook
    Duan, Jiali
    Chen, Liqun
    Tran, Son
    Yang, Jinyu
    Xu, Yi
    Zeng, Belinda
    Chilimbi, Trishul
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 15630 - 15639
  • [42] Deep multi-modal learning for joint linear representation of nonlinear dynamical systems
    Qian, Shaodi
    Chou, Chun-An
    Li, Jr-Shin
    SCIENTIFIC REPORTS, 2022, 12 (01)
  • [43] Multi-modal Relation Distillation for Unified 3D Representation Learning
    Wang, Huiqun
    Bao, Yiping
    Pan, Panwang
    Li, Zeming
    Liu, Xiao
    Yang, Ruijie
    Huang, Di
    COMPUTER VISION - ECCV 2024, PT XXXIII, 2025, 15091 : 364 - 381
  • [44] Exploiting Multi-modal Fusion for Robust Face Representation Learning with Missing Modality
    Zhu, Yizhe
    Sun, Xin
    Zhou, Xi
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT II, 2023, 14255 : 283 - 294
  • [45] Molecular Joint Representation Learning via Multi-Modal Information of SMILES and Graphs
    Wu, Tianyu
    Tang, Yang
    Sun, Qiyu
    Xiong, Luolin
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2023, 20 (05) : 3044 - 3055
  • [46] MMKRL: A robust embedding approach for multi-modal knowledge graph representation learning
    Lu, Xinyu
    Wang, Lifang
    Jiang, Zejun
    He, Shichang
    Liu, Shizhong
    APPLIED INTELLIGENCE, 2022, 52 (07) : 7480 - 7497
  • [48] Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning
    Liang, Weixin
    Zhang, Yuhui
    Kwon, Yongchan
    Yeung, Serena
    Zou, James
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [49] Representation learning using step-based deep multi-modal autoencoders
    Bhatt, Gaurav
    Jha, Piyush
    Raman, Balasubramanian
    PATTERN RECOGNITION, 2019, 95 : 12 - 23
  • [50] MutualFormer: Multi-modal Representation Learning via Cross-Diffusion Attention
    Wang, Xixi
    Wang, Xiao
    Jiang, Bo
    Tang, Jin
    Luo, Bin
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132 (09) : 3867 - 3888