LEARNING UNIFIED SPARSE REPRESENTATIONS FOR MULTI-MODAL DATA

Cited by: 0
Authors
Wang, Kaiye [1 ]
Wang, Wei [1 ]
Wang, Liang [1 ]
Affiliation
[1] Chinese Acad Sci, Natl Lab Pattern Recognit, Ctr Res Intelligent Percept & Comp, Inst Automat, Beijing 100190, Peoples R China
Keywords
Cross-modal retrieval; unified representation learning; joint dictionary learning; multi-modal data
DOI
Not available
Chinese Library Classification
TM [Electrical Technology]; TN [Electronic and Communication Technology]
Discipline Codes
0808; 0809
Abstract
Cross-modal retrieval has recently become an interesting and important research problem, in which users take data of one modality (e.g., text, image, or video) as a query to retrieve relevant data of another modality. In this paper, we present a Multi-modal Unified Representation Learning (MURL) algorithm for cross-modal retrieval, which learns unified sparse representations for multi-modal data that share the same semantics via joint dictionary learning. An l1-norm penalty is imposed on the unified representations to explicitly encourage sparsity, which makes our algorithm more robust. Furthermore, a constraint regularization term forces representations to be similar if their corresponding multi-modal data have must-links, and far apart if they have cannot-links. An iterative algorithm is proposed to solve the resulting objective function. The effectiveness of the proposed method is verified by extensive experimental results on two real-world datasets.
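The record does not reproduce the objective function itself. A plausible form for the kind of joint dictionary learning objective the abstract describes, assuming M modalities with feature matrices X_m, per-modality dictionaries D_m with atoms d_k, a shared sparse code matrix A with columns a_i, trade-off weights lambda and gamma, a cannot-link margin tau, and must-link/cannot-link pair sets M and C (all of this notation is illustrative, not taken from the paper), would be:

\[
\min_{\{D_m\}_{m=1}^{M},\, A} \;\sum_{m=1}^{M} \left\| X_m - D_m A \right\|_F^2 \;+\; \lambda \left\| A \right\|_1 \;+\; \gamma\, \Omega(A)
\quad \text{s.t. } \left\| d_k \right\|_2 \le 1 \ \ \forall k,
\]
\[
\Omega(A) \;=\; \sum_{(i,j)\in\mathcal{M}} \left\| a_i - a_j \right\|_2^2 \;+\; \sum_{(i,j)\in\mathcal{C}} \max\!\left(0,\; \tau - \left\| a_i - a_j \right\|_2 \right)^2 .
\]

The iterative algorithm mentioned in the abstract would then plausibly alternate between updating A with the dictionaries fixed (a sparse coding step, e.g. via proximal gradient methods for the l1 term) and updating each D_m with A fixed, which is the standard pattern in dictionary learning; the paper's exact solver may differ.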
Pages: 3545-3549
Page count: 5