Heterogeneous Dual-Task Clustering with Visual-Textual Information

Cited by: 3
Authors
Yan, Xiaoqiang [1 ]
Mao, Yiqiao [1 ]
Hu, Shizhe [1 ]
Ye, Yangdong [1 ]
Affiliations
[1] Zhengzhou Univ, Sch Informat Engn, Zhengzhou, Henan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
10.1137/1.9781611976236.74
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing visual-textual cross-modal clustering techniques focus on finding a clustering partition of different modalities by dealing with each modality independently or by integrating multiple modalities into a shared space, which may result in unsatisfactory performance due to the heterogeneity gap between modalities. To address this problem, we propose a novel heterogeneous dual-task clustering (HDC) method, which is capable of exploring the high-level relatedness between visual and textual data to improve the performance of each individual task. Our intuition is that although the visual and textual data are heterogeneous to each other, they may share related high-level semantics and rich latent correlations, which can lead to improved performance if we treat the clustering of visual and textual data as different but related learning tasks. Specifically, the problem of heterogeneous dual-task clustering is formulated as an information-theoretic objective function, in which the low-level information in each modality and the high-level relatedness between multiple modalities are maximally preserved. Then, a progressive optimization method is proposed to ensure a locally optimal solution. Extensive experiments show that the HDC approach performs noticeably better than several state-of-the-art baselines.
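As a rough illustrative sketch only (the abstract does not state the exact functional form of the HDC objective), an information-theoretic dual-task formulation of this kind can be written as maximizing the information each clustering preserves about its own modality plus a cross-modal relatedness term; the notation below (visual/textual data variables $X_v$, $X_t$, cluster variables $C_v$, $C_t$, trade-off weight $\lambda$) is assumed here for illustration and is not taken from the paper:

$$\max_{C_v,\,C_t}\;\big[\, I(C_v; X_v) + I(C_t; X_t) \,\big] \;+\; \lambda\, I(C_v; C_t),$$

where $I(\cdot\,;\cdot)$ denotes mutual information: the bracketed terms preserve the low-level information within each modality, while the $I(C_v; C_t)$ term captures the high-level relatedness between the two clustering tasks. Under such a formulation, the two tasks could be optimized alternately (e.g., fixing $C_t$ while updating $C_v$, and vice versa), which is one way a progressive optimization scheme can reach a locally optimal solution.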
Pages: 658-666
Number of pages: 9
Related Papers
50 records in total
  • [1] Relational Visual-Textual Information Retrieval
    Messina, Nicola
    SIMILARITY SEARCH AND APPLICATIONS, SISAP 2020, 2020, 12440 : 405 - 411
  • [2] How visual information influences dual-task driving and tracking
    Broeker, Laura
    Haeger, Mathias
    Bock, Otmar
    Kretschmann, Bettina
    Ewolds, Harald
    Kuenzell, Stefan
    Raab, Markus
    EXPERIMENTAL BRAIN RESEARCH, 2020, 238 (03) : 675 - 687
  • [3] Visual-Textual Integration: Emoji as a Supplement in Health Information Design
    Lin, Tingyi S.
    Luo, Yue
    INTERNATIONAL JOURNAL OF DESIGN, 2024, 18 (02): : 37 - 58
  • [4] Dual-task interference and visual encoding
    Jolicoeur, P
    JOURNAL OF EXPERIMENTAL PSYCHOLOGY-HUMAN PERCEPTION AND PERFORMANCE, 1999, 25 (03) : 596 - 616
  • [5] A Better Loss for Visual-Textual Grounding
    Rigoni, Davide
    Serafini, Luciano
    Sperduti, Alessandro
    37TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, 2022, : 49 - 57
  • [6] Dual-task performance and visual attention switching
    Hager, DR
    Payne, DG
    PROCEEDINGS OF THE HUMAN FACTORS AND ERGONOMICS SOCIETY - 40TH ANNUAL MEETING, VOLS 1 AND 2: HUMAN CENTERED TECHNOLOGY - KEY TO THE FUTURE, 1996, : 546 - 550
  • [7] VISUAL-TEXTUAL SENTIMENT ANALYSIS IN PRODUCT REVIEWS
    Ye, Jin
    Peng, Xiaojiang
    Qiao, Yu
    Xing, Hao
    Li, Junli
    Ji, Rongrong
    2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019, : 869 - 873
  • [8] Visual-Textual Semantic Alignment Network for Visual Question Answering
    Tian, Weidong
    Zhang, Yuzheng
    He, Bin
    Zhu, Junjun
    Zhao, Zhongqiu
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2021, PT V, 2021, 12895 : 259 - 270
  • [9] Automatic Generation of Visual-Textual Presentation Layout
    Yang, Xuyong
    Mei, Tao
    Xu, Ying-Qing
    Rui, Yong
    Li, Shipeng
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2016, 12 (02)