Heterogeneous Dual-Task Clustering with Visual-Textual Information

Cited by: 3
Authors
Yan, Xiaoqiang [1 ]
Mao, Yiqiao [1 ]
Hu, Shizhe [1 ]
Ye, Yangdong [1 ]
Affiliations
[1] Zhengzhou Univ, Sch Informat Engn, Zhengzhou, Henan, Peoples R China
Funding
National Natural Science Foundation of China
DOI
10.1137/1.9781611976236.74
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Existing visual-textual cross-modal clustering techniques seek a clustering partition across modalities either by handling each modality independently or by integrating multiple modalities into a shared space, which may result in unsatisfactory performance because of the heterogeneity gap between modalities. To address this problem, we propose a novel heterogeneous dual-task clustering (HDC) method that explores the high-level relatedness between visual and textual data to improve the performance of each individual task. Our intuition is that although visual and textual data are heterogeneous, they may share related high-level semantics and rich latent correlations, which can lead to improved performance if the clustering of visual and textual data is treated as two different but related learning tasks. Specifically, heterogeneous dual-task clustering is formulated as an information-theoretic objective function in which the low-level information in each modality and the high-level relatedness between modalities are maximally preserved. A progressive optimization method is then proposed to ensure a locally optimal solution. Extensive experiments show that the HDC approach compares favorably with several state-of-the-art baselines.
Pages: 658-666
Page count: 9
Related papers
50 in total
  • [21] VISTA: Visual-Textual Knowledge Graph Representation Learning
    Lee, Jaejun
    Chung, Chanyoung
    Lee, Hochang
    Jo, Sungho
    Whang, Joyce Jiyoung
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023, : 7314 - 7328
  • [22] Fine-Grained Visual-Textual Representation Learning
    He, Xiangteng
    Peng, Yuxin
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2020, 30 (02) : 520 - 531
  • [23] ONLINE PROCESSING OF TEXTUAL ILLUSTRATIONS - IN THE VISUOSPATIAL SKETCHPAD - EVIDENCE FOR DUAL-TASK STUDIES
    KRULEY, P
    SCIAMA, SC
    GLENBERG, AM
    MEMORY & COGNITION, 1994, 22 (03) : 261 - 272
  • [24] Multimodal Logical Inference System for Visual-Textual Entailment
    Suzuki, Riko
    Yanaka, Hitomi
    Yoshikawa, Masashi
    Mineshima, Koji
    Bekki, Daisuke
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019): STUDENT RESEARCH WORKSHOP, 2019, : 386 - 392
  • [25] Visual-textual prototyping of 4D scenes
    Duecker, M
    Geiger, C
    Hunstock, R
    Lehrenfeld, G
    Mueller, W
    1997 IEEE SYMPOSIUM ON VISUAL LANGUAGES, PROCEEDINGS, 1997, : 328 - 335
  • [26] Additional effects of a cognitive task on dual-task training to reduce dual-task interference
    Kimura, Takehide
    Matsuura, Ryouta
    PSYCHOLOGY OF SPORT AND EXERCISE, 2020, 46
  • [27] EFFECTS OF SINGLE-TASK AND DUAL-TASK PRACTICE ON ACQUIRING DUAL-TASK SKILL
    DETWEILER, MC
    LUNDY, DH
    HUMAN FACTORS, 1995, 37 (01) : 193 - 211
  • [28] The effects of dual-task interference on visual search and verbal memory
    Jackson, Kenneth M.
    Shaw, Tyler H.
    Helton, William S.
    ERGONOMICS, 2023, 66 (01) : 125 - 135
  • [29] Alternating dual-task interference between visual words and faces
    Furubacke, Amanda
    Albonico, Andrea
    Barton, Jason J. S.
    BRAIN RESEARCH, 2020, 1746
  • [30] On the Organization of Task-Order and Task-Specific Information in Dual-Task Situations
    Kuebler, Sebastian
    Strobach, Tilo
    Schubert, Torsten
    JOURNAL OF EXPERIMENTAL PSYCHOLOGY-HUMAN PERCEPTION AND PERFORMANCE, 2022, 48 (01) : 94 - 113