Heterogeneous Dual-Task Clustering with Visual-Textual Information

Cited by: 3
Authors
Yan, Xiaoqiang [1 ]
Mao, Yiqiao [1 ]
Hu, Shizhe [1 ]
Ye, Yangdong [1 ]
Affiliations
[1] Zhengzhou Univ, Sch Informat Engn, Zhengzhou, Henan, Peoples R China
Funding
National Natural Science Foundation of China;
DOI
10.1137/1.9781611976236.74
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing visual-textual cross-modal clustering techniques seek a clustering partition of the different modalities either by handling each modality independently or by integrating multiple modalities into a shared space, which may result in unsatisfactory performance because of the heterogeneous gap between modalities. To address this problem, we propose a novel heterogeneous dual-task clustering (HDC) method, which explores the high-level relatedness between visual and textual data to improve the performance of each individual task. Our intuition is that although visual and textual data are heterogeneous, they may share related high-level semantics and rich latent correlations, which can lead to improved performance if we treat the clustering of visual data and the clustering of textual data as different but related learning tasks. Specifically, the problem of heterogeneous dual-task clustering is formulated as an information-theoretic objective function in which the low-level information in each modality and the high-level relatedness between the modalities are maximally preserved. A progressive optimization method is then proposed that guarantees convergence to a locally optimal solution. Extensive experiments show that the HDC approach compares favorably with several state-of-the-art baselines.
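As a rough illustration of what "preserving information" can mean for a clustering, the sketch below computes the mutual information between items and features before and after the items are merged into clusters. This is not the HDC objective from the paper, whose exact form is not given in this record. The function names `mutual_information` and `cluster_rows` and the toy joint distribution are hypothetical and chosen only for this example.

```python
import math

def mutual_information(joint):
    """Mutual information I(X;Y) in bits for a joint table joint[x][y]."""
    px = [sum(row) for row in joint]            # marginal over rows
    py = [sum(col) for col in zip(*joint)]      # marginal over columns
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:                           # 0 * log(0) is treated as 0
                mi += p * math.log2(p / (px[i] * py[j]))
    return mi

def cluster_rows(joint, labels, k):
    """Merge rows by cluster label, giving a (cluster, feature) joint table."""
    n_cols = len(joint[0])
    merged = [[0.0] * n_cols for _ in range(k)]
    for row, c in zip(joint, labels):
        for j, p in enumerate(row):
            merged[c][j] += p
    return merged

# Toy data (assumed): items 0 and 1 co-occur with feature 0, items 2 and 3 with feature 1.
joint = [[0.25, 0.0], [0.25, 0.0], [0.0, 0.25], [0.0, 0.25]]

mi_full = mutual_information(joint)
mi_good = mutual_information(cluster_rows(joint, [0, 0, 1, 1], 2))
mi_bad = mutual_information(cluster_rows(joint, [0, 1, 0, 1], 2))
print(mi_full, mi_good, mi_bad)
```

In this toy case the partition that groups items with identical feature distributions keeps all of the original item-feature information, while the partition that mixes them keeps none, which is the sense in which an information-theoretic objective can rank candidate partitions.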
Pages: 658-666
Page count: 9