Complementary expert balanced learning for long-tail cross-modal retrieval

Cited by: 0
Authors
Peifang Liu
Xueliang Liu
Affiliations
[1] Hefei University of Technology, Institute of Artificial Intelligence
[2] Hefei Comprehensive National Science Center
Source
Multimedia Systems | 2024, Vol. 30
Keywords
Cross-modal retrieval; Online distillation; Long-tailed learning
DOI
Not available
Abstract
Cross-modal retrieval aims to project high-dimensional cross-modal data into a common low-dimensional space. Previous work relies on balanced datasets for training, but as massive real-world datasets grow, the long-tail phenomenon appears in more and more of them, and training on such imbalanced data has become an emerging challenge. In this paper, we propose complementary expert balanced learning for long-tail cross-modal retrieval to alleviate the impact of long-tail data. In our solution, we design multiple complementary experts to balance the differences between the image and text modalities. For each expert, we design an individual pairs loss to find a common feature space for images and texts. Moreover, we propose a balancing process to mitigate the impact of the long tail on the retrieval accuracy of each expert network. In addition, we propose complementary online distillation, which enables collaboration among the individual experts and improves image-text matching: within each expert, the two modalities learn from each other, and across experts, the networks complement one another in learning the feature embedding between the two modalities. Finally, to address the reduction in the amount of data after long-tail processing, we propose high-score retraining, which also helps the network capture global and robust features with meticulous discrimination. Experimental results on widely used benchmark datasets show that the proposed method is effective for long-tail cross-modal learning.
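The abstract's "complementary online distillation", in which experts learn from one another during training, is commonly realized as a mutual (pairwise) KL-divergence loss between the experts' predicted distributions. The sketch below illustrates that general idea only; the function name, the averaging scheme, and the temperature are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis, numerically stabilized."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mutual_distillation_loss(expert_logits, temperature=2.0):
    """Average pairwise KL divergence so each expert is guided by the others.

    expert_logits: list of (batch, num_classes) arrays, one per expert.
    Hypothetical sketch of online (mutual) distillation among experts; a
    softer temperature (>1) exposes the relative similarities each expert
    has learned, which is what the peers distill from.
    """
    probs = [softmax(l, temperature) for l in expert_logits]
    k = len(probs)
    eps = 1e-12  # avoid log(0)
    loss = 0.0
    for i in range(k):
        for j in range(k):
            if i == j:
                continue
            # KL(p_j || p_i): expert i moves toward expert j's distribution
            loss += np.sum(probs[j] * (np.log(probs[j] + eps)
                                       - np.log(probs[i] + eps)))
    n_pairs = k * (k - 1)
    return loss / (n_pairs * expert_logits[0].shape[0])
```

When all experts agree the loss is zero; any disagreement yields a positive penalty, pulling the experts' predictions toward a consensus while each keeps its own balanced-training view of the long-tail data.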
Related papers
50 results
  • [1] Complementary expert balanced learning for long-tail cross-modal retrieval
    Liu, Peifang
    Liu, Xueliang
    MULTIMEDIA SYSTEMS, 2024, 30 (02)
  • [2] Long-Tail Cross Modal Hashing
    Gao, Zijun
    Wang, Jun
    Yu, Guoxian
    Yan, Zhongmin
    Domeniconi, Carlotta
    Zhang, Jinglin
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 6, 2023, : 7642 - 7650
  • [3] HCMSL: Hybrid Cross-modal Similarity Learning for Cross-modal Retrieval
    Zhang, Chengyuan
    Song, Jiayu
    Zhu, Xiaofeng
    Zhu, Lei
    Zhang, Shichao
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2021, 17 (01)
  • [4] Continual learning in cross-modal retrieval
    Wang, Kai
    Herranz, Luis
    van de Weijer, Joost
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021, : 3623 - 3633
  • [5] Learning DALTS for cross-modal retrieval
    Yu, Zheng
    Wang, Wenmin
    CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY, 2019, 4 (01) : 9 - 16
  • [6] Sequential Learning for Cross-modal Retrieval
    Song, Ge
    Tan, Xiaoyang
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 4531 - 4539
  • [7] Cross-Modal Retrieval Using Deep Learning
    Malik, Shaily
    Bhardwaj, Nikhil
    Bhardwaj, Rahul
    Kumar, Saurabh
    PROCEEDINGS OF THIRD DOCTORAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE, DOSCI 2022, 2023, 479 : 725 - 734
  • [8] Learning Cross-Modal Retrieval with Noisy Labels
    Hu, Peng
    Peng, Xi
    Zhu, Hongyuan
    Zhen, Liangli
    Lin, Jie
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 5399 - 5409
  • [9] Hybrid representation learning for cross-modal retrieval
    Cao, Wenming
    Lin, Qiubin
    He, Zhihai
    He, Zhiquan
    NEUROCOMPUTING, 2019, 345 : 45 - 57
  • [10] Multimodal Graph Learning for Cross-Modal Retrieval
    Xie, Jingyou
    Zhao, Zishuo
    Lin, Zhenzhou
    Shen, Ying
    PROCEEDINGS OF THE 2023 SIAM INTERNATIONAL CONFERENCE ON DATA MINING, SDM, 2023, : 145 - 153