Relation-Aware Distribution Representation Network for Person Clustering With Multiple Modalities

Cited by: 0
Authors
Liu, Kaijian [1]
Tang, Shixiang [2]
Li, Ziyue [3, 4]
Li, Zhishuai [1]
Bai, Lei [5]
Zhu, Feng [1]
Zhao, Rui [1, 6]
Affiliations
[1] SenseTime Res, Shanghai 200030, Peoples R China
[2] Univ Sydney, Sydney, NSW 2050, Australia
[3] Univ Cologne, D-50923 Cologne, Germany
[4] EWI gGmbH, D-50827 Cologne, Germany
[5] Shanghai AI Lab, Shanghai 200030, Peoples R China
[6] Shanghai Jiao Tong Univ, Qing Yuan Res Inst, Shanghai 200040, Peoples R China
Keywords
Faces; Feature extraction; Streaming media; Task analysis; Measurement; Semantics; Motion pictures; Person clustering; Multi-modality clues; Distribution learning; Multi-modal representations; AFFINITY GRAPH; MULTIVIEW; FACES; RANK;
DOI
10.1109/TMM.2023.3304454
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Person clustering with multi-modal clues, including faces, bodies, and voices, is critical for various tasks, such as movie parsing and identity-based movie editing. Related methods such as multi-view clustering mainly project multi-modal features into a joint feature space. However, multi-modal clue features are usually rather weakly correlated due to the semantic gap from the modality-specific uniqueness. As a result, these methods are not suitable for person clustering. In this article, we propose a Relation-Aware Distribution representation Network (RAD-Net) to generate a distribution representation for multi-modal clues. The distribution representation of a clue is a vector consisting of the relation between this clue and all other clues from all modalities, thus being modality agnostic and good for person clustering. Accordingly, we introduce a graph-based method to construct distribution representation and employ a cyclic update policy to refine distribution representation progressively. Our method achieves substantial improvements of +6% and +8.2% in F-score on the Video Person-Clustering Dataset (VPCD) and VoxCeleb2 multi-view clustering dataset, respectively.
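The idea of a distribution representation described in the abstract can be illustrated with a toy sketch. This is not RAD-Net itself: the abstract does not say how relations are computed or updated, so everything below is an assumption made for illustration. Same-modality relations are taken as clipped cosine similarity, cross-modality relations come from hypothetical co-occurrence links (for example, a face and a body seen in the same track), and one refinement round recomputes each relation as the cosine similarity of two clues' relation vectors. The names `cosine`, `initial_relations`, `refine`, `clues`, and `links` are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def initial_relations(clues, links):
    """Build an n x n relation matrix; row i is the representation of clue i.

    clues: list of (modality, feature_vector) pairs.
    links: set of frozenset({i, j}) pairs assumed to show the same person
           across modalities (an assumption of this sketch).
    """
    n = len(clues)
    rel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                rel[i][j] = 1.0
            elif clues[i][0] == clues[j][0]:
                # Same modality: features are comparable, so compare them directly.
                rel[i][j] = max(0.0, cosine(clues[i][1], clues[j][1]))
            elif frozenset((i, j)) in links:
                # Different modality: rely on the assumed co-occurrence link.
                rel[i][j] = 1.0
    return rel

def refine(rel, rounds=1):
    """Repeatedly replace each relation with the similarity of two relation rows."""
    n = len(rel)
    for _ in range(rounds):
        rel = [[cosine(rel[i], rel[j]) for j in range(n)] for i in range(n)]
    return rel

# Toy data: face features are 2-D and body features 3-D, so they cannot be
# compared directly, yet every clue gets a representation of the same length.
clues = [
    ("face", [1.0, 0.0]),       # clue 0
    ("face", [0.9, 0.1]),       # clue 1, visually close to clue 0
    ("body", [0.0, 1.0, 0.0]),  # clue 2, linked to clue 0
    ("face", [0.0, 1.0]),       # clue 3, a different-looking face
]
links = {frozenset((0, 2))}

rel0 = initial_relations(clues, links)
rel1 = refine(rel0, rounds=1)
print(rel0[1][2], rel1[1][2], rel1[3][2])
```

In this toy setup, clue 1 (a face) and clue 2 (a body) start with no relation, but after one refinement round they become related because both are strongly related to clue 0, while the dissimilar face (clue 3) stays unrelated to the body. Each row of the matrix has the same length regardless of modality, which is the sense in which such a representation is modality agnostic.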
Pages: 8371-8382
Number of pages: 12
Related papers
50 in total
  • [1] RelationTrack: Relation-Aware Multiple Object Tracking With Decoupled Representation
    Yu, En
    Li, Zhuoling
    Han, Shoudong
    Wang, Hongwei
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 2686 - 2697
  • [2] RAN: A Relation-aware Network for Relation Extraction
    Li, Yile
    Gu, Xiaoyan
    Yue, Yinliang
    Wang, Zhuo
    Li, Bo
    Wang, Weiping
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023
  • [3] Selective arguments representation with dual relation-aware network for video situation recognition
    Liu W.
    He Q.
    Wang C.
    Peng Y.
    Xie S.
    Neural Computing and Applications, 2024, 36 (17) : 9945 - 9961
  • [4] Relation-aware non-local attention network for person re-identification
    Li, Sujuan
    Xie, Gengsheng
    MULTIMEDIA SYSTEMS, 2025, 31 (01)
  • [5] Relation-aware aggregation network with auxiliary guidance for text-based person search
    Zeng, Pengpeng
    Jing, Shuaiqi
    Song, Jingkuan
    Fan, Kaixuan
    Li, Xiangpeng
    We, Liansuo
    Guo, Yuan
    World Wide Web, 2022, 25 (04) : 1565 - 1582
  • [6] Relation-aware aggregation network with auxiliary guidance for text-based person search
    Pengpeng Zeng
    Shuaiqi Jing
    Jingkuan Song
    Kaixuan Fan
    Xiangpeng Li
    Liansuo We
    Yuan Guo
    World Wide Web, 2022, 25 : 1565 - 1582
  • [7] Person Re-Identification Using Local Relation-Aware Graph Convolutional Network
    Lian, Yu
    Huang, Wenmin
    Liu, Shuang
    Guo, Peng
    Zhang, Zhong
    Durrani, Tariq S.
    SENSORS, 2023, 23 (19)
  • [8] Relation-aware aggregation network with auxiliary guidance for text-based person search
    Zeng, Pengpeng
    Jing, Shuaiqi
    Song, Jingkuan
    Fan, Kaixuan
    Li, Xiangpeng
    We, Liansuo
    Guo, Yuan
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2022, 25 (04) : 1565 - 1582
  • [9] A relation-aware representation approach for the question matching system
    Chen, Yanmin
    Chen, Enhong
    Zhang, Kun
    Liu, Qi
    Sun, Ruijun
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2024, 27 (02)
  • [10] Task Relation-aware Continual User Representation Learning
    Kim, Sein
    Lee, Namkyeong
    Kim, Donghyun
    Yang, Minchul
    Park, Chanyoung
    PROCEEDINGS OF THE 29TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2023, 2023 : 1107 - 1119