Multi-label video classification via coupling attentional multiple instance learning with label relation graph

Cited by: 12
Authors
Li, Xuewei [1]
Wu, Hongjun [1]
Li, Mengzhu [1]
Liu, Hongzhe [1]
Affiliations
[1] Beijing Union Univ, Beijing Key Lab Informat Serv Engn, Beijing 100101, Peoples R China
Keywords
Multi-label video classification; Multiple instance learning; Attentional feature learning; Label relation graph
DOI
10.1016/j.patrec.2022.01.003
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multi-label video classification is a challenging problem in the pattern recognition field, because it is difficult to localize where each of a huge number of labels occurs in a video. To solve this problem, we propose a general framework named MALL-CNN, i.e., Multi-Attention Label Relation Learning Convolutional Neural Network. MALL-CNN not only builds the correspondences between labels and videos through an attention mechanism, but also captures label co-occurrence through a graph learning approach. Specifically, we introduce multiple instance learning to aggregate a set of frame-level features into a video-level feature. The video-level feature is then mapped into content-aware category representations in an improved attentional manner. These representations are further enhanced by a series of label relation graphs, which transform global label relationships into the label relationships of the current video. Through these three processes, MALL-CNN achieves frame feature aggregation, video feature mapping, and label relationship construction for multi-label video classification. Extensive experiments on the real-world benchmark Youtube-8M verify that MALL-CNN, using only frame features, surpasses state-of-the-art methods that use multi-modal features as well as ensemble models. (c) 2022 Elsevier B.V. All rights reserved.
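The frame-to-video aggregation step described in the abstract can be illustrated with a generic attention-based multiple instance pooling. This is only a minimal sketch under stated assumptions, not the MALL-CNN implementation: the function names `softmax` and `attention_mil_pool`, the linear scoring vector `w`, and the toy two-dimensional features are all hypothetical, and the record does not specify the scoring function the authors actually use.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_mil_pool(frames, w):
    """Pool frame-level features (the 'instances') into one video-level vector.

    frames: list of equal-length feature lists, one per frame.
    w: hypothetical scoring vector (an assumption, not from the paper);
       each frame is scored by a dot product with w.
    Returns (video_feature, attention_weights).
    """
    # One scalar relevance score per frame.
    scores = [sum(wi * xi for wi, xi in zip(w, frame)) for frame in frames]
    # Normalize scores into attention weights that sum to 1.
    weights = softmax(scores)
    dim = len(frames[0])
    # Video-level feature is the attention-weighted sum of frame features.
    video = [sum(a * frame[d] for a, frame in zip(weights, frames))
             for d in range(dim)]
    return video, weights


if __name__ == "__main__":
    frames = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    video, weights = attention_mil_pool(frames, w=[2.0, 0.0])
    print("weights:", weights)
    print("video feature:", video)
```

With a zero scoring vector every frame receives the same weight, so the pooling reduces to plain averaging; a learned scoring function would instead concentrate weight on the frames most relevant to the labels.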
Pages: 53-59
Number of pages: 7