Adaptive Pooling in Multi-Instance Learning for Web Video Annotation

Cited by: 28
Authors
Zhou, Yizhou [1, 2]
Sun, Xiaoyan [2]
Liu, Dong [1]
Zha, Zhengjun [1]
Zeng, Wenjun [2]
Affiliations
[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China
[2] Microsoft Res Asia, Beijing, Peoples R China
Keywords
IMAGE;
DOI
10.1109/ICCVW.2017.46
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Web videos are usually weakly annotated, i.e., a tag is associated with a video once the corresponding concept appears in a frame of the video, without indicating when or where it occurs. Such weak tags cause considerable trouble for many Web video applications, e.g., search and recommendation. In this paper, we present a new Web video annotation approach based on multi-instance learning (MIL) with a learnable pooling function. By formulating Web video annotation as a MIL problem, we present an end-to-end deep network framework in which the frame (instance) level annotation is estimated from tags given at the video (bag of instances) level via a convolutional neural network (CNN). A learnable pooling function is proposed to adaptively fuse the outputs of the CNN to determine tags at the video level. We further propose a new loss function that consists of both bag-level and instance-level losses, which makes the penalty term aware of the internal state of the network rather than only of an overall loss, so that the pooling function is learned better and faster. Experimental results demonstrate that the proposed framework not only improves the accuracy of Web video annotation, outperforming state-of-the-art Web video annotation methods on the large-scale video dataset FCVID, but also helps to infer the most relevant frames in Web videos.
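The abstract does not spell out the pooling function or the loss, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a softmax-weighted pooling with a single parameter `alpha` (standing in for whatever the network would learn) and a hypothetical instance-level term weighted by `lam`; the names `adaptive_pool`, `combined_loss`, `alpha`, and `lam` are all illustrative.

```python
import math


def adaptive_pool(frame_probs, alpha):
    """Fuse per-frame (instance) probabilities into one video (bag) probability.

    Assumed form: a softmax-weighted average. alpha = 0 reduces to mean
    pooling, and a large alpha approaches max pooling, so a trainable alpha
    lets the pooling behaviour adapt between the two.
    """
    top = max(frame_probs)
    # Subtract the maximum before exponentiating for numerical stability.
    weights = [math.exp(alpha * (p - top)) for p in frame_probs]
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, frame_probs)) / total


def bce(prob, label, eps=1e-7):
    """Binary cross-entropy for a single probability and a 0/1 label."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def combined_loss(frame_probs, bag_label, alpha, lam=0.5):
    """Bag-level loss plus a hypothetical instance-level loss.

    Instance-level term (an assumption based on the standard MIL constraint):
    a negative bag pushes every frame toward 0, while a positive bag only
    requires its most confident frame to be close to 1.
    """
    bag_loss = bce(adaptive_pool(frame_probs, alpha), bag_label)
    if bag_label == 0:
        inst_loss = sum(bce(p, 0) for p in frame_probs) / len(frame_probs)
    else:
        inst_loss = bce(max(frame_probs), 1)
    return bag_loss + lam * inst_loss
```

With this form, `adaptive_pool([0.1, 0.9], 0.0)` behaves like mean pooling, while a large `alpha` puts almost all weight on the highest-scoring frame. The per-frame probabilities would come from the CNN, and the frames with the highest scores are the natural candidates for the most relevant frames.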
Pages: 318 - 327
Number of pages: 10
Related papers
50 in total
  • [31] Research on Ensemble Multi-Instance Learning
    Huang, Bo
    Cai, Zhihua
    Tao, Duoxiu
    Gu, Qiong
    PROGRESS IN INTELLIGENCE COMPUTATION AND APPLICATIONS, 2008, : 200 - 204
  • [32] Scalable Algorithms for Multi-Instance Learning
    Wei, Xiu-Shen
    Wu, Jianxin
    Zhou, Zhi-Hua
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2017, 28 (04) : 975 - 987
  • [33] Multi-Instance Learning with Incremental Classes
    Wei X.
    Xu S.
    An P.
    Yang J.
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2022, 59 (08) : 1723 - 1731
  • [34] Diversified dictionaries for multi-instance learning
    Qiao, Maoying
    Liu, Liu
    Yu, Jun
    Xu, Chang
    Tao, Dacheng
    PATTERN RECOGNITION, 2017, 64 : 407 - 416
  • [35] Multi-Instance Learning for Bankruptcy Prediction
    Kotsiantis, Sotiris
    Kanellopoulos, Dimitris
    THIRD 2008 INTERNATIONAL CONFERENCE ON CONVERGENCE AND HYBRID INFORMATION TECHNOLOGY, VOL 1, PROCEEDINGS, 2008, : 1007 - +
  • [36] Multi-instance clustering with applications to multi-instance prediction
    Min-Ling Zhang
    Zhi-Hua Zhou
    Applied Intelligence, 2009, 31 : 47 - 68
  • [37] CASCADE OF MULTI-LEVEL MULTI-INSTANCE CLASSIFIERS FOR IMAGE ANNOTATION
    Cam-Tu Nguyen
    Ha Vu Le
    Tokuyama, Takeshi
    KDIR 2011: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND INFORMATION RETRIEVAL, 2011, : 14 - 23
  • [38] Feature Selection in Multi-instance Learning
    Zhang, Chun-Hua
    Tan, Jun-Yan
    Deng, Nai-Yang
    OPERATIONS RESEARCH AND ITS APPLICATIONS, 2010, 12 : 462 - +
  • [39] Multi-instance clustering with applications to multi-instance prediction
    Zhang, Min-Ling
    Zhou, Zhi-Hua
    APPLIED INTELLIGENCE, 2009, 31 (01) : 47 - 68
  • [40] Perceiver Hopfield Pooling for Dynamic Multi-modal and Multi-instance Fusion
    Roessle, Dominik
    Cremers, Daniel
    Schoen, Torsten
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2022, PT I, 2022, 13529 : 599 - 610