Adaptive Pooling in Multi-Instance Learning for Web Video Annotation

Cited by: 28
Authors
Zhou, Yizhou [1, 2]
Sun, Xiaoyan [2]
Liu, Dong [1]
Zha, Zhengjun [1]
Zeng, Wenjun [2]
Affiliations
[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China
[2] Microsoft Res Asia, Beijing, Peoples R China
Keywords
IMAGE;
DOI
10.1109/ICCVW.2017.46
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Web videos are usually weakly annotated: a tag is associated with a video as soon as the corresponding concept appears in any frame of the video, without indicating when or where it occurs. Such weakly annotated tags cause significant problems for many Web video applications, e.g., search and recommendation. In this paper, we present a new Web video annotation approach based on multi-instance learning (MIL) with a learnable pooling function. By formulating Web video annotation as an MIL problem, we present an end-to-end deep network framework in which frame-level (instance-level) annotations are estimated, via a convolutional neural network (CNN), from tags given at the video (bag-of-instances) level. A learnable pooling function is proposed to adaptively fuse the outputs of the CNN and determine the tags at the video level. We further propose a new loss function that consists of both bag-level and instance-level losses. This makes the penalty term aware of the internal state of the network rather than of the overall loss alone, so that the pooling function is learned better and faster. Experimental results demonstrate that the proposed framework not only improves the accuracy of Web video annotation, outperforming state-of-the-art Web video annotation methods on the large-scale video dataset FCVID, but also helps to infer the most relevant frames in Web videos.
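The abstract does not give the form of the pooling function or of the loss, so the following is only an illustrative sketch of the general idea and not the authors' method. It assumes log-sum-exp pooling with a sharpness parameter `r` (a common MIL pooling that moves from mean pooling toward max pooling as `r` grows and that could be learned by gradient descent) and a hypothetical instance-level term built on the standard MIL assumption. The names `lse_pool`, `combined_loss`, `r`, and `alpha` are all introduced here for illustration.

```python
import math


def lse_pool(instance_probs, r):
    """Log-sum-exp pooling of per-frame probabilities (assumed pooling, r > 0).

    Small r behaves like mean pooling; large r behaves like max pooling.
    """
    n = len(instance_probs)
    m = max(instance_probs)
    # Subtract the maximum before exponentiating for numerical stability.
    total = sum(math.exp(r * (p - m)) for p in instance_probs)
    return m + math.log(total / n) / r


def bce(p, y, eps=1e-7):
    """Binary cross-entropy for a single probability p and label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def combined_loss(instance_probs, bag_label, r, alpha=0.5):
    """Bag-level loss plus a weighted instance-level loss (hypothetical form).

    Instance term, under the standard MIL assumption:
      - negative bag: every frame should be negative;
      - positive bag: at least one frame (here, the top-scoring one)
        should be positive.
    """
    bag_loss = bce(lse_pool(instance_probs, r), bag_label)
    if bag_label == 1:
        inst_loss = bce(max(instance_probs), 1)
    else:
        inst_loss = sum(bce(p, 0) for p in instance_probs) / len(instance_probs)
    return bag_loss + alpha * inst_loss


if __name__ == "__main__":
    frames = [0.1, 0.2, 0.9]  # made-up per-frame CNN outputs for one tag
    for r in (0.01, 1.0, 10.0, 100.0):
        print(r, round(lse_pool(frames, r), 4))
    print(combined_loss(frames, bag_label=1, r=10.0))
```

In a real network `r` would be a trainable parameter updated together with the CNN weights; here it is passed in by hand only to show how the pooled video-level score shifts between the mean and the maximum of the frame scores.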
Pages: 318-327
Page count: 10