Adaptive Pooling in Multi-Instance Learning for Web Video Annotation

Cited by: 28
Authors
Zhou, Yizhou [1, 2]
Sun, Xiaoyan [2 ]
Liu, Dong [1 ]
Zha, Zhengjun [1 ]
Zeng, Wenjun [2 ]
Affiliations
[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China
[2] Microsoft Res Asia, Beijing, Peoples R China
Keywords
IMAGE;
DOI
10.1109/ICCVW.2017.46
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Web videos are usually weakly annotated, i.e., a tag is associated with a video once the corresponding concept appears in a frame of the video, without indicating when and where it occurs. Such weak annotations cause considerable trouble for many Web video applications, e.g., search and recommendation. In this paper, we present a new Web video annotation approach based on multi-instance learning (MIL) with a learnable pooling function. By formulating Web video annotation as a MIL problem, we present an end-to-end deep network framework in which the frame (instance) level annotation is estimated from tags given at the video (bag of instances) level via a convolutional neural network (CNN). A learnable pooling function is proposed to adaptively fuse the outputs of the CNN to determine tags at the video level. We further propose a new loss function that consists of both bag-level and instance-level losses, which makes the penalty term aware of the internal state of the network rather than only an overall loss, so that the pooling function is learned better and faster. Experimental results demonstrate that the proposed framework not only improves the accuracy of Web video annotation, outperforming state-of-the-art Web video annotation methods on the large-scale video dataset FCVID, but also helps to infer the most relevant frames in Web videos.
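The abstract does not specify the pooling function or the exact form of the loss, so the following is only a minimal sketch of the general idea under stated assumptions. It uses log-sum-exp pooling as a stand-in for the learnable pooling function (its sharpness parameter r moves the result between the mean and the max of the frame scores), and a hypothetical instance-level term that applies only to negative videos, where every frame is assumed negative. The names lse_pool, mil_loss, r, and lam are illustrative and do not come from the paper.

```python
import math


def lse_pool(probs, r):
    """Log-sum-exp pooling of per-frame probabilities (assumed stand-in).

    r > 0 controls sharpness: small r approaches the mean, large r the max.
    In a real network r would be a trainable parameter.
    """
    m = max(probs)
    # Subtract the max before exponentiating for numerical stability.
    s = sum(math.exp(r * (p - m)) for p in probs) / len(probs)
    return m + math.log(s) / r


def bce(p, y, eps=1e-7):
    """Binary cross-entropy for a single probability p and label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def mil_loss(frame_probs, bag_label, r, lam=0.5):
    """Bag-level loss plus a weighted instance-level loss (hypothetical form)."""
    bag_term = bce(lse_pool(frame_probs, r), bag_label)
    if bag_label == 0:
        # Under the standard MIL assumption, a negative bag has no positive frames.
        inst_term = sum(bce(p, 0) for p in frame_probs) / len(frame_probs)
    else:
        # Frame labels inside a positive bag are unknown, so no direct penalty here.
        inst_term = 0.0
    return bag_term + lam * inst_term


frame_probs = [0.05, 0.10, 0.90, 0.20]
print(lse_pool(frame_probs, r=0.01))   # near the mean of the frame scores
print(lse_pool(frame_probs, r=100.0))  # near the max of the frame scores
print(mil_loss(frame_probs, bag_label=1, r=10.0))
```

With a large r the video-level score is dominated by the highest-scoring frame, which is also how such a model can point to the most relevant frames; with a small r every frame contributes about equally.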
Pages: 318-327
Number of pages: 10
Related papers
50 in total
  • [21] Drosophila Gene Expression Pattern Annotation through Multi-Instance Multi-Label Learning
    Li, Ying-Xin
    Ji, Shuiwang
    Kumar, Sudhir
    Ye, Jieping
    Zhou, Zhi-Hua
    21ST INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI-09), PROCEEDINGS, 2009, : 1445 - 1450
  • [22] Drosophila Gene Expression Pattern Annotation through Multi-Instance Multi-Label Learning
    Li, Ying-Xin
    Ji, Shuiwang
    Kumar, Sudhir
    Ye, Jieping
    Zhou, Zhi-Hua
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2012, 9 (01) : 98 - 112
  • [23] Automated skin biopsy histopathological image annotation using multi-instance representation and learning
    Gang Zhang
    Jian Yin
    Ziping Li
    Xiangyang Su
    Guozheng Li
    Honglai Zhang
    BMC Medical Genomics, 6
  • [24] Multi-instance multi-label learning
    Zhou, Zhi-Hua
    Zhang, Min-Ling
    Huang, Sheng-Jun
    Li, Yu-Feng
    ARTIFICIAL INTELLIGENCE, 2012, 176 (01) : 2291 - 2320
  • [25] Automated skin biopsy histopathological image annotation using multi-instance representation and learning
    Zhang, Gang
    Yin, Jian
    Li, Ziping
    Su, Xiangyang
    Li, Guozheng
    Zhang, Honglai
    BMC MEDICAL GENOMICS, 2013, 6
  • [26] SALE: Self-adaptive LSH encoding for multi-instance learning
    Xu, Dongkuan
    Wu, Jia
    Li, Dewei
    Tian, Yingjie
    Zhu, Xingquan
    Wu, Xindong
    PATTERN RECOGNITION, 2017, 71 : 460 - 482
  • [27] Multi-Instance Learning with Distribution Change
    Zhang, Wei-Jia
    Zhou, Zhi-Hua
    PROCEEDINGS OF THE TWENTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2014, : 2184 - 2190
  • [28] Feature selection in multi-instance learning
    Rui Gan
    Jian Yin
    Neural Computing and Applications, 2013, 23 : 907 - 912
  • [29] Regularized Instance Embedding for Deep Multi-Instance Learning
    Lin, Yi
    Zhang, Honggang
    APPLIED SCIENCES-BASEL, 2020, 10 (01):
  • [30] Multi-Instance Nonparallel Tube Learning
    Xiao, Yanshan
    Liu, Bo
    Hao, Zhifeng
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2025, 36 (02) : 2563 - 2577