Automatic Annotation of Human Actions in Video

Cited by: 105
Authors
Duchenne, Olivier [1]
Laptev, Ivan [1]
Sivic, Josef [1]
Bach, Francis [1]
Ponce, Jean [1]
Affiliations
[1] INRIA, Ecole Normale Super, Paris, France
DOI
10.1109/ICCV.2009.5459279
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper addresses the problem of automatic temporal annotation of realistic human actions in video using minimal manual supervision. To this end, we consider two associated problems: (a) weakly-supervised learning of action models from readily available annotations, and (b) temporal localization of human actions in test videos. To avoid the prohibitive cost of manual annotation for training, we use movie scripts as a means of weak supervision. Scripts, however, provide only implicit, noisy, and imprecise information about the type and location of actions in video. We address this problem with a kernel-based discriminative clustering algorithm that locates actions in the weakly-labeled training data. Using the obtained action samples, we train temporal action detectors and apply them to locate actions in the raw video data. Our experiments demonstrate that the proposed method for weakly-supervised learning of action models leads to significant improvement in action detection. We present detection results for three action classes in four feature-length movies with challenging and realistic video data.
Pages: 1491-1498
Page count: 8