Editing like Humans: A Contextual, Multimodal Framework for Automated Video Editing

Cited by: 3
Authors
Koorathota, Sharath [1, 2]
Adelman, Patrick [2 ]
Cotton, Kelly [3 ]
Sajda, Paul [1 ]
Affiliations
[1] Columbia Univ, Dept Biomed Engn, New York, NY 10027 USA
[2] Fovea Inc, New York, NY 10001 USA
[3] CUNY, Grad Ctr, Dept Psychol, New York, NY USA
DOI
10.1109/CVPRW53098.2021.00186
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose an automated video editing model, which we term contextual and multimodal video editing (CMVE). The model leverages visual and textual metadata describing videos, integrating essential information from both modalities, and uses an editing style learned from a single example video to coherently combine clips. The editing model is useful for tasks such as generating news clip montages and highlight reels given a text query that describes the video storyline. The model exploits the perceptual similarity between video frames, objects in videos, and text descriptions to emulate coherent video editing. Amazon Mechanical Turk participants made judgements comparing CMVE to expert human editing. Experimental results showed no significant difference between CMVE and human-edited videos in terms of matching the text query and the level of interest each generates, suggesting that CMVE is able to effectively integrate semantic information across visual and textual modalities and create perceptually coherent videos of the quality typical of human video editors. We publicly release an online demonstration of our method.
Pages: 1701-1709
Page count: 9