Joint Summarization of Large-scale Collections of Web Images and Videos for Storyline Reconstruction

Cited by: 78

Authors:

Kim, Gunhee [1]

Sigal, Leonid [1]

Xing, Eric P. [2]

Affiliations:

[1] Disney Research Pittsburgh, Pittsburgh, PA 15213 USA

[2] Carnegie Mellon University, Pittsburgh, PA 15213 USA

Source:

2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2014

Keywords:

LASSO;

DOI:

10.1109/CVPR.2014.538

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we address the problem of jointly summarizing large sets of Flickr images and YouTube videos. Starting from the intuition that the characteristics of the two media types are different yet complementary, we develop a fast and easily parallelizable approach for creating not only high-quality video summaries but also novel structural summaries of online images as storyline graphs. The storyline graphs can illustrate various events or activities associated with the topic in the form of a branching network. The video summarization is achieved by diversity ranking on the similarity graphs between images and video frames. The reconstruction of storyline graphs is formulated as the inference of sparse time-varying directed graphs from a set of photo streams with the assistance of videos. For evaluation, we collect datasets of 20 outdoor activities, consisting of 2.7M Flickr images and 16K YouTube videos. Due to the large-scale nature of our problem, we evaluate our algorithm via crowdsourcing using Amazon Mechanical Turk. In our experiments, we demonstrate that the proposed joint summarization approach outperforms other baselines and our own methods using videos or images only.
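The abstract mentions diversity ranking on a similarity graph without giving the ranking rule, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a greedy selection that rewards items well connected to the rest of the graph and penalizes items similar to ones already chosen. The function name `diversity_rank`, the `trade_off` parameter, and the toy similarity matrix are all hypothetical.

```python
def diversity_rank(sim, k, trade_off=0.5):
    """Greedily pick k node indices that are central yet mutually dissimilar.

    sim is a symmetric similarity matrix (list of lists); this scoring rule
    is an illustrative assumption, not the method used in the paper.
    """
    n = len(sim)
    # Centrality: total similarity of each node to every other node.
    centrality = [sum(sim[i][j] for j in range(n) if j != i) for i in range(n)]
    top = max(centrality) or 1.0  # avoid division by zero on empty graphs
    selected = []
    candidates = set(range(n))
    while candidates and len(selected) < k:
        def score(i):
            # Redundancy: highest similarity to anything already selected.
            redundancy = max((sim[i][j] for j in selected), default=0.0)
            return trade_off * centrality[i] / top - (1 - trade_off) * redundancy
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy graph: nodes 0-2 form one tight cluster, nodes 3-4 form another.
sim = [
    [1.0, 0.9, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.9, 0.1, 0.1],
    [0.9, 0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.1, 0.9, 1.0],
]
summary = diversity_rank(sim, 2)
```

On this toy graph the first pick comes from the larger cluster and the second from the other cluster, which is the behavior a summary needs: each selected frame should cover a different event rather than repeat the most popular one.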


Pages: 4225-4232

Page count: 8
