Beyond audio and video retrieval: topic-oriented multimedia summarization

Cited by: 13
Authors
Metze, Florian [1]
Ding, Duo [1]
Younessian, Ehsan [1]
Hauptmann, Alexander [1]
Affiliations
[1] Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213 USA
Funding
US National Science Foundation
Keywords
Multimedia summarization; Event detection and recounting; Natural language generation;
DOI
10.1007/s13735-012-0028-y
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Given the deluge of multimedia content that is becoming available over the Internet, it is increasingly important to be able to examine and organize these large stores of information effectively, in ways that go beyond browsing or collaborative filtering. In this paper, we review previous work on audio and video processing, and define the task of topic-oriented multimedia summarization (TOMS) using natural language generation (NLG): given a set of automatically extracted features from a video, a TOMS system automatically generates a paragraph of natural language that summarizes the important information in a video belonging to a certain topic and, for example, explains why the video was matched and retrieved. Possible features include visual semantic concepts, objects, and actions, environmental sounds, and transcripts from automatic speech recognition (ASR). We see this as a first step towards systems that will be able to discriminate between visually similar but semantically different videos, compare two videos and provide textual output, or summarize a large number of videos at once. We then introduce our approach to solving the TOMS problem: we extract various visual concept features, environmental sounds, and ASR transcription features from a given video, and develop a template-based NLG system that produces a textual recounting based on the extracted features. We also propose possible experimental designs for continuously evaluating and improving TOMS systems, and present results of a pilot evaluation of our initial system.
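To make the idea of template-based recounting concrete, the following is a minimal sketch and not the authors' system. It assumes detections arrive as (modality, label, confidence) triples; the `Detection` class, the `TEMPLATES` wording, the `recount` function, and the threshold and top-k values are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One automatically extracted feature (hypothetical representation)."""
    modality: str  # "visual", "audio", or "asr"
    label: str     # human-readable name of the concept, sound, or phrase
    score: float   # detector confidence in [0, 1]


# One clause template per modality; the wording is an assumption.
TEMPLATES = {
    "visual": "we saw {items}",
    "audio": "we heard {items}",
    "asr": "the speech mentions {items}",
}


def join_items(items):
    """Join labels as 'a', 'a and b', or 'a, b and c'."""
    if len(items) == 1:
        return items[0]
    return ", ".join(items[:-1]) + " and " + items[-1]


def recount(topic, detections, threshold=0.5, top_k=3):
    """Fill the templates with the most confident detections per modality."""
    clauses = []
    for modality, template in TEMPLATES.items():
        kept = sorted(
            (d for d in detections
             if d.modality == modality and d.score >= threshold),
            key=lambda d: d.score,
            reverse=True,
        )[:top_k]
        if kept:
            labels = [d.label for d in kept]
            clauses.append(template.format(items=join_items(labels)))
    if not clauses:
        return f"No strong evidence was found for the topic '{topic}'."
    return (f"This video was matched to the topic '{topic}' because "
            + "; ".join(clauses) + ".")


if __name__ == "__main__":
    example = [
        Detection("visual", "a cake", 0.9),
        Detection("visual", "candles", 0.8),
        Detection("audio", "singing", 0.7),
        Detection("asr", "happy birthday", 0.3),  # below threshold, dropped
    ]
    print(recount("birthday party", example))
```

Low-confidence detections are dropped before the templates are filled, so the generated sentence only cites evidence the detectors are reasonably sure of. A real system would need topic-specific templates and a more careful selection of which evidence to report.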
Pages: 131-144
Page count: 14
Related papers
50 items in total
  • [41] Topic-oriented community detection of rating-based social networks
    Reihanian, Ali
    Minaei-Bidgoli, Behrouz
    Alizadeh, Hosein
    JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2016, 28 (03) : 303 - 310
  • [42] TOSOM: A topic-oriented self-organizing map for text organization
    Yang, Hsin-Chang
    Lee, Chung-Hong
    Ke, Kuo-Lung
    World Academy of Science, Engineering and Technology, 2010, 65 : 1100 - 1104
  • [43] On the Usability of Clustering for Topic-oriented Multi-level Security Models
    Engelstad, Paal E.
    UKSIM-AMSS NINTH IEEE EUROPEAN MODELLING SYMPOSIUM ON COMPUTER MODELLING AND SIMULATION (EMS 2015), 2015, : 14 - 20
  • [44] TOSOM: A topic-oriented self-organizing map for text organization
    Yang, Hsin-Chang
    Lee, Chung-Hong
    Ke, Kuo-Lung
    World Academy of Science, Engineering and Technology, 2010, 41 : 1100 - 1104
  • [45] RATE-COVERAGE ANALYSIS AND OPTIMIZATION FOR JOINT AUDIO-VIDEO MULTIMEDIA RETRIEVAL
    Ning, Guanghan
    Zhang, Zhi
    Ren, Xiaobo
    Wang, Haohong
    He, Zhihai
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 2911 - 2915
  • [46] Integrated Personalized Video Summarization and Retrieval
    Shafeian, Hessamoddin
    Bhanu, Bir
    2012 21ST INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR 2012), 2012, : 996 - 999
  • [47] Incorporating User Constraints into Topic-Oriented Self-Organizing Maps
    Yang, Hsin-Chang
    Lee, Chung-Hong
    Wu, Chun-Yen
    2013 IEEE SYMPOSIUM ON FOUNDATIONS OF COMPUTATIONAL INTELLIGENCE (FOCI), 2013, : 91 - 97
  • [48] Path prediction of information diffusion based on a topic-oriented relationship strength network
    Zhu, Hengmin
    Yang, Xinyi
    Wei, Jing
    INFORMATION SCIENCES, 2023, 631 : 108 - 119
  • [49] THE RA-SCHOOL - CONCEPTION, IMPLEMENTATION AND EVALUATION OF TOPIC-ORIENTED PATIENT SEMINARS
    MATTUSSEK, S
    ZEITSCHRIFT FUR RHEUMATOLOGIE, 1988, 47 (04): : 271 - 271
  • [50] Style-Oriented Landmark Retrieval and Summarization
    Chang, Wei-Yi
    Yeh, Yi-Ren
    Wang, Yu-Chiang Frank
    2016 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2016,