Improving multimedia retrieval with a video OCR

Authors
Das, Dipanjan [1]
Chen, Datong [2]
Hauptmann, Alexander G. [1]
Affiliations
[1] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
[2] Carnegie Mellon Univ, Comp Sci Dept, Pittsburgh, PA 15213 USA
Keywords
video OCR; OCR; multimedia retrieval; video retrieval; optical character recognition
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a set of experiments with a video OCR system (VOCR) tailored for video information retrieval and establish its importance in multimedia search in general and for some specific queries in particular. The system, inspired by existing work on text detection and recognition in images, has been developed using techniques that involve detailed analysis of video frames to produce candidate text regions. The text regions are then binarized and sent to a commercial OCR engine, resulting in ASCII text, which is finally used to create search indexes. The system is evaluated using the TRECVID data. We compare the system's performance from an information retrieval perspective with another VOCR system, developed using multi-frame integration, and empirically demonstrate that deep analysis of individual video frames results in better video retrieval. We also evaluate the effect of various textual sources on multimedia retrieval by combining the VOCR outputs with automatic speech recognition (ASR) transcripts. For general search queries, the VOCR system coupled with ASR sources outperforms the other system by a wide margin. For search queries that involve named entities, especially people's names, the VOCR system even outperforms speech transcripts, demonstrating that source selection for particular query types is essential.
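As a rough illustration of the last stage described in the abstract (recognized text used to create search indexes, with VOCR and ASR sources combined at query time), the following is a minimal sketch only. It is not the authors' implementation: the function names, the boolean-AND matching, the union-based combination, and the sample shot texts are all assumptions made for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(shot_texts):
    """Build an inverted index: token -> set of shot ids containing it."""
    index = defaultdict(set)
    for shot_id, text in shot_texts.items():
        for token in tokenize(text):
            index[token].add(shot_id)
    return index


def search(index, query):
    """Return the shot ids that contain every query token (boolean AND)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical per-shot text from the two sources (invented for this sketch).
vocr_text = {"shot1": "JOHN SMITH Reporting", "shot2": "WEATHER FORECAST"}
asr_text = {"shot1": "good evening here is the news",
            "shot3": "john smith joins us"}

vocr_index = build_index(vocr_text)
asr_index = build_index(asr_text)


def search_combined(query):
    """Assumed combination rule: union of the hits from both sources."""
    return search(vocr_index, query) | search(asr_index, query)
```

In this toy setup, a person-name query such as "john smith" matches the on-screen caption in one shot and the spoken transcript in another, so keeping the two indexes separate leaves room to weight or select a source per query type.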
Pages: 12