Improving multimedia retrieval with a video OCR

Cited by: 0
Authors
Das, Dipanjan [1]
Chen, Datong [2]
Hauptmann, Alexander G. [1]
Affiliations
[1] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
[2] Carnegie Mellon Univ, Comp Sci Dept, Pittsburgh, PA 15213 USA
Keywords
video OCR; OCR; multimedia retrieval; video retrieval; optical character recognition
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a set of experiments with a video OCR system (VOCR) tailored for video information retrieval and establish its importance in multimedia search in general and for some specific queries in particular. The system, inspired by existing work on text detection and recognition in images, has been developed using techniques that involve detailed analysis of video frames to produce candidate text regions. The text regions are then binarized and sent to a commercial OCR engine, resulting in ASCII text that is finally used to create search indexes. The system is evaluated using the TRECVID data. We compare the system's performance from an information retrieval perspective with another VOCR developed using multi-frame integration, and empirically demonstrate that deep analysis of individual video frames results in better video retrieval. We also evaluate the effect of various textual sources on multimedia retrieval by combining the VOCR outputs with automatic speech recognition (ASR) transcripts. For general search queries, the VOCR system coupled with ASR sources outperforms the other system by a very large margin. For search queries that involve named entities, especially people's names, the VOCR system even outperforms speech transcripts, demonstrating that source selection for particular query types is essential.
Pages: 12
Related papers
50 in total
  • [21] Development and Application of Tennis Match Video Retrieval Technology in Multimedia Education
    Cao, Shehua
    COMPUTER AND COMPUTING TECHNOLOGIES IN AGRICULTURE IV, PT 4, 2011, 347 : 192 - 197
  • [22] Trial Realization of Human-Centered Multimedia Navigation for Video Retrieval
    Haseyama, Miki
    Ogawa, Takahiro
    INTERNATIONAL JOURNAL OF HUMAN-COMPUTER INTERACTION, 2013, 29 (02) : 96 - 109
  • [23] Improving Automatic Video Retrieval with Semantic Concept Detection
    Koskela, Markus
    Sjoberg, Mats
    Laaksonen, Jorma
    IMAGE ANALYSIS, PROCEEDINGS, 2009, 5575 : 480 - 489
  • [24] Improving Video Retrieval Using Multilingual Knowledge Transfer
    Madasu, Avinash
    Aflalo, Estelle
    Stan, Gabriela Ben Melech
    Tseng, Shao-Yen
    Bertasius, Gedas
    Lal, Vasudev
    ADVANCES IN INFORMATION RETRIEVAL, ECIR 2023, PT I, 2023, 13980 : 669 - 684
  • [25] Improving multimedia streaming with content-aware video scaling
    Tripathi, A
    Claypool, M
    PROCEEDINGS OF THE 6TH JOINT CONFERENCE ON INFORMATION SCIENCES, 2002, : 1021 - 1024
  • [26] Evaluation framework for video OCR
    Soundararajan, Padmanabhan
    Boonstra, Matthew
    Manohar, Vasant
    Korzhova, Valentina
    Goldgof, Dmitry
    Kasturi, Rangachar
    Prasad, Shubha
    Raju, Harish
    Bowers, Rachel
    Garofolo, John
    COMPUTER VISION, GRAPHICS AND IMAGE PROCESSING, PROCEEDINGS, 2006, 4338 : 829 - +
  • [27] Image and video retrieval from a user-centered mobile multimedia perspective
    Boll, S
    IMAGE AND VIDEO RETRIEVAL, PROCEEDINGS, 2005, 3568 : 18 - 27
  • [28] Combining Boolean and Multimedia Retrieval in vitrivr for Large-Scale Video Search
    Sauter, Loris
    Parian, Mahnaz Amiri
    Gasser, Ralph
    Heller, Silvan
    Rossetto, Luca
    Schuldt, Heiko
    MULTIMEDIA MODELING (MMM 2020), PT II, 2020, 11962 : 760 - 765
  • [29] Improving Youtube video retrieval by integrating crowdsourced timed metadata
    Pinto, Jose Pedro
    Viana, Paula
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2019, 37 (06) : 7207 - 7221
  • [30] Improving Video Corpus Moment Retrieval with Partial Relevance Enhancement
    Hou, Danyang
    Pang, Liang
    Shen, Huawei
    Cheng, Xueqi
    PROCEEDINGS OF THE 4TH ANNUAL ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2024, 2024, : 394 - 403