Retrieving Instructional Video Content from Speech and Text Information

Cited by: 3
Authors
Kothawade, Ashwini Y. [1 ]
Patil, Dipak R. [1 ]
Affiliations
[1] Amrutvahini Coll Engn, Dept Informat Technol, Sangamner, India
Keywords
OCR; ASR; Video content retrieval; Instructional videos; e-Learning; Tele-lecture; Tesseract OCR; Video lectures indexing;
DOI
10.1007/978-981-10-0755-2_33
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Learning from video lectures is becoming increasingly popular with today's generation because of its considerable advantages and easier availability compared with classroom learning. Consequently, many institutes and organizations are adopting this method for teaching and learning, and an enormous amount of data is generated in the form of video lectures. Extracting the desired information from the desired video within the vast amount of video available on the internet therefore becomes difficult. In this paper, we use techniques for automatically retrieving information from video files and collecting it as metadata for those files. For efficient retrieval of text from videos, we use an OCR (Optical Character Recognition) tool to extract text from slides and an ASR (Automatic Speech Recognition) tool to recognize information from the speaker's speech. First, we segment and classify the video frames to identify the key frames. Then the OCR and ASR tools are used to extract information from the video slides and the audio speech, respectively. The collected data can be stored as metadata for the file. Finally, the search can be made more efficient by applying clustering and ontology concepts.
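The pipeline outlined in the abstract (key-frame selection, text collection, metadata storage, search) can be illustrated with a minimal sketch. It is not the authors' implementation: the frame-difference threshold, the function names, and the record layout are all hypothetical, and the OCR and ASR outputs are assumed to be already available as plain strings rather than produced by a real tool such as Tesseract.

```python
from collections import defaultdict


def select_key_frames(frames, threshold=0.2):
    """Return indices of frames that differ enough from the last key frame.

    frames: list of equal-length lists of pixel intensities in [0, 255].
    threshold: assumed cutoff on the normalized mean absolute difference.
    """
    key_indices = []
    last_key = None
    for i, frame in enumerate(frames):
        if last_key is None:
            # The first frame is always treated as a key frame.
            key_indices.append(i)
            last_key = frame
            continue
        diff = sum(abs(a - b) for a, b in zip(frame, last_key)) / (255 * len(frame))
        if diff > threshold:
            key_indices.append(i)
            last_key = frame
    return key_indices


def build_metadata(video_id, slide_texts, transcript):
    """Combine slide text (OCR output) and transcript (ASR output) into one record."""
    words = set()
    for text in slide_texts + [transcript]:
        for token in text.lower().split():
            token = token.strip(".,;:!?()")
            if token:
                words.add(token)
    return {
        "video_id": video_id,
        "slide_text": slide_texts,
        "transcript": transcript,
        "keywords": sorted(words),
    }


def build_index(records):
    """Build an inverted index mapping each keyword to the videos containing it."""
    index = defaultdict(set)
    for record in records:
        for word in record["keywords"]:
            index[word].add(record["video_id"])
    return index


def search(index, query):
    """Return the ids of videos whose metadata contains every query term."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result
```

In this sketch, a query is answered from the stored metadata alone, so the video files themselves are never re-read at search time. The clustering and ontology steps mentioned in the abstract are not covered here.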
Pages: 311-322
Page count: 12
Related papers
50 in total
  • [21] Will Video Kill the Text Content Star?
    Affelt, Amy
    ECONTENT, 2016, 39 (08) : 10 - 10
  • [22] Video Retrieval Using Speech and Text InVideo
    Radha, N.
    2016 INTERNATIONAL CONFERENCE ON INVENTIVE COMPUTATION TECHNOLOGIES (ICICT), VOL 2, 2016, : 115 - 119
  • [23] From image to text to speech: the effects of speech prosody on information sequencing in audio description
    Hirvonen, Maija
    Wiklund, Mari
    TEXT & TALK, 2021, 41 (03) : 309 - 334
  • [24] Distributed video documents indexing and content-based retrieving
    Mostefaoui, A
    Favory, L
    PROTOCOLS AND SYSTEMS FOR INTERACTIVE DISTRIBUTED MULTIMEDIA, PROCEEDINGS, 2002, 2515 : 190 - 201
  • [25] Effects of information search tasks on the comprehension of instructional text
    Rouet, JF
    Vidal-Abarca, E
    Erboul, AB
    Millogo, V
    DISCOURSE PROCESSES, 2001, 31 (02) : 163 - 186
  • [26] Detecting Concealed Information in Text and Speech
    Hu, Shengli
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 402 - 412
  • [27] THE INFORMATION CONTENT OF DEMODULATED SPEECH
    Sell, Gregory
    Slaney, Malcolm
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 5470 - 5473
  • [28] Integrated Coarse-Fine Scheme for Text Information Extraction from Instructional Videos
    Hamad, Ahmed
    El-Ghonaimy, Said
    Soliman, Taysir
    Afifi, Marwa
    PROCEEDINGS OF 2009 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND COMPUTING (IACSIT ICMLC 2009), 2009, : 559 - 566
  • [29] Video2Text: Learning to Annotate Video Content
    Aradhye, Hrishikesh
    Toderici, George
    Yagnik, Jay
    2009 IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2009), 2009, : 144 - 151
  • [30] Method for Retrieving Digital Agricultural Text Information Based on Local Matching
    Song, Yue
    Wang, Minjuan
    Gao, Wanlin
    SYMMETRY-BASEL, 2020, 12 (07):