Summarizing Lecture Videos by Key Handwritten Content Regions

Cited by: 9
Authors
Kota, Bhargava Urala [1 ]
Ahmed, Saleem [1 ]
Stone, Alexander [1 ]
Davila, Kenny [1 ]
Setlur, Srirangaraj [1 ]
Govindaraju, Venu [1 ]
Affiliations
[1] SUNY BUFFALO, Dept Comp Sci & Engn, Buffalo, NY 14260 USA
Funding
National Science Foundation (USA);
Keywords
RECOGNITION;
DOI
10.1109/ICDARW.2019.30058
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We introduce a novel method for summarizing whiteboard lecture videos using key handwritten content regions. A deep neural network detects bounding boxes that contain semantically meaningful groups of handwritten content. A neural network embedding is learned from the detected regions, under triplet loss, to discriminate between unique handwritten content. The detected regions and their embeddings at every frame of the lecture video are used to extract the unique handwritten content across the video, which is presented as the video summary. Additionally, a spatiotemporal index is constructed that records the time and location of each summary region in the video; this index can potentially be used for content-based search and navigation. We train and test our methods on the publicly available AccessMath dataset. We use the DetEval scheme to benchmark our summarization by recall of unique ground truth objects (92.09%) and by the average number of summary regions (128) compared to the ground truth (88).
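The abstract does not give implementation details, so the following is only a minimal sketch of two ideas it names: a triplet loss over region embeddings, and a spatiotemporal index that merges repeated detections of the same content. The Euclidean distance, the `margin` and `threshold` values, the greedy matching rule, and the names `triplet_loss`, `SummaryRegion`, and `build_index` are all illustrative assumptions, not the authors' method.

```python
import math
from dataclasses import dataclass


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on Euclidean distances (assumed form)."""
    d_pos = math.dist(anchor, positive)  # same content: should be small
    d_neg = math.dist(anchor, negative)  # different content: should be large
    return max(0.0, d_pos - d_neg + margin)


@dataclass
class SummaryRegion:
    embedding: tuple   # embedding of the first detection of this content
    box: tuple         # (x1, y1, x2, y2) location on the whiteboard
    first_time: float  # first time (seconds) the content was detected
    last_time: float   # last time (seconds) the content was detected


def build_index(detections, threshold=0.5):
    """Greedily merge detections whose embeddings are closer than threshold.

    detections: iterable of (time_sec, box, embedding) in time order.
    The threshold and the greedy rule are hypothetical choices.
    """
    index = []
    for time_sec, box, emb in detections:
        match = next(
            (r for r in index if math.dist(r.embedding, emb) < threshold),
            None,
        )
        if match is None:
            # New unique content: start a summary region for it.
            index.append(SummaryRegion(tuple(emb), box, time_sec, time_sec))
        else:
            # Content seen before: only extend its time span.
            match.last_time = time_sec
    return index


detections = [
    (0.0, (0, 0, 10, 10), (0.0, 0.0)),
    (1.0, (0, 0, 10, 10), (0.05, 0.0)),  # near-duplicate of the first region
    (2.0, (20, 0, 30, 10), (1.0, 1.0)),  # different content
]
index = build_index(detections)
```

Each entry in `index` records where and when one piece of content appeared, which is the kind of record a content-based search or navigation feature could query.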
Pages: 13-18
Page count: 6
Related papers
50 in total
  • [21] User and Device Adaptation in Summarizing Sports Videos
    Nitta, Naoko
    Babaguchi, Noboru
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2009, E92D (06): : 1280 - 1288
  • [22] Audio-video based character recognition for handwritten mathematical content in classroom videos
    Vemulapalli, Smita
    Hayes, Monson
    INTEGRATED COMPUTER-AIDED ENGINEERING, 2014, 21 (03) : 219 - 234
  • [23] TVSum: Summarizing Web Videos Using Titles
    Song, Yale
    Vallmitjana, Jordi
    Stent, Amanda
    Jaimes, Alejandro
    2015 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2015, : 5179 - 5187
  • [24] An Unsupervised Method for Summarizing Egocentric Sport Videos
    Habibi Aghdam, Hamed
    Jahani Heravi, Elnaz
    Puig, Domenec
    EIGHTH INTERNATIONAL CONFERENCE ON MACHINE VISION (ICMV 2015), 2015, 9875
  • [25] Key frame extraction method for lecture videos based on spatio-temporal subtitles
    Zhang, Yunzuo
    Li, Yi
    Cai, Zhaoquan
    Wang, Xuejun
    Zhang, Jiayu
    Lam, Shui
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 83 (2) : 5437 - 5450
  • [26] Key frame extraction method for lecture videos based on spatio-temporal subtitles
    Yunzuo Zhang
    Yi Li
    Zhaoquan Cai
    Xuejun Wang
    Jiayu Zhang
    Shui Lam
    Multimedia Tools and Applications, 2024, 83 : 5437 - 5450
  • [27] Summarizing videos into a target language: Methodology, architectures and evaluation
    Smaili, Kamel
    Fohr, Dominique
    Gonzalez-Gallardo, Carlos-Emiliano
    Grega, Michal
    Janowski, Lucjan
    Jouvet, Denis
    Kozbial, Arian
    Langlois, David
    Leszczuk, Mikolaj
    Mella, Odile
    Menacer, Mohamed-Amine
    Mendez, Amaia
    Pontes, Elvys Linhares
    SanJuan, Eric
    Torres-Moreno, Juan-Manuel
    Garcia-Zapirain, Begona
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2019, 37 (06) : 7415 - 7426
  • [28] An efficient technique for summarizing videos using visual contents
    Oh, J
    Hua, KA
    2000 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, PROCEEDINGS VOLS I-III, 2000, : 1167 - 1170
  • [29] Together Recognizing, Localizing and Summarizing Actions in Egocentric Videos
    Sahu, Abhimanyu
    Chowdhury, Ananda S.
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 4330 - 4340
  • [30] Using Live Tags for Summarizing Surgical Videos Collaboratively
    Haddad, Alexandre
    Bailly, Gilles
    Choussy, Olivier
    Taouachi, Rabah
    Avellino, Ignacio
    EXTENDED ABSTRACTS OF THE 2024 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS, CHI 2024, 2024,