A speaker identification system for video content analysis

Authors
Bi, Jing [1]
Liu, Shu-Chang [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing 100088, Peoples R China
DOI
10.1109/IIH-MSP.2008.215
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A growing body of literature proposes applying audio content analysis techniques to content-based video parsing. This paper presents our current work on a speaker identification system for video content analysis. The system differs from conventional ones in three respects: first, the soundtrack extracted from a video stream contains not only silence and speech but also music and environmental sound; second, the number of speakers in the video content is not known in advance; third, noise in the video can significantly degrade system performance. To address these considerations, our speaker identification system comprises the following basic parts: audio classification and segmentation using rule-based and Support Vector Machine (SVM) based classifiers; speech clustering using spectral clustering, and speaker identification based on Gaussian Mixture Models (GMMs); and speech enhancement based on spectral subtraction. Experiments are carried out on a database extracted from news, conversation, and movie videos. The results confirm the validity of the proposed system architecture.
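The GMM-based identification step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state which features, how many mixture components, or which training procedure are used, so the function names, the diagonal-covariance assumption, and the toy two-dimensional models below are all hypothetical. The sketch only shows the general idea of scoring a speech segment against each speaker's GMM and choosing the model with the highest average log-likelihood.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, gmm):
    """Log-likelihood of x under a GMM given as a list of (weight, mean, var)."""
    logs = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    peak = max(logs)
    # Log-sum-exp keeps the mixture sum numerically stable.
    return peak + math.log(sum(math.exp(l - peak) for l in logs))

def identify_speaker(frames, models):
    """Return (best_speaker, scores), where scores holds the average
    per-frame log-likelihood of the segment under each speaker model."""
    scores = {
        name: sum(gmm_log_likelihood(f, gmm) for f in frames) / len(frames)
        for name, gmm in models.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical toy models: two speakers, two mixture components each.
models = {
    "speaker_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "speaker_b": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 6.0], [1.0, 1.0])],
}
# Hypothetical feature frames from one clustered speech segment.
segment = [[5.2, 4.9], [5.8, 6.1], [4.7, 5.3]]
best, scores = identify_speaker(segment, models)
print(best)
```

In a full system of the kind the abstract outlines, the frames would come from speech segments that have already been separated from music and environmental sound, grouped by speaker, and enhanced; the model parameters would be estimated from training speech rather than written by hand.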
Pages: 200-203
Number of pages: 4
Related papers
50 in total
  • [1] Automatic audio classification and speaker identification for video content analysis
    Liu, Shu-Chang
    Bi, Jing
    Jia, Zhi-Qiang
    Chen, Rui
    Chen, Jie
    Zhou, Min-Min
    SNPD 2007: EIGHTH ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING, AND PARALLEL/DISTRIBUTED COMPUTING, VOL 2, PROCEEDINGS, 2007, : 91 - +
  • [2] Speaker Identification for the Analysis of Joint Attention in Video
    Gonzalez Contreras, Carlos Eduardo
    De-la-Torre, Miguel
    Gonzalez Becerra, Victor Hugo
    Avila-George, Himer
    Hernandez Palacio, Raul
    2019 8TH INTERNATIONAL CONFERENCE ON SOFTWARE PROCESS IMPROVEMENT (CIMPS), 2019,
  • [3] Speaker identification and video analysis for hierarchical video shot classification
    Nam, JH
    Cetin, AE
    Tewfik, AH
    INTERNATIONAL CONFERENCE ON IMAGE PROCESSING - PROCEEDINGS, VOL II, 1997, : 550 - 553
  • [4] Video classification using speaker identification
    Patel, NV
    Sethi, IK
    STORAGE AND RETRIEVAL FOR IMAGE AND VIDEO DATABASES V, 1997, 3022 : 218 - 225
  • [5] Adaptive speaker identification with audiovisual cues for movie content analysis
    Li, Y
    Narayanan, SS
    Kuo, CCJ
    PATTERN RECOGNITION LETTERS, 2004, 25 (07) : 777 - 791
  • [6] Online Video Content Analysis System
    Oyucu, Saadin
    Polat, Huseyin
    2018 2ND INTERNATIONAL SYMPOSIUM ON MULTIDISCIPLINARY STUDIES AND INNOVATIVE TECHNOLOGIES (ISMSIT), 2018, : 27 - 31
  • [7] Performance Analysis of Text-Independent Speaker Identification System
    Sekar, K.
    INTERNATIONAL CONFERENCE ON MODELLING OPTIMIZATION AND COMPUTING, 2012, 38 : 1925 - 1934
  • [8] Performance Analysis of Text-Independent Speaker Identification System
    Sekar, K.
    INTERNATIONAL CONFERENCE ON MODELLING OPTIMIZATION AND COMPUTING, 2012, 38 : 1735 - 1744
  • [9] Multimodal speaker identification with audio-video processing
    Yemez, Y
    Kanak, A
    Erzin, E
    Tekalp, AM
    2003 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOL 3, PROCEEDINGS, 2003, : 5 - 8
  • [10] Video editing support system based on video grammar and content analysis
    Kumano, M
    Ariki, Y
    Amano, M
    Uehara, K
    Shunto, K
    Tsukada, K
    16TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL II, PROCEEDINGS, 2002, : 1031 - 1036