Audio content analysis for online audiovisual data segmentation and classification

Cited by: 235
Authors
Zhang, T [1]
Kuo, CCJ [1]
Affiliation
[1] Univ So Calif, Integrated Media Syst Ctr, Los Angeles, CA 90089 USA
Keywords
audio analysis; audio indexing; audio segmentation; audiovisual content parsing; information filtering and retrieval; multimedia database management
DOI
10.1109/89.917689
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
While current approaches to audiovisual data segmentation and classification focus mostly on visual cues, audio signals may actually play a more important role in content parsing for many applications. An approach to automatic segmentation and classification of audiovisual data based on audio content analysis is proposed. The audio signal from movies or TV programs is segmented and classified into basic types such as speech, music, song, environmental sound, speech with music background, environmental sound with music background, silence, etc. Simple audio features, including the energy function, the average zero-crossing rate, the fundamental frequency, and the spectral peak tracks, are extracted to ensure the feasibility of real-time processing. A heuristic rule-based procedure is proposed to segment and classify audio signals; it is built upon morphological and statistical analysis of the time-varying functions of these audio features. Experimental results show that the proposed scheme achieves an accuracy rate of more than 90% in audio classification.
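As a rough illustration of two of the features named in the abstract, the sketch below computes a short-time energy function and a zero-crossing rate per frame. It uses generic textbook definitions, not the paper's exact formulation; the frame length, hop size, sample rate, and all function names are assumptions chosen for the example.

```python
import math

def frames(signal, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def extract_features(signal, frame_len=256, hop=128):
    """Return one (energy, zero-crossing rate) pair per frame.

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    return [(short_time_energy(f), zero_crossing_rate(f))
            for f in frames(signal, frame_len, hop)]

# Hypothetical test signals: a 200 Hz tone and digital silence at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(1024)]
silence = [0.0] * 1024

tone_features = extract_features(tone)
silence_features = extract_features(silence)
```

Both features need only one pass over the samples of each frame, which is in line with the abstract's emphasis on simple features for real-time processing. A rule-based classifier of the kind the abstract describes would then operate on how these per-frame values vary over time.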
Pages: 441-457
Number of pages: 17