Pattern classification models for classifying and indexing audio signals

Cited by: 9
Authors
Dhanalakshmi, P. [1]
Palanivel, S. [1]
Ramalingam, V. [1]
Affiliation
[1] Annamalai Univ, Dept Comp Sci & Engn, Annamalainagar 608002, Tamil Nadu, India
Keywords
Autoassociative neural network; Gaussian mixture models; Linear predictive coefficients; Linear predictive cepstral coefficients; Mel-frequency cepstral coefficients; Audio indexing; k-Means clustering; Segmentation
DOI
10.1016/j.engappai.2010.10.011
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In the age of digital information, audio data has become an important part of many modern computer applications. Audio classification and indexing have become a focus of research in audio processing and pattern recognition. In this paper, we propose effective algorithms to automatically classify audio clips into one of six classes: music, news, sports, advertisement, cartoon and movie. For these categories, a number of acoustic features, including linear predictive coefficients (LPC), linear predictive cepstral coefficients (LPCC) and mel-frequency cepstral coefficients (MFCC), are extracted to characterize the audio content. The autoassociative neural network (AANN) model is used to capture the distribution of the acoustic feature vectors. The proposed method then uses a Gaussian mixture model (GMM)-based classifier, in which the feature vectors from each class are used to train the GMM for that class. During testing, the likelihood of a test sample under each model is computed, and the sample is assigned to the class whose model produces the highest likelihood. Audio clip extraction, feature extraction, index creation and retrieval of the query clip are the major issues in automatic audio indexing and retrieval. A method for indexing the classified audio using LPCC features and the k-means clustering algorithm is proposed. (C) 2010 Elsevier Ltd. All rights reserved.
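The maximum-likelihood decision rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the diagonal covariances, the summing of frame log-likelihoods over a clip (which treats frames as independent), the function names and the toy model parameters are all assumptions made for illustration. Training the per-class models is not shown.

```python
import math


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM.

    Diagonal covariance is an assumption of this sketch, not a detail
    taken from the paper.
    """
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        component_logs.append(log_p)
    # Log-sum-exp over mixture components for numerical stability.
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))


def classify_clip(frames, class_models):
    """Score a clip against every class model and return the best label.

    Frame log-likelihoods are summed, i.e. frames are assumed independent.
    """
    scores = {
        label: sum(gmm_log_likelihood(f, *params) for f in frames)
        for label, params in class_models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy models: (weights, means, variances) per class,
# with 2-D features and two mixture components each.
toy_models = {
    "music": ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "news": ([0.5, 0.5], [[5.0, 5.0], [6.0, 6.0]], [[1.0, 1.0], [1.0, 1.0]]),
}
toy_frames = [[0.2, -0.1], [0.9, 1.1], [0.5, 0.4]]

label, scores = classify_clip(toy_frames, toy_models)
print(label)
```

In practice the feature vectors would be LPC, LPCC or MFCC frames and each class model would be fitted to training data from that class; the toy numbers here only exercise the decision rule.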
Pages: 350-357
Number of pages: 8