Real-Time monophonic and polyphonic audio classification from power spectra

Cited by: 7
Authors
Baelde, Maxime [1, 2]
Biernacki, Christophe [2]
Greff, Raphael [1]
Affiliations
[1] A Volute, 19 Rue Ladrie, F-59491 Villeneuve d'Ascq, France
[2] Univ Lille, INRIA, Modal team, CNRS, UMR 8524, Lab Paul Painleve, F-59000 Lille, France
Keywords
Real-time; Audio classification; Machine learning; Monophonic; Polyphonic; Generative model; Nonparametric estimation; MODEL;
DOI
10.1016/j.patcog.2019.03.017
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This work addresses the recurring challenge of real-time monophonic and polyphonic audio source classification. The proposed process operates directly on the whole normalized power spectrum (NPS), avoiding complex and risky traditional feature extraction. The NPS is also a natural candidate for polyphonic events thanks to its additive property in such cases. Classification is performed through nonparametric kernel-based generative modeling of the power spectrum. The advantage of this model is twofold: it is almost hypothesis-free, and it yields the maximum a posteriori classification rule for online signals in a straightforward way. Moreover, it uses the monophonic dataset to build the polyphonic one. To reach the real-time target, the complexity of the method can then be tuned through a standard hierarchical clustering preprocessing of the prototypes, which gives a particularly efficient trade-off between computation time and classification accuracy. The proposed method, called RARE (Real-time Audio Recognition Engine), shows encouraging results in both monophonic and polyphonic classification tasks on benchmark and proprietary datasets, including in the targeted real-time setting. In particular, it offers several advantages over state-of-the-art methods: reduced training time, no feature extraction, control over the computation-accuracy trade-off, and no need to train on already mixed sounds for polyphonic classification. (C) 2019 Elsevier Ltd. All rights reserved.
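The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' RARE implementation: the Gaussian kernel, the `bandwidth` value, the fixed mixing `weight`, and all function names below are assumptions chosen for illustration, and the hierarchical clustering step that reduces the number of prototypes is omitted. The sketch only shows the general idea of computing a normalized power spectrum, scoring it against class prototypes with a kernel density estimate, applying a maximum a posteriori rule, and building polyphonic prototypes from monophonic ones.

```python
import cmath
import math


def normalized_power_spectrum(frame):
    """One-sided power spectrum of a real frame, scaled to sum to 1 (naive DFT)."""
    n = len(frame)
    power = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        power.append(abs(coeff) ** 2)
    total = sum(power)
    if total == 0.0:  # silent frame: fall back to a flat spectrum
        return [1.0 / len(power)] * len(power)
    return [p / total for p in power]


def kernel(nps, prototype, bandwidth):
    """Gaussian kernel on Euclidean distance (an assumption, not the paper's kernel)."""
    d2 = sum((a - b) ** 2 for a, b in zip(nps, prototype))
    return math.exp(-d2 / (2.0 * bandwidth ** 2))


def class_likelihood(nps, prototypes, bandwidth):
    """Nonparametric density estimate: average kernel value over the class prototypes."""
    return sum(kernel(nps, p, bandwidth) for p in prototypes) / len(prototypes)


def classify(nps, prototypes_by_class, priors, bandwidth=0.1):
    """Maximum a posteriori rule: pick the class maximizing prior * likelihood."""
    scores = {label: priors[label] * class_likelihood(nps, protos, bandwidth)
              for label, protos in prototypes_by_class.items()}
    return max(scores, key=scores.get)


def mix_prototypes(protos_a, protos_b, weight=0.5):
    """Polyphonic prototypes as convex combinations of monophonic spectra.

    Relies on the approximate additivity of power spectra; the fixed
    mixing weight is a simplifying assumption.
    """
    return [[weight * a + (1.0 - weight) * b for a, b in zip(pa, pb)]
            for pa in protos_a for pb in protos_b]


def sine_frame(bin_index, n=64, phase=0.0):
    """Toy signal: a sine wave centered exactly on one DFT bin."""
    return [math.sin(2.0 * math.pi * bin_index * t / n + phase) for t in range(n)]
```

In this toy setup, the monophonic classes could each hold the spectrum of a `sine_frame` at a different bin, and a third class built with `mix_prototypes` would then represent their mixture without any mixed recording being used for training.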
Pages: 82-92
Number of pages: 11