Real-Time monophonic and polyphonic audio classification from power spectra

Cited by: 7
Authors
Baelde, Maxime [1, 2]
Biernacki, Christophe [2 ]
Greff, Raphael [1 ]
Affiliations
[1] A Volute, 19 Rue Ladrie, F-59491 Villeneuve d'Ascq, France
[2] Univ Lille, INRIA, Modal team, CNRS, UMR 8524, Lab Paul Painleve, F-59000 Lille, France
Keywords
Real-time; Audio classification; Machine learning; Monophonic; Polyphonic; Generative model; Nonparametric estimation; MODEL;
DOI
10.1016/j.patcog.2019.03.017
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This work addresses the recurring challenge of real-time monophonic and polyphonic audio source classification. The proposed process operates directly on the whole normalized power spectrum (NPS), avoiding complex and error-prone traditional feature extraction. The NPS is also a natural candidate for polyphonic events because of its additive property when sources are mixed. Classification is performed through nonparametric kernel-based generative modeling of the power spectrum. The advantage of this model is twofold: it is almost hypothesis-free, and it yields the maximum a posteriori classification rule for online signals in a straightforward way. Moreover, it uses the monophonic dataset to build the polyphonic one. Then, to reach the real-time target, the complexity of the method can be tuned by preprocessing the prototypes with standard hierarchical clustering, which gives a particularly efficient trade-off between computation time and classification accuracy. The proposed method, called RARE (Real-time Audio Recognition Engine), shows encouraging results in both monophonic and polyphonic classification tasks on benchmark and proprietary datasets, including the targeted real-time situation. In particular, the method has several advantages over state-of-the-art methods: reduced training time, no feature extraction, control over the computation-accuracy trade-off, and no training on already mixed sounds for polyphonic classification. (C) 2019 Elsevier Ltd. All rights reserved.
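The pipeline described in the abstract can be illustrated with a small sketch. It is not the authors' RARE implementation: the Gaussian kernel, the bandwidth value, the uniform class priors, the equal-power mixing weight, and all function names below are assumptions made for illustration, and the hierarchical clustering step is left out. The sketch computes an NPS per frame, scores each class with a kernel density estimate over its prototypes, and builds polyphonic prototypes from monophonic ones by convex combination of their spectra.

```python
import numpy as np

SAMPLE_RATE = 8000  # Hz (hypothetical value)
FRAME_LEN = 512     # samples per frame (hypothetical value)


def normalized_power_spectrum(frame):
    """Power spectrum of one frame, scaled so that its bins sum to 1."""
    power = np.abs(np.fft.rfft(frame)) ** 2
    return power / (power.sum() + 1e-12)


def log_class_scores(nps, prototypes, bandwidth=0.05):
    """Log kernel density of `nps` under each class.

    Assumption: an isotropic Gaussian kernel centred on every prototype.
    prototypes maps a class label to an array of shape (n_prototypes, n_bins).
    """
    scores = {}
    for label, protos in prototypes.items():
        sq_dist = ((protos - nps) ** 2).sum(axis=1)
        log_k = -sq_dist / (2.0 * bandwidth ** 2)
        peak = log_k.max()  # log-mean-exp for numerical stability
        scores[label] = peak + np.log(np.exp(log_k - peak).mean())
    return scores


def map_classify(nps, prototypes, bandwidth=0.05):
    """MAP rule under uniform priors (assumed): pick the highest class density."""
    scores = log_class_scores(nps, prototypes, bandwidth)
    return max(scores, key=scores.get)


def mix_prototypes(protos_a, protos_b, weight=0.5):
    """Polyphonic prototypes as convex combinations of all monophonic NPS pairs."""
    mixed = weight * protos_a[:, None, :] + (1.0 - weight) * protos_b[None, :, :]
    return mixed.reshape(-1, protos_a.shape[1])


def tone(freq_hz, phase, rng, noise=0.01):
    """Synthetic test signal: a sine wave plus a little white noise."""
    t = np.arange(FRAME_LEN) / SAMPLE_RATE
    clean = np.sin(2 * np.pi * freq_hz * t + phase)
    return clean + noise * rng.standard_normal(FRAME_LEN)


rng = np.random.default_rng(0)
phases = np.linspace(0.0, np.pi, 5)
prototypes = {
    "low": np.array([normalized_power_spectrum(tone(500.0, p, rng)) for p in phases]),
    "high": np.array([normalized_power_spectrum(tone(1500.0, p, rng)) for p in phases]),
}
# The polyphonic class is built only from monophonic prototypes.
prototypes["low+high"] = mix_prototypes(prototypes["low"], prototypes["high"])

mono_frame = tone(500.0, 1.0, rng)
poly_frame = tone(500.0, 0.3, rng) + tone(1500.0, 2.0, rng)
print(map_classify(normalized_power_spectrum(mono_frame), prototypes))
print(map_classify(normalized_power_spectrum(poly_frame), prototypes))
```

In this toy setting the two tones occupy different frequency bins, so the power spectrum of their sum equals the sum of their power spectra and the mixed prototypes match the polyphonic frame closely. Real sources overlap in frequency, where additivity holds only approximately.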
Pages: 82-92
Number of pages: 11