Real-Time monophonic and polyphonic audio classification from power spectra

Cited by: 7
Authors
Baelde, Maxime [1 ,2 ]
Biernacki, Christophe [2 ]
Greff, Raphael [1 ]
Affiliations
[1] A Volute, 19 Rue Ladrie, F-59491 Villeneuve d'Ascq, France
[2] Univ Lille, INRIA, Modal team, CNRS, UMR 8524, Lab Paul Painlevé, F-59000 Lille, France
Keywords
Real-time; Audio classification; Machine learning; Monophonic; Polyphonic; Generative model; Nonparametric estimation; MODEL;
DOI
10.1016/j.patcog.2019.03.017
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This work addresses the recurring challenge of real-time monophonic and polyphonic audio source classification. The whole normalized power spectrum (NPS) is used directly in the proposed process, avoiding complex and risky traditional feature extraction. The NPS is also a natural candidate for polyphonic events thanks to its additive property in such cases. The classification task is performed through nonparametric kernel-based generative modeling of the power spectrum. The advantage of this model is twofold: it is almost hypothesis-free, and it yields the maximum a posteriori classification rule for online signals in a straightforward way. Moreover, it uses the monophonic dataset to build the polyphonic one. Then, to reach the real-time target, the complexity of the method can be tuned by a standard hierarchical clustering preprocessing of the prototypes, which gives a particularly efficient trade-off between computation time and classification accuracy. The proposed method, called RARE (for Real-time Audio Recognition Engine), shows encouraging results in both monophonic and polyphonic classification tasks on benchmark and proprietary datasets, including the targeted real-time situation. In particular, the method has several advantages over state-of-the-art methods: reduced training time, no feature extraction, control over the computation-accuracy trade-off, and no training on already mixed sounds for polyphonic classification. (C) 2019 Elsevier Ltd. All rights reserved.
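The pipeline the abstract describes (NPS input, kernel-based generative model, maximum a posteriori rule, polyphonic data built from monophonic data) can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the paper: the Gaussian kernel, the `bandwidth` value, class priors estimated from prototype counts, and the function names `normalized_power_spectrum`, `log_class_scores`, `classify`, and `mix_nps` are all hypothetical. It is not the RARE implementation, and the hierarchical clustering step is left out.

```python
import numpy as np

def normalized_power_spectrum(frame):
    """Power spectrum of one audio frame, scaled so that its bins sum to 1."""
    power = np.abs(np.fft.rfft(frame)) ** 2
    return power / max(power.sum(), 1e-12)

def log_class_scores(x, prototypes, labels, n_classes, bandwidth=0.05):
    """Log of (class prior * kernel density estimate at x) for each class.

    Assumption: an isotropic Gaussian kernel centered on every stored
    prototype; the abstract does not say which kernel the paper uses.
    """
    sq_dist = np.sum((prototypes - x) ** 2, axis=1)
    log_k = -sq_dist / (2.0 * bandwidth ** 2)
    scores = np.full(n_classes, -np.inf)
    for c in range(n_classes):
        in_c = labels == c
        if in_c.any():
            # Prior n_c/n times the class average (1/n_c) * sum(K) equals
            # sum(K) / n; computed with log-sum-exp for numerical stability.
            m = log_k[in_c].max()
            scores[c] = m + np.log(np.exp(log_k[in_c] - m).sum()) - np.log(len(labels))
    return scores

def classify(x, prototypes, labels, n_classes, bandwidth=0.05):
    """Maximum a posteriori class index for the NPS vector x."""
    return int(np.argmax(log_class_scores(x, prototypes, labels, n_classes, bandwidth)))

def mix_nps(nps_a, nps_b, weight_a=0.5):
    """Synthetic polyphonic NPS as a convex combination of two monophonic NPS.

    Relies on power spectra being approximately additive for uncorrelated
    sources; weight_a is the assumed share of source A in the total energy.
    """
    return weight_a * nps_a + (1.0 - weight_a) * nps_b
```

In this sketch a polyphonic class would get its prototypes by applying `mix_nps` to pairs of monophonic prototypes, so no recordings of already mixed sounds are needed, and the number of prototypes kept per class is the knob that trades computation time against accuracy.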
Pages: 82 - 92
Number of pages: 11