Real-time monophonic and polyphonic audio classification from power spectra

Cited by: 7
Authors
Baelde, Maxime [1, 2]
Biernacki, Christophe [2]
Greff, Raphael [1]
Affiliations
[1] A Volute, 19 Rue Ladrie, F-59491 Villeneuve Dascq, France
[2] Univ Lille, INRIA, Modal team, CNRS, UMR 8524, Lab Paul Painleve, F-59000 Lille, France
Keywords
Real-time; Audio classification; Machine learning; Monophonic; Polyphonic; Generative model; Nonparametric estimation; MODEL;
DOI
10.1016/j.patcog.2019.03.017
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104; 0812; 0835; 1405
Abstract
This work addresses the recurring challenge of real-time monophonic and polyphonic audio source classification. The proposed process uses the whole normalized power spectrum (NPS) directly, avoiding the complex and error-prone feature extraction of traditional approaches. The NPS is also a natural candidate for polyphonic events thanks to its additive property in such cases. Classification is performed through nonparametric kernel-based generative modeling of the power spectrum. The advantage of this model is twofold: it is almost hypothesis-free, and it yields the maximum a posteriori classification rule for online signals in a straightforward way. Moreover, it uses the monophonic dataset to build the polyphonic one. To reach the real-time target, the complexity of the method can then be tuned with a standard hierarchical clustering preprocessing of the prototypes, which gives a particularly efficient trade-off between computation time and classification accuracy. The proposed method, called RARE (Real-time Audio Recognition Engine), shows encouraging results in both monophonic and polyphonic classification tasks on benchmark and proprietary datasets, including in the targeted real-time situation. In particular, it has several advantages over state-of-the-art methods: reduced training time, no feature extraction, control over the computation-accuracy trade-off, and no training on already mixed sounds for polyphonic classification. (C) 2019 Elsevier Ltd. All rights reserved.
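The pipeline outlined in the abstract can be illustrated with a small sketch. It is not the RARE implementation: the abstract does not specify the kernel, the bandwidth, or how polyphonic prototypes are built, so the Gaussian kernel, the `bandwidth` value, the equal-energy mixing in `mix_prototypes`, and the synthetic tones are all assumptions made for illustration. The hierarchical clustering step that reduces the number of prototypes is omitted.

```python
import numpy as np

FRAME_LEN = 256      # samples per frame (illustrative choice)
SAMPLE_RATE = 8000   # Hz (illustrative choice)

def normalized_power_spectrum(frame):
    """Power spectrum of one real-valued frame, scaled to sum to 1."""
    power = np.abs(np.fft.rfft(frame)) ** 2
    return power / (power.sum() + 1e-12)

def log_kernel_score(x, prototypes, bandwidth):
    """Log of an unnormalized Gaussian kernel density estimate at x.

    The Gaussian kernel is an assumption. Its normalizing constant is
    dropped because it is identical for every class here.
    """
    sq_dist = ((prototypes - x) ** 2).sum(axis=1)
    log_k = -sq_dist / (2.0 * bandwidth ** 2)
    peak = log_k.max()  # log-sum-exp trick for numerical stability
    return peak + np.log(np.exp(log_k - peak).mean())

def map_classify(x, prototypes_by_class, priors, bandwidth=0.05):
    """Maximum a posteriori label: argmax of log prior + log density."""
    scores = {
        label: np.log(priors[label]) + log_kernel_score(x, protos, bandwidth)
        for label, protos in prototypes_by_class.items()
    }
    return max(scores, key=scores.get)

def mix_prototypes(protos_a, protos_b, weight=0.5):
    """Hypothetical polyphonic prototypes built from monophonic ones.

    Relies on power spectra being approximately additive for
    uncorrelated sources; `weight` is the assumed energy share of a.
    """
    return weight * protos_a + (1.0 - weight) * protos_b

def tone_frame(freqs, rng, noise=0.05):
    """Synthetic frame: sum of sine tones with random phase plus noise."""
    t = np.arange(FRAME_LEN) / SAMPLE_RATE
    frame = sum(np.sin(2 * np.pi * f * t + rng.uniform(0, 2 * np.pi))
                for f in freqs)
    return frame + noise * rng.standard_normal(FRAME_LEN)

rng = np.random.default_rng(0)
low = np.array([normalized_power_spectrum(tone_frame([440], rng))
                for _ in range(20)])
high = np.array([normalized_power_spectrum(tone_frame([1000], rng))
                 for _ in range(20)])

# The polyphonic class is derived from monophonic prototypes only.
prototypes = {"low": low, "high": high, "low+high": mix_prototypes(low, high)}
priors = {label: 1.0 / len(prototypes) for label in prototypes}

mono = normalized_power_spectrum(tone_frame([1000], rng))
poly = normalized_power_spectrum(tone_frame([440, 1000], rng))
print(map_classify(mono, prototypes, priors))
print(map_classify(poly, prototypes, priors))
```

In this toy setup no mixed recording is used to build the `low+high` class, which mirrors the abstract's point that polyphonic classification needs no training on already mixed sounds.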
Pages: 82-92
Page count: 11