Overlapping sound event recognition using local spectrogram features and the generalised hough transform

Cited by: 51
Authors
Dennis, J. [1, 2]
Tran, H. D. [1]
Chng, E. S. [2]
Affiliations
[1] Inst Infocomm Res, Singapore 138632, Singapore
[2] Nanyang Technol Univ, Sch Comp Engn, Singapore 639798, Singapore
Keywords
Overlapping sound event recognition; Local spectrogram features; Keypoint detection; Generalised Hough Transform; AUTOMATIC SPEECH RECOGNITION; SCALE; NOISE;
DOI
10.1016/j.patrec.2013.02.015
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we address the challenging task of simultaneous recognition of overlapping sound events from single-channel audio. Conventional frame-based methods are not well suited to the problem, as each time frame contains a mixture of information from multiple sources. Missing feature masks are able to improve the recognition in such cases, but are limited by the accuracy of the mask, which is a non-trivial problem. In this paper, we propose an approach based on Local Spectrogram Features (LSFs) which represent local spectral information that is extracted from the two-dimensional region surrounding "keypoints" detected in the spectrogram. The keypoints are designed to locate the sparse, discriminative peaks in the spectrogram, such that we can model sound events through a set of representative LSF clusters and their occurrences in the spectrogram. To recognise overlapping sound events, we use a Generalised Hough Transform (GHT) voting system, which sums the information over many independent keypoints to produce onset hypotheses that can detect any arbitrary combination of sound events in the spectrogram. Each hypothesis is then scored against the class distribution models to recognise the existence of the sound in the spectrogram. Experiments on a set of five overlapping sound events, in the presence of non-stationary background noise, demonstrate the potential of our approach. © 2013 Elsevier B.V. All rights reserved.
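The voting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and makes several simplifying assumptions: keypoints are taken to be strict local maxima in a 3x3 neighbourhood, the LSF cluster codebook is replaced by a hypothetical lookup keyed on frequency bin alone, and the final scoring of each hypothesis against class distribution models is omitted. The names `detect_keypoints`, `vote_onsets`, `hypotheses`, and `codebook` are illustrative only.

```python
from collections import defaultdict


def detect_keypoints(spec, threshold):
    """Return (t, f) cells that are strict local maxima at or above threshold.

    spec[f][t] is a non-negative magnitude (assumed layout: rows are
    frequency bins, columns are time frames).
    """
    n_f, n_t = len(spec), len(spec[0])
    keypoints = []
    for f in range(n_f):
        for t in range(n_t):
            value = spec[f][t]
            if value < threshold:
                continue
            # Compare against the (up to) 8 surrounding cells.
            neighbours = [
                spec[ff][tt]
                for ff in range(max(0, f - 1), min(n_f, f + 2))
                for tt in range(max(0, t - 1), min(n_t, t + 2))
                if (ff, tt) != (f, t)
            ]
            if all(value > n for n in neighbours):
                keypoints.append((t, f))
    return keypoints


def vote_onsets(keypoints, codebook):
    """Accumulate Hough-style votes for (class, onset_frame) pairs.

    codebook maps a frequency bin to a list of (class_name, offset) pairs,
    where offset is the number of frames between the event onset and the
    keypoint. This stands in for the LSF cluster matching (an assumption).
    """
    accumulator = defaultdict(int)
    for t, f in keypoints:
        for class_name, offset in codebook.get(f, []):
            accumulator[(class_name, t - offset)] += 1
    return accumulator


def hypotheses(accumulator, min_votes):
    """Keep the (class, onset_frame) cells that collected enough votes."""
    return sorted(k for k, votes in accumulator.items() if votes >= min_votes)
```

Because every keypoint votes independently, two events that share time frames each collect their own votes in separate accumulator cells, which is why overlapping events can be hypothesised together in this toy setting.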
Pages: 1085-1093
Page count: 9
Related papers
50 in total
  • [41] Recognition of Sheep Feeding Behavior in Sheepfolds Using Fusion Spectrogram Depth Features and Acoustic Features
    Yu, Youxin
    Zhu, Wenbo
    Ma, Xiaoli
    Du, Jialei
    Liu, Yu
    Gan, Linhui
    An, Xiaoping
    Li, Honghui
    Wang, Buyu
    Fu, Xueliang
    ANIMALS, 2024, 14 (22):
  • [42] Cracking automation recognition of cement pavement based on Hough Transform and geometrical features of connected component
    Liu, FanFan
    Xu, GuoAi
    Yang, YiXian
    Niu, XinXin
    Pan, YuLi
    2008 PROCEEDINGS OF INFORMATION TECHNOLOGY AND ENVIRONMENTAL SYSTEM SCIENCES: ITESS 2008, VOL 3, 2008, : 859 - 863
  • [43] Rotation invariant image recognition using Hough transform and support vector machines
    Ruiz-Pinales, Jose
    Acosta-Reyes, Juan Jorge
    Jaime-Rivas, Rene
    Salazar-Garibay, Adan
    MEP 2006: PROCEEDINGS OF MULTICONFERENCE ON ELECTRONICS AND PHOTONICS, 2006, : 196 - +
  • [44] Multifont Arabic character recognition using Hough transform and hidden Markov models
    Ben Amor, N
    Ben Amara, NE
    ISPA 2005: Proceedings of the 4th International Symposium on Image and Signal Processing and Analysis, 2005, : 285 - 288
  • [45] SALSA: Spatial Cue-Augmented Log-Spectrogram Features for Polyphonic Sound Event Localization and Detection
    Nguyen, Thi Ngoc Tho
    Watcharasupat, Karn N.
    Nguyen, Ngoc Khanh
    Jones, Douglas L.
    Gan, Woon-Seng
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 1749 - 1762
  • [46] Sound-spectrogram based automatic bird species recognition using MLP classifier
    Pahuja, Roop
    Kumar, Avijeet
    APPLIED ACOUSTICS, 2021, 180
  • [47] SHAPES RECOGNITION USING THE STRAIGHT-LINE HOUGH TRANSFORM - THEORY AND GENERALIZATION
    PAO, DCW
    LI, HF
    JAYAKUMAR, R
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1992, 14 (11) : 1076 - 1089
  • [48] Real-time object recognition using a modified generalized Hough transform
    Ulrich, M
    Steger, C
    Baumgartner, A
    PATTERN RECOGNITION, 2003, 36 (11) : 2557 - 2570
  • [49] An intelligent Lane markers recognition and localization system using improved Hough Transform
    bin Ghazali, Kamarul Hawari
    Xiao, Rui
    Ma, Jie
    FRONTIERS OF MANUFACTURING AND DESIGN SCIENCE II, PTS 1-6, 2012, 121-126 : 1186 - 1190
  • [50] Real-time pattern recognition using an optical generalized Hough transform
    Fernandez, Ariel
    Flores, Jorge L.
    Alonso, Julia R.
    Ferrari, Jose A.
    APPLIED OPTICS, 2015, 54 (36) : 10586 - 10591