Overlapping sound event recognition using local spectrogram features and the generalised Hough transform

Cited by: 51
Authors
Dennis, J. [1, 2]
Tran, H. D. [1]
Chng, E. S. [2]
Affiliations
[1] Inst Infocomm Res, Singapore 138632, Singapore
[2] Nanyang Technol Univ, Sch Comp Engn, Singapore 639798, Singapore
Keywords
Overlapping sound event recognition; Local spectrogram features; Keypoint detection; Generalised Hough Transform; Automatic speech recognition; Scale; Noise
DOI
10.1016/j.patrec.2013.02.015
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper addresses the challenging task of simultaneously recognising overlapping sound events in single-channel audio. Conventional frame-based methods are not well suited to the problem, as each time frame contains a mixture of information from multiple sources. Missing feature masks can improve recognition in such cases, but they are limited by the accuracy of the mask, and estimating a reliable mask is itself a non-trivial problem. We propose an approach based on Local Spectrogram Features (LSFs), which represent local spectral information extracted from the two-dimensional region surrounding "keypoints" detected in the spectrogram. The keypoints are designed to locate the sparse, discriminative peaks in the spectrogram, so that sound events can be modelled through a set of representative LSF clusters and their occurrences in the spectrogram. To recognise overlapping sound events, we use a Generalised Hough Transform (GHT) voting system, which sums the information over many independent keypoints to produce onset hypotheses and can detect any arbitrary combination of sound events in the spectrogram. Each hypothesis is then scored against the class distribution models to recognise the existence of the sound in the spectrogram. Experiments on a set of five overlapping sound events, in the presence of non-stationary background noise, demonstrate the potential of our approach. (C) 2013 Elsevier B.V. All rights reserved.
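The voting step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes keypoints have already been detected and assigned to codebook clusters, it votes along the time axis only, and it replaces the scoring against class distribution models with a simple vote threshold. The names `MODELS`, `vote_onsets`, `onset_hypotheses`, and `min_votes`, and all the numbers, are hypothetical.

```python
from collections import Counter

# Hypothetical class models: for each sound class, map a codebook cluster id
# to the frame offsets (relative to the event onset) at which that cluster
# was observed in training. These values are made up for illustration.
MODELS = {
    "bell": {0: [0, 4], 1: [2]},
    "horn": {2: [0], 3: [1, 3]},
}


def vote_onsets(keypoints, model):
    """Accumulate Hough-style votes for the onset frame of one sound class.

    keypoints: iterable of (frame, cluster_id) pairs.
    model: dict mapping cluster_id to a list of offsets from the onset.
    """
    votes = Counter()
    for frame, cluster in keypoints:
        # Each keypoint votes independently for every onset it is consistent with.
        for offset in model.get(cluster, []):
            onset = frame - offset
            if onset >= 0:
                votes[onset] += 1
    return votes


def onset_hypotheses(keypoints, models, min_votes):
    """Return {class name: sorted onset frames with at least min_votes votes}."""
    result = {}
    for name, model in models.items():
        votes = vote_onsets(keypoints, model)
        result[name] = sorted(f for f, n in votes.items() if n >= min_votes)
    return result


# Keypoints from a mixture: a "bell" starting at frame 10 overlapping
# a "horn" starting at frame 11.
mixture = [(10, 0), (12, 1), (14, 0), (11, 2), (12, 3), (14, 3)]
hypotheses = onset_hypotheses(mixture, MODELS, min_votes=3)
print(hypotheses)
```

Because each keypoint votes on its own, keypoints belonging to one event only add scattered, low-count votes to the other class, which is why the two overlapping onsets can be recovered separately in this toy case.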
Pages: 1085-1093
Number of pages: 9
Related papers
50 items in total
  • [21] Cursive handwriting recognition using the Hough transform and a neural network
    Ruiz-Pinales, J
    Lecolinet, E
    15TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 2, PROCEEDINGS: PATTERN RECOGNITION AND NEURAL NETWORKS, 2000, : 231 - 234
  • [22] Handwritten digits recognition using Hough transform and neural networks
    Castellano, G
    Sandler, MB
    ISCAS 96: 1996 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS - CIRCUITS AND SYSTEMS CONNECTING THE WORLD, VOL 3, 1996, : 313 - 316
  • [23] Parallel implementation of a track recognition system using Hough transform
    Dantas, ACH
    de Seixas, JM
    Franca, FMG
    VECTOR AND PARALLEL PROCESSING - VECPAR 2000, 2001, 1981 : 467 - 480
  • [24] NONANALYTIC OBJECT RECOGNITION USING THE HOUGH TRANSFORM WITH THE MATCHING TECHNIQUE
    SER, PK
    SIU, WC
    IEE PROCEEDINGS-COMPUTERS AND DIGITAL TECHNIQUES, 1994, 141 (01): : 11 - 16
  • [25] MODIFICATION OF HOUGH TRANSFORM FOR OBJECT RECOGNITION USING A DIMENSIONAL ARRAY
    YIP, RKK
    TAM, PKS
    LEUNG, DNK
    PATTERN RECOGNITION, 1995, 28 (11) : 1733 - 1744
  • [26] Face Recognition using Hough Transform based Feature Extraction
    Varun, R.
    Kini, Yadunandan Vivekanand
    Manikantan, K.
    Ramachandran, S.
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGIES, ICICT 2014, 2015, 46 : 1491 - 1500
  • [27] Singular features detection and classification of fingerprints using Hough transform
    Novikov, SO
    Kot, VS
    6TH INTERNATIONAL WORKSHOP ON DIGITAL IMAGE PROCESSING AND COMPUTER GRAPHICS (DIP-97): APPLICATIONS IN HUMANITIES AND NATURAL SCIENCES, 1998, 3346 : 259 - 269
  • [28] Speech Emotion Recognition Using Auditory Spectrogram and Cepstral Features
    Zhao, Shujie
    Yang, Yan
    Cohen, Israel
    Zhang, Lijun
    29TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2021), 2021, : 136 - 140
  • [29] Sound recognition method for white feather broilers based on spectrogram features and the fusion classification model
    Lv, Meixuan
    Sun, Zhigang
    Zhang, Min
    Geng, Renxuan
    Gao, Mengmeng
    Wang, Guotao
    MEASUREMENT, 2023, 222
  • [30] Multifont arabic characters recognition using Hough transform and neural networks
    Ben Amor, Nadia
    Ben Amara, Najoua Essoukri
    ADVANCES IN NEURAL NETWORKS - ISNN 2006, PT 2, PROCEEDINGS, 2006, 3972 : 293 - 298