Overlapping sound event recognition using local spectrogram features and the generalised Hough transform

Cited by: 51
Authors
Dennis, J. [1, 2]
Tran, H. D. [1]
Chng, E. S. [2]
Affiliations
[1] Inst Infocomm Res, Singapore 138632, Singapore
[2] Nanyang Technol Univ, Sch Comp Engn, Singapore 639798, Singapore
Keywords
Overlapping sound event recognition; Local spectrogram features; Keypoint detection; Generalised Hough Transform; Automatic speech recognition; Scale; Noise
DOI
10.1016/j.patrec.2013.02.015
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we address the challenging task of simultaneous recognition of overlapping sound events from single-channel audio. Conventional frame-based methods are not well suited to the problem, as each time frame contains a mixture of information from multiple sources. Missing feature masks can improve recognition in such cases, but are limited by the accuracy of the mask, whose estimation is a non-trivial problem. We propose an approach based on Local Spectrogram Features (LSFs), which represent local spectral information extracted from the two-dimensional region surrounding "keypoints" detected in the spectrogram. The keypoints are designed to locate the sparse, discriminative peaks in the spectrogram, so that sound events can be modelled through a set of representative LSF clusters and their occurrences in the spectrogram. To recognise overlapping sound events, we use a Generalised Hough Transform (GHT) voting system, which sums the information over many independent keypoints to produce onset hypotheses and can detect any arbitrary combination of sound events in the spectrogram. Each hypothesis is then scored against the class distribution models to recognise the existence of the sound in the spectrogram. Experiments on a set of five overlapping sound events, in the presence of non-stationary background noise, demonstrate the potential of our approach. © 2013 Elsevier B.V. All rights reserved.
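The voting idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: keypoints are taken to be strict 3x3 local maxima of the spectrogram, the LSF codebook matching is replaced by a plain frequency-bin lookup, and each class model is a hypothetical dictionary mapping a frequency bin to the time offsets (in frames, relative to the onset) at which that class produced a keypoint in training. The names `detect_keypoints`, `vote_onsets`, `model_a` and `model_b` are all illustrative.

```python
from collections import defaultdict


def detect_keypoints(spec, threshold=0.0):
    """Return (frame, bin) positions that are strict local maxima.

    Assumption: a keypoint is any cell above `threshold` that is strictly
    larger than all of its neighbours in a 3x3 window.
    """
    n_frames, n_bins = len(spec), len(spec[0])
    keypoints = []
    for t in range(n_frames):
        for f in range(n_bins):
            value = spec[t][f]
            if value <= threshold:
                continue
            neighbours = [
                spec[tt][ff]
                for tt in range(max(0, t - 1), min(n_frames, t + 2))
                for ff in range(max(0, f - 1), min(n_bins, f + 2))
                if (tt, ff) != (t, f)
            ]
            if all(value > n for n in neighbours):
                keypoints.append((t, f))
    return keypoints


def vote_onsets(keypoints, model):
    """Accumulate Hough-style votes for the onset frame of one class.

    Assumption: `model` maps a frequency bin to the list of time offsets
    (frames after onset) at which the class showed a keypoint in training.
    Each matching keypoint votes for onset = frame - offset.
    """
    votes = defaultdict(int)
    for t, f in keypoints:
        for offset in model.get(f, []):
            onset = t - offset
            if onset >= 0:
                votes[onset] += 1
    return dict(votes)


# Hypothetical class models (bin -> offsets from onset).
model_a = {1: [0], 3: [2]}
model_b = {5: [0], 7: [2]}

# Toy 8x8 spectrogram with two overlapping events:
# class A starts at frame 1, class B starts at frame 2.
spec = [[0.0] * 8 for _ in range(8)]
for t, f in [(1, 1), (3, 3), (2, 5), (4, 7)]:
    spec[t][f] = 1.0

keypoints = detect_keypoints(spec)
votes_a = vote_onsets(keypoints, model_a)
votes_b = vote_onsets(keypoints, model_b)
onset_a = max(votes_a, key=votes_a.get)
onset_b = max(votes_b, key=votes_b.get)
```

Because each keypoint votes independently, the peaks belonging to one class accumulate at that class's onset even when another event overlaps it in time. In the paper each hypothesis is then scored against class distribution models; that scoring step is omitted from this sketch.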
Pages: 1085-1093
Page count: 9