Generalized Hough Transform for Speech Pattern Classification

Cited by: 2
Authors
Dennis, Jonathan [1]
Huy Dat Tran [1]
Li, Haizhou [1]
Affiliations
[1] A*STAR, Institute for Infocomm Research, Singapore 138632, Singapore
Keywords
Codebook activation map; generalized Hough transform; speech pattern classification; TIMIT; object detection; neural networks; image feature; features
DOI
10.1109/TASLP.2015.2459599
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
While typical hybrid neural network architectures for automatic speech recognition (ASR) use a context window of frame-based features, this may not be the best approach to capture the wider temporal context, which contains phonetic and linguistic information that is equally important. In this paper, we introduce a system that integrates both the spectral and geometrical shape information from the acoustic spectrum, inspired by research in the field of machine vision. In particular, we focus on the Generalized Hough Transform (GHT), which is a sophisticated technique that can model the geometrical distribution of speech information over the wider temporal context. To integrate the GHT as part of a hybrid-ASR system, we propose to use a neural network, with features derived from the probabilistic Hough voting step of the GHT, to implement an improved version of the GHT where the output of the network represents the conventional target class posteriors. A major advantage of our approach is that each step of the GHT is highly interpretable, particularly compared to deep neural network (DNN) systems which are commonly treated as powerful black-box classifiers that give little insight into how the output is achieved. Experiments are carried out on two speech pattern classification tasks. The first is the TIMIT phoneme classification, which demonstrates the performance of the approach on a standard ASR task. The second is a spoken word recognition challenge, which highlights the flexibility of the approach to capture phonetic information within a longer temporal context.
Pages: 1963-1972
Page count: 10