Generalized Hough Transform for Speech Pattern Classification

Cited by: 2
Authors
Dennis, Jonathan [1]
Huy Dat Tran [1]
Li, Haizhou [1]
Affiliation
[1] A*STAR Institute for Infocomm Research, Singapore 138632, Singapore
Keywords
Codebook activation map; generalized Hough transform; speech pattern classification; TIMIT; OBJECT DETECTION; NEURAL-NETWORKS; IMAGE FEATURE; FEATURES
DOI
10.1109/TASLP.2015.2459599
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
While typical hybrid neural network architectures for automatic speech recognition (ASR) use a context window of frame-based features, this may not be the best approach to capture the wider temporal context, which contains phonetic and linguistic information that is equally important. In this paper, we introduce a system that integrates both the spectral and geometrical shape information from the acoustic spectrum, inspired by research in the field of machine vision. In particular, we focus on the Generalized Hough Transform (GHT), which is a sophisticated technique that can model the geometrical distribution of speech information over the wider temporal context. To integrate the GHT as part of a hybrid-ASR system, we propose to use a neural network, with features derived from the probabilistic Hough voting step of the GHT, to implement an improved version of the GHT where the output of the network represents the conventional target class posteriors. A major advantage of our approach is that each step of the GHT is highly interpretable, particularly compared to deep neural network (DNN) systems which are commonly treated as powerful black-box classifiers that give little insight into how the output is achieved. Experiments are carried out on two speech pattern classification tasks. The first is the TIMIT phoneme classification, which demonstrates the performance of the approach on a standard ASR task. The second is a spoken word recognition challenge, which highlights the flexibility of the approach to capture phonetic information within a longer temporal context.
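To make the voting idea in the abstract concrete, here is a minimal sketch of generic Generalized Hough Transform voting along the time axis only. It is not the paper's implementation: the function names (`build_r_table`, `hough_vote`), the representation of a feature as a `(codeword, frame)` pair, and the toy data are all assumptions made for illustration. Each codebook entry stores the time offsets to a reference point seen in training, and at test time every matched feature casts weighted votes for where that reference point should lie.

```python
from collections import defaultdict


def build_r_table(training_examples):
    """Learn, per codeword, the offsets from a feature's frame to the
    reference frame of the pattern (hypothetical training format:
    a list of (features, ref_frame), features = [(codeword, frame), ...])."""
    r_table = defaultdict(list)
    for features, ref_frame in training_examples:
        for codeword, frame in features:
            r_table[codeword].append(ref_frame - frame)
    return r_table


def hough_vote(features, r_table, n_frames):
    """Accumulate votes for the reference frame.

    Each feature spreads a total weight of 1.0 uniformly over the offsets
    stored for its codeword, a simple stand-in for probabilistic voting."""
    accumulator = [0.0] * n_frames
    for codeword, frame in features:
        offsets = r_table.get(codeword, [])
        if not offsets:
            continue  # codeword never seen in training: no votes
        weight = 1.0 / len(offsets)
        for offset in offsets:
            target = frame + offset
            if 0 <= target < n_frames:
                accumulator[target] += weight
    return accumulator


# Toy example: a pattern made of codewords 0, 1, 2 spaced 3 frames apart,
# with the reference point at the middle codeword.
train = [
    ([(0, 2), (1, 5), (2, 8)], 5),
    ([(0, 12), (1, 15), (2, 18)], 15),
]
r_table = build_r_table(train)

# Test utterance: the same pattern centred on frame 10, plus one stray feature.
test_features = [(0, 7), (1, 10), (2, 13), (1, 3)]
acc = hough_vote(test_features, r_table, n_frames=20)
peak = max(range(len(acc)), key=acc.__getitem__)
print(peak, acc[peak])
```

In this sketch the three consistent features agree on one frame while the stray feature votes alone, which is what makes the accumulator easy to inspect. The abstract describes feeding features derived from the voting step into a neural network; how those features are constructed is not specified here, so the accumulator above should be read only as one plausible input, not as the authors' design.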
Pages: 1963-1972
Number of pages: 10