Lexical tone recognition with an artificial neural network

Cited by: 15
Authors
Zhou, Ning [1]
Zhang, Wenle [2]
Lee, Chao-Yang [1]
Xu, Li [1]
Affiliations
[1] Ohio Univ, Sch Hearing Speech & Language Sci, Athens, OH 45701 USA
[2] Ohio Univ, Sch Elect Engn & Comp Sci, Athens, OH 45701 USA
Source
EAR AND HEARING | 2008 / Vol. 29 / Issue 03
DOI
10.1097/AUD.0b013e3181662c42
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104 ; 100213 ;
Abstract
Objectives: Tone production is particularly important for communicating in tone languages such as Mandarin Chinese. In the present study, an artificial neural network was used to recognize tones produced by adult native speakers. The purposes of the study were (1) to test the sensitivity of the neural network to the speaker variation typical of adult speaker groups, (2) to evaluate two normalization procedures to overcome the effects of speaker variation, and (3) to compare the tone recognition performance of the neural network with that of human listeners.
Design: A feedforward multilayer neural network was used. Twenty-nine adult native Mandarin Chinese speakers were recruited to record tone samples. The F0 contours of the vowel part of the 1044 monosyllabic words recorded were extracted using an autocorrelation method. Samples from the F0 contours were used as inputs to the neural network. The efficacy of the neural network was first tested by varying the number of inputs and the number of neurons in the hidden layer from 1 to 16. The sensitivity of the neural network to speaker variation was tested by (1) using the raw F0 data from speech tokens of a number of randomly drawn speakers that varied from 1 to 29, (2) using the raw F0 data from speech tokens of either male-only or female-only speakers, and (3) using two sets of normalized F0 data (i.e., tone 1-based normalization and first-order derivative) from speech tokens of a number of randomly drawn speakers that varied from 1 to 29. The recognition performance of the neural network under several experimental conditions was compared with the corresponding recognition performance of 10 normal-hearing, native Mandarin Chinese-speaking adult listeners.
Results: Three inputs and four hidden neurons were found to be sufficient for the neural network to perform at about 85% correct using speech samples without normalization. The performance of the neural network was affected by variation across speakers, particularly between genders. Using the tone 1-based normalization procedure, the performance of the neural network improved significantly. The recognition accuracy of the neural network, as a whole or for each tone, was comparable with that of the human listeners.
Conclusions: The neural network can be used to evaluate the tone production of Mandarin Chinese-speaking adults with human listener-like recognition accuracy. The tone 1-based normalization procedure improves the performance of the neural network to human listener-like accuracy. The success of our neural network in recognizing tones from multiple speakers supports its utility for evaluating tone production. Further testing of the neural network with hearing-impaired speakers might reveal its potential use for clinical evaluation of tone production.
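The pipeline described in the abstract (sample the F0 contour, optionally normalize it, feed it to a small feedforward network) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not define the tone 1-based normalization, so dividing by the speaker's mean tone 1 F0 is a guess, and the tanh hidden units, softmax output, random untrained weights, and all function names are hypothetical. Only the 3-input, 4-hidden-neuron size comes from the abstract; the 4 outputs correspond to the four Mandarin lexical tones.

```python
import math
import random


def sample_contour(f0_hz, n_points=3):
    """Pick n_points evenly spaced samples from an F0 contour (values in Hz)."""
    if n_points == 1:
        return [f0_hz[len(f0_hz) // 2]]
    step = (len(f0_hz) - 1) / (n_points - 1)
    return [f0_hz[round(i * step)] for i in range(n_points)]


def normalize_by_tone1(samples_hz, tone1_mean_hz):
    """ASSUMED normalization: express F0 relative to the speaker's mean tone 1 F0."""
    return [s / tone1_mean_hz for s in samples_hz]


def first_order_derivative(samples):
    """Differences between successive samples (discards absolute pitch level)."""
    return [b - a for a, b in zip(samples, samples[1:])]


def random_layer(n_out, n_in, rng):
    """Hypothetical helper: untrained weights and zero biases for one layer."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: tanh hidden layer, softmax output (both assumed)."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    logits = [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(w_out, b_out)
    ]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up rising contour and a made-up speaker mean for tone 1.
contour_hz = [180, 185, 195, 210, 230]
inputs = normalize_by_tone1(sample_contour(contour_hz, 3), tone1_mean_hz=240.0)

rng = random.Random(0)
w_h, b_h = random_layer(4, 3, rng)  # 3 inputs -> 4 hidden neurons
w_o, b_o = random_layer(4, 4, rng)  # 4 hidden neurons -> 4 tone classes
tone_probabilities = forward(inputs, w_h, b_h, w_o, b_o)
```

Because the weights here are random, `tone_probabilities` carries no meaning until the network is trained on labeled tone tokens; the sketch only shows how the inputs and layer sizes fit together.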
Pages: 326-335
Number of pages: 10