Relative functional comparison of neural and non-neural approaches for syllable segmentation in Devnagari TTS system

Cited by: 0
Authors
Affiliations
[1] Kawachale, Smita
[2] Chitode, J.S.
Source
Kawachale, S. | 1600 / International Journal of Computer Science Issues (IJCSI), vol. / issue 09
Keywords
Speech communication - Speech synthesis - Simulated annealing;
DOI
Not available
Chinese Library Classification number
Subject classification code
Abstract
This paper presents methods for automatic speech signal segmentation using neural networks. The speech signal is segmented into syllables, a common unit in concatenative TTS systems. Because a concatenative TTS system uses segments of recorded speech, its output is more natural than that of formant or articulatory TTS systems. Such a system stores small segments of speech and joins them to form new words, which allows a large number of words to be generated from a very small database. Because manual segmentation is very time consuming and limits naturalness, neural network models are used to improve the naturalness of the resulting segments in speech synthesis. The work explains how neural network approaches such as Maxnet and K-means outperform traditional non-neural approaches such as slope detection and simulated annealing. Accuracy of more than 90% is achieved with the neural network models for syllable segmentation, which improved the naturalness of the Marathi TTS. © 2012 International Journal of Computer Science Issues.
Pages: 3 / 2