A kind of continuous digit speech recognition method

Cited by: 0

Authors:

Cao, WM [1]

Affiliation:

[1] Zhejiang Univ Technol, Inst Intelligent Informat Syst, Informat Coll, Hangzhou 310032, Peoples R China

Source:

ARTIFICIAL INTELLIGENCE APPLICATIONS AND INNOVATIONS II | 2005 / Vol. 187

Keywords:

high-dimension space; high-dimension space covering theory; speaker-independent continuous speech

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Drawing on descriptive geometry and notions from set theory, this paper redefines basic spatial elements such as curves and surfaces, and presents some fundamental notions of point covering based on the High-Dimension Space (HDS) point covering theory. It then maps parts of speech signals to points in HDS in order to analyze the distribution of these speech points in HDS, the various geometric covering objects for them, and the relationships among those objects. The paper also proposes a new algorithm for speaker-independent continuous digit speech recognition, based on the HDS point dynamic searching theory, that requires neither endpoint detection nor segmentation. First, from the different digit syllables in real continuous digit speech, a covering area in feature space is established for continuous speech. During recognition, the point covering dynamic searching theory in HDS is applied, and satisfactory recognition results are obtained. Finally, in comparison with an HMM-based method, the trend of the results shows that as the number of samples increases, the difference in recognition rate between the two methods slowly decreases, and as the number of samples becomes very large, both recognition rates gradually approach 100%. The results show that the recognition rate of the HDS point covering method is higher than that of the HMM-based method. The paper attributes this to the fact that point covering describes the morphological distribution of speech in HDS, whereas the HMM-based method describes only a probability distribution, whose accuracy it considers inferior to that of point covering.
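The abstract gives no algorithmic detail, so the following is only a toy sketch of the general idea of point covering with segmentation-free scanning, not the paper's method. It assumes, purely for illustration, that each digit class is covered by a union of equal-radius balls centred on training points, and that a stream of feature points is labelled one by one with no prior endpoint detection. All names (`distance_to_cover`, `classify`, `scan`), the two-dimensional feature points, and the radius are hypothetical.

```python
import math


def distance_to_cover(point, centers, radius):
    """Distance from point to a union of balls; 0.0 if the point lies inside."""
    nearest = min(math.dist(point, c) for c in centers)
    return max(0.0, nearest - radius)


def classify(point, covers, radius):
    """Return the label whose cover contains the point, or None if none does."""
    label, d = min(
        ((lab, distance_to_cover(point, centers, radius))
         for lab, centers in covers.items()),
        key=lambda pair: pair[1],
    )
    return label if d == 0.0 else None


def scan(stream, covers, radius):
    """Label points in order, skipping uncovered points and collapsing repeats."""
    out = []
    previous = None
    for point in stream:
        label = classify(point, covers, radius)
        if label is not None and label != previous:
            out.append(label)
        previous = label
    return out


# Assumed toy data: two classes, each covered by two balls in a 2-D feature space.
covers = {
    "1": [(0.0, 0.0), (0.5, 0.0)],
    "2": [(5.0, 5.0), (5.5, 5.0)],
}
stream = [(0.1, 0.1), (0.4, 0.0), (2.5, 2.5), (5.2, 5.1), (5.4, 4.9)]
result = scan(stream, covers, radius=0.6)
print(result)
```

In this sketch the point `(2.5, 2.5)` falls outside every cover and is simply skipped, which is how the scan avoids needing explicit boundaries between digits. A real system would work on high-dimensional acoustic features and on covering shapes chosen by the method itself.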

Pages: 213-222

Number of pages: 10
