DETECTION, SEPARATION AND RECOGNITION OF SPEECH FROM CONTINUOUS SIGNALS USING SPECTRAL FACTORISATION

Citations: 0

Authors:

Hurmalainen, Antti [1]

Gemmeke, Jort F. [2]

Virtanen, Tuomas [1]

Affiliations:

[1] Tampere Univ Technol, POB 553, FI-33101 Tampere, Finland

[2] Katholieke Univ Leuven, Dept ESAT PSI, B-3001 Heverlee, Belgium

Source:

2012 PROCEEDINGS OF THE 20TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO) | 2012

Keywords:

Spectral factorization; speech recognition; speaker recognition; voice activity detection; speech separation;

DOI:

Not available

Chinese Library Classification:

TM [Electrical engineering technology]; TN [Electronic technology, communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

In real world speech processing, the signals are often continuous and consist of momentary segments of speech over non-stationary background noise. It has been demonstrated that spectral factorisation using multi-frame atoms can be successfully employed to separate and recognise speech in adverse conditions. While in previous work full knowledge of utterance endpointing and speaker identity was used for noise modelling and speech recognition, this study proposes spectral factorisation and sparse classification techniques to detect, identify, separate and recognise speech from a continuous noisy input. Speech models are trained beforehand, but noise models are acquired adaptively from the input by using voice activity detection without prior knowledge of noise-only locations. The results are evaluated on the CHiME corpus, containing utterances from 34 speakers over highly non-stationary multi-source noise.


Pages: 2649-2653

Page count: 5

