Analysis of speech-based speech transmission index methods with implications for nonlinear operations

Cited by: 160

Authors:

Goldsworthy, RL

Greenberg, JE

Affiliations:

[1] MIT, Elect Res Lab, Cambridge, MA 02139 USA

[2] Harvard MIT Div Hlth Sci & Technol, Cambridge, MA 02139 USA

Source:

JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA | 2004 / Vol. 116 / No. 06

Keywords:

DOI:

10.1121/1.1804628

Chinese Library Classification number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

The Speech Transmission Index (STI) is a physical metric that is well correlated with the intelligibility of speech degraded by additive noise and reverberation. The traditional STI uses modulated noise as a probe signal and is valid for assessing degradations that result from linear operations on the speech signal. Researchers have attempted to extend the STI to predict the intelligibility of nonlinearly processed speech by proposing variations that use speech as a probe signal. This work considers four previously proposed speech-based STI methods and four novel methods, studied under conditions of additive noise, reverberation, and two nonlinear operations (envelope thresholding and spectral subtraction). Analyzing intermediate metrics in the STI calculation reveals why some methods fail for nonlinear operations. Results indicate that none of the previously proposed methods is adequate for all of the conditions considered, while four proposed methods produce qualitatively reasonable results and warrant further study. The discussion considers the relevance of this work to predicting the intelligibility of cochlear-implant processed speech. (C) 2004 Acoustical Society of America.

Pages: 3679-3689

Number of pages: 11

Related records (50 in total):

[31] The SRI Speech-Based Collaborative Learning Corpus
Richey, Colleen
D'Angelo, Cynthia
Alozie, Nonye
Bratt, Harry
Shriberg, Elizabeth
17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 1550 - 1554
[32] Speech-Based Interface For Visually Impaired Users
Huang, Yi-Chin
Tsai, Cheng-Hung
IEEE 20TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS / IEEE 16TH INTERNATIONAL CONFERENCE ON SMART CITY / IEEE 4TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (HPCC/SMARTCITY/DSS), 2018, : 1223 - 1228
[33] Speech-Based Automated Cognitive Status Assessment
Hakkani-Tuer, Dilek
Vergyri, Dimitra
Tur, Gokhan
11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 258 - +
[34] Contemporary Reflections on Speech-Based Language Learning
Gustafson, Marianne
VOLTA REVIEW, 2009, 109 (2-3) : 143 - 153
[35] Automatic Speech-Based Smoking Status Identification
Ma, Zhizhong
Singh, Satwinder
Qiu, Yuanhang
Hou, Feng
Wang, Ruili
Bullen, Christopher
Chu, Joanna Ting Wai
INTELLIGENT COMPUTING, VOL 3, 2022, 508 : 193 - 203
[36] Speech-Based Activity Recognition for Trauma Resuscitation
Abdulbaqi, Jalal
Gu, Yue
Xu, Zhichao
Gao, Chenyang
Marsic, Ivan
Burd, Randall S.
2020 8TH IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS (ICHI 2020), 2020, : 376 - 383
[37] ECHO: A speech recognition package for the design of robust interactive speech-based applications
Kabré H.
International Journal of Speech Technology, 1997, 2 (2) : 133 - 143
[38] Verifying Human Users in Speech-Based Interactions
Shirali-Shahreza, Sajad
Ganjali, Yashar
Balakrishnan, Ravin
12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 1596 - 1599
[39] Effect of Reverberation in Speech-based Emotion Recognition
Zhao, Shujie
Yang, Yan
Chen, Jingdong
2018 IEEE INTERNATIONAL CONFERENCE ON THE SCIENCE OF ELECTRICAL ENGINEERING IN ISRAEL (ICSEE), 2018,
[40] An architecture and applications for speech-based accessibility systems
Turunen, M
Hakulinen, J
Räihä, KJ
Salonen, EP
Kainulainen, A
Prusi, P
IBM SYSTEMS JOURNAL, 2005, 44 (03) : 485 - 504