Analysis and Tuning of a Voice Assistant System for Dysfluent Speech

Cited by: 6
Authors
Mitra, Vikramjit [1]
Huang, Zifang [1]
Lea, Colin [1]
Tooley, Lauren [1]
Wu, Sarah [1]
Botten, Darren [1]
Palekar, Ashwini [1]
Thelapurath, Shrinath [1]
Georgiou, Panayiotis [1]
Kajarekar, Sachin [1]
Bigham, Jefferey [1]
Affiliations
[1] Apple, Cupertino, CA 95014 USA
Source
Keywords
dysfluent speech recognition; stutter detection; domain recognition; intent recognition; dysfluencies
DOI
10.21437/Interspeech.2021-2006
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
Dysfluencies and variations in speech pronunciation can severely degrade speech recognition performance, and for many individuals with moderate-to-severe speech disorders, voice-operated systems do not work. Current speech recognition systems are trained primarily with data from fluent speakers and, as a consequence, do not generalize well to speech with dysfluencies such as sound or word repetitions, sound prolongations, or audible blocks. This work focuses on a quantitative analysis of a consumer speech recognition system for individuals who stutter, and on production-oriented approaches for improving performance on common voice assistant tasks (e.g., "what is the weather?"). At baseline, this system introduces a significant number of insertion and substitution errors, resulting in intended speech Word Error Rates (isWER) that are 13.64% worse (absolute) for individuals with fluency disorders. We show that by simply tuning the decoding parameters in an existing hybrid speech recognition system, one can improve isWER by 24% (relative) for individuals with fluency disorders. Tuning these parameters translates to 3.6% better domain recognition and 1.7% better intent recognition relative to the default setup for the 18 study participants across all stuttering severities.
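The abstract reports word error rates both as an absolute gap and as a relative improvement. The following is a minimal sketch of how those two quantities differ, assuming isWER is an ordinary word-level edit-distance WER scored against the intended (dysfluency-free) transcript. The function names and the numeric inputs are hypothetical illustrations and are not taken from the paper.

```python
def word_error_rate(intended: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of intended words."""
    ref = intended.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("intended transcript must contain at least one word")
    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # ref word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def absolute_change(baseline: float, tuned: float) -> float:
    """Difference in error-rate points."""
    return baseline - tuned


def relative_change(baseline: float, tuned: float) -> float:
    """Improvement as a fraction of the baseline error rate."""
    return (baseline - tuned) / baseline


# A repeated word recognized as an extra token counts as one insertion.
wer = word_error_rate("what is the weather", "what what is the weather")

# Hypothetical error rates, chosen only to contrast the two measures.
abs_gain = absolute_change(0.20, 0.15)
rel_gain = relative_change(0.20, 0.15)
```

With these made-up inputs, a drop from 0.20 to 0.15 is 5 points absolute but 25% relative, which is why the abstract labels each figure with its kind.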
Pages: 4848-4852
Page count: 5