Disfluent Cues for Enhanced Speech Understanding in Large Language Models

Cited by: 0
Authors
Rohanian, Morteza [1 ]
Nooralahzadeh, Farhad [1 ]
Rohanian, Omid [2 ]
Clifton, David [2 ]
Krauthammer, Michael [1 ]
Affiliations
[1] Univ Zurich, Dept Quantit Biomed, Zurich, Switzerland
[2] Univ Oxford, Dept Engn Sci, Oxford, England
Keywords
REPAIR
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Theory of artificial intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
In computational linguistics, the common practice is to "clean" disfluent content from spontaneous speech. However, we hypothesize that these disfluencies might serve as more than mere noise, potentially acting as informative cues. We use a range of pre-trained models for a reading comprehension task involving disfluent queries, specifically featuring different types of speech repairs. The findings indicate that certain disfluencies can indeed improve model performance, particularly those stemming from context-based adjustments. However, large-scale language models struggle to handle repairs involving decision-making or the correction of lexical or syntactic errors, suggesting a crucial area for potential improvement. This paper thus highlights the importance of a nuanced approach to disfluencies, advocating for their potential utility in enhancing model performance rather than their removal.
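The abstract outlines the experimental setup: pre-trained extractive question-answering models are queried with disfluent questions that contain speech repairs, and their behavior is compared against fluent queries. Below is a minimal sketch of that kind of comparison, assuming the Hugging Face transformers library; the model name, passage, and repair-bearing query are illustrative stand-ins, not materials from the paper.

# Minimal sketch (not the authors' code): compare an extractive QA model's
# output on a fluent query versus a disfluent variant containing a speech
# repair. Model and texts are illustrative assumptions.
from transformers import pipeline

qa = pipeline("question-answering",
              model="distilbert-base-cased-distilled-squad")

context = ("The Amazon rainforest covers much of the Amazon basin of "
           "South America. The majority of the forest is contained "
           "within Brazil.")

fluent = "Which country contains the majority of the Amazon rainforest?"
# Disfluent variant with a self-repair ("Is it Peru, no wait, ..."),
# mimicking the repair-bearing queries the paper studies.
disfluent = ("Is it Peru, no wait, which country contains the majority "
             "of the Amazon rainforest?")

for label, question in [("fluent", fluent), ("disfluent", disfluent)]:
    result = qa(question=question, context=context)
    print(f"{label}: answer={result['answer']!r} score={result['score']:.3f}")

Comparing extracted answers and confidence scores across the two forms gives a simple probe of whether a given repair helps or hurts the model, which is the contrast the abstract reports.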
Pages: 3676-3684
Page count: 9
Related Papers (50 in total)
  • [21] Phrase language models for detection and verification-based speech understanding
    Kawahara, T
    Doshita, S
    Lee, CH
    1997 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, PROCEEDINGS, 1997, : 49 - 56
  • [22] Improving Speech Understanding Accuracy with Limited Training Data Using Multiple Language Models and Multiple Understanding Models
    Katsumaru, Masaki
    Nakano, Mikio
    Komatani, Kazunori
    Funakoshi, Kotaro
    Ogata, Tetsuya
    Okuno, Hiroshi G.
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 2699+
  • [23] Assessing Phrase Break of ESL Speech with Pre-trained Language Models and Large Language Models
    Wang, Zhiyi
    Mao, Shaoguang
    Wu, Wenshan
    Xia, Yan
    Deng, Yan
    Tien, Jonathan
    INTERSPEECH 2023, 2023, : 4194 - 4198
  • [24] Leverage Large Language Models For Enhanced Aviation Safety
    Fox, Kevin L.
    Niewoehner, Kevin R.
    Rahmes, Mark
    Wong, Josiah
    Razdan, Rahul
    2024 INTEGRATED COMMUNICATIONS, NAVIGATION AND SURVEILLANCE CONFERENCE, ICNS, 2024
  • [25] LSCP: Enhanced Large Scale Colloquial Persian Language Understanding
    Khojasteh, Hadi Abdi
    Ansari, Ebrahim
    Bohlouli, Mahdi
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 6323 - 6327
  • [26] Cues That Language Users Exploit to Segment Speech
    Chen, Bingru (陈冰茹)
    校园英语 (Campus English), 2015, (03): 227 - 228
  • [27] Are Large Language Models Ready for Healthcare? A Comparative Study on Clinical Language Understanding
    Wang, Yuqing
    Zhao, Yun
    Petzold, Linda
    MACHINE LEARNING FOR HEALTHCARE CONFERENCE, VOL 219, 2023, 219
  • [28] Reliable Natural Language Understanding with Large Language Models and Answer Set Programming
    Rajasekharan, Abhiramon
    Zeng, Yankai
    Padalkar, Parth
    Gupta, Gopal
    ELECTRONIC PROCEEDINGS IN THEORETICAL COMPUTER SCIENCE, 2023, 385: 274 - 287
  • [30] Language modeling for spontaneous speech recognition based on disfluency labeling and generation of disfluent text
    Horii, Koharu
    Ohta, Kengo
    Nishimura, Ryota
    Ogawa, Atsunori
    Kitaoka, Norihide
    2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2023, : 1851 - 1856