Interpreting Pretrained Speech Models for Automatic Speech Assessment of Voice Disorders

Authors
Lau, Hok Shing [1]
Huntly, Mark [1]
Morgan, Nathon [1]
Iyenoma, Adesua [1]
Zeng, Biao [2]
Bashford, Tim [1]
Affiliations
[1] Univ Wales Trinity St David, Wales Inst Digital Informat, Swansea, W Glam, Wales
[2] Univ South Wales, Psychol Dept, Pontypridd, M Glam, Wales
Keywords
Speech Biomarker; Interpretable Machine Learning; Voice Disorder Detection
DOI
10.1007/978-3-031-67278-1_5
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Speech carries clinically relevant information about some diseases and could therefore be used for health assessment. Recent work has shown interest in applying deep learning algorithms, especially large pretrained speech models, to Automatic Speech Assessment. One question that remains unexplored is how these models arrive at their outputs from their inputs. In this work, we train and compare two configurations of the Audio Spectrogram Transformer [1] for Voice Disorder Detection and apply the attention rollout method [2] to produce model relevance maps, i.e. the computed relevance of each spectrogram region to the model's prediction. We use these maps to analyse how the models make predictions under different conditions and to show that the spread of attention decreases as a model is finetuned and that the model's attention concentrates on specific phoneme regions.
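The attention rollout step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the commonly described form of the method (average the attention heads in each layer, mix in the identity matrix with equal weight to account for residual connections, renormalise the rows, then multiply the per-layer matrices together). The function names `matmul` and `attention_rollout` and the nested-list input layout are hypothetical choices made for illustration only.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python product of two square matrices of equal size."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def attention_rollout(layers: List[List[Matrix]]) -> Matrix:
    """Combine per-layer attention into one token-to-token relevance matrix.

    layers[l][h] is assumed to be the (tokens x tokens) attention matrix of
    head h in layer l, with layer 0 closest to the input.
    """
    n = len(layers[0][0])
    # Start from the identity: before any layer, each token attends to itself.
    rollout = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for heads in layers:
        # Average the attention weights over all heads in this layer.
        avg = [
            [sum(h[i][j] for h in heads) / len(heads) for j in range(n)]
            for i in range(n)
        ]
        # Mix in the identity (assumed 0.5 / 0.5 weighting) for the residual path.
        aug = [
            [0.5 * avg[i][j] + (0.5 if i == j else 0.0) for j in range(n)]
            for i in range(n)
        ]
        # Renormalise each row so that it sums to one.
        aug = [[v / sum(row) for v in row] for row in aug]
        # Propagate relevance from this layer back towards the input tokens.
        rollout = matmul(aug, rollout)
    return rollout
```

In a transformer that classifies from a dedicated class token, the row of the returned matrix belonging to that token, restricted to the patch tokens and reshaped to the spectrogram patch grid, would give a relevance map of the kind the abstract describes. How the special tokens are indexed depends on the model and is not specified here.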
Pages: 59-72
Number of pages: 14