Diagnosis of pathological speech with streamlined features for long short-term memory learning

Cited by: 4
Authors
Pham, Tuan D. [1]
Holmes, Simon B. [1]
Zou, Lifong [1]
Patel, Mangala [1]
Coulthard, Paul [1]
Affiliation
[1] Queen Mary Univ London, Barts & London Fac Med & Dent, Turner St, London E1 2AD, England
Keywords
Pathological voice; Diagnosis; Feature extraction; Deep learning; Artificial intelligence; PARKINSONS-DISEASE; WAVE-PROPAGATION; SAMPLING THEORY; CLASSIFICATION; SCATTERING
DOI
10.1016/j.compbiomed.2024.107976
Chinese Library Classification
Q [Biological sciences]
Discipline classification codes
07; 0710; 09
Abstract
Background: Pathological speech diagnosis is crucial for identifying and treating various speech disorders. Accurate diagnosis aids in developing targeted intervention strategies, improving patients' communication abilities, and enhancing their overall quality of life. With the rising global incidence of speech-related conditions, including those associated with oral health, the need for efficient and reliable diagnostic tools has become paramount, emphasizing the significance of advanced research in this field.
Methods: This paper introduces novel features for deep learning in the analysis of short voice signals. It proposes the incorporation of time-space and time-frequency features to accurately discern between two distinct groups: individuals exhibiting normal vocal patterns and those manifesting pathological voice conditions. These advancements aim to enhance the precision and reliability of diagnostic procedures, paving the way for more targeted treatment approaches.
Results: Using a publicly available voice database, this study carried out training and validation with long short-term memory (LSTM) networks learning on the combined features, along with a data balancing strategy. The proposed approach yielded promising performance metrics: 90% accuracy, 93% sensitivity, 87% specificity, 88% precision, an F1 score of 0.90, and an area under the receiver operating characteristic curve of 0.96. These results surpassed those obtained by networks trained on wavelet-time scattering coefficients, as well as by several algorithms trained with alternative feature types.
Conclusions: The incorporation of time-frequency and time-space features extracted from short segments of voice signals for LSTM learning demonstrates significant promise as an AI tool for the diagnosis of speech pathology. The proposed approach has the potential to enhance accuracy and allow for real-time pathological speech assessment, thereby facilitating more targeted and effective therapeutic interventions.
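As a reading aid, the reported accuracy, sensitivity, specificity, precision, and F1 score can be related to one another through a confusion matrix. The sketch below is illustrative only: the counts (100 pathological and 100 normal samples) are hypothetical values chosen to reproduce the rounded percentages, not figures taken from the study, and the function name `binary_metrics` is an assumption introduced here.

```python
def binary_metrics(tp, fn, tn, fp):
    """Compute standard binary classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),      # recall on the pathological class
        "specificity": tn / (tn + fp),      # recall on the normal class
        "precision": tp / (tp + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),  # harmonic mean of precision and sensitivity
    }


# Hypothetical counts (NOT from the paper): a balanced set of
# 100 pathological and 100 normal samples.
metrics = binary_metrics(tp=93, fn=7, tn=87, fp=13)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Under these assumed counts the five values round to 0.90, 0.93, 0.87, 0.88, and 0.90, which suggests the reported figures are mutually consistent for a roughly balanced evaluation set. The area under the ROC curve cannot be derived from a single confusion matrix and is not covered here.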
Pages: 14