Unraveling the complexities of pathological voice through saliency analysis

Cited by: 1
Authors
Shaikh, Abdullah Abdul Sattar [1]
Bhargavi, M. S. [1]
Naik, Ganesh R. [2]
Affiliations
[1] Bangalore Inst Technol, Dept Comp Sci & Engn, Bangalore 560004, Karnataka, India
[2] Flinders Univ S Australia, Adelaide Inst Sleep Hlth, Adelaide, SA 5042, Australia
Keywords
Pathological voice; Saliency analysis; Autoencoders; Multi-class classification; UNet++; AUTOMATIC DETECTION; CLASSIFICATION; SPEECH; IMPAIRMENTS; FEATURES; HEALTHY
DOI
10.1016/j.compbiomed.2023.107566
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
The human voice is an essential communication tool, but various disorders and habits can disrupt it. Diagnosing pathological and abnormal voices is therefore important, yet conventional diagnosis can be invasive and costly. Voice pathologies can be detected effectively using Artificial Intelligence and computer-aided classification tools. Previous studies focused primarily on binary classification and paid limited attention to multi-class classification. This study proposes three neural network architectures to investigate the feature characteristics of three voice pathologies (Hyperkinetic Dysphonia, Hypokinetic Dysphonia, and Reflux Laryngitis) and of healthy voices, using multi-class classification on the Voice ICar fEDerico II (VOICED) dataset. To cope with noisy data, the study proposes UNet++ autoencoder-based denoising techniques for accurate feature extraction. The architectures are a Multi-Layer Perceptron (MLP) trained on structured feature sets, a Short-Time Fourier Transform (STFT) model, and a Mel-Frequency Cepstral Coefficients (MFCC) model. The MLP model on 143 features achieved 97.1% accuracy, while the STFT model showed similar performance with an increased sensitivity of 99.8%. The MFCC model maintained 97.1% accuracy with a smaller model size and improved accuracy on the Reflux Laryngitis class. Saliency analysis identifies the crucial features and reveals that detecting voice abnormalities requires identifying regions of inaudible high-pitch sounds. The study also highlights the challenges posed by limited and disjointed pathological voice databases and proposes solutions for improving voice abnormality classification. Overall, the findings have potential applications in clinical settings and in specialized audio-capturing tools.
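The abstract does not say how the saliency maps are computed. As a rough illustration only, the sketch below assumes plain gradient saliency: the absolute gradient of one class score with respect to each input feature of a small MLP. The network here is randomly initialized and untrained; the 143 inputs and 4 classes come from the abstract, while the hidden size, the tanh activation, and all function names (make_mlp, forward, saliency, top_features) are hypothetical choices made for the example, not the authors' implementation.

```python
import math
import random


def make_mlp(n_in, n_hidden, n_out, seed=0):
    """Build a one-hidden-layer MLP with random (untrained) weights."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0.0, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [[rng.gauss(0.0, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
    b2 = [0.0] * n_out
    return w1, b1, w2, b2


def forward(params, x):
    """Return (hidden activations, class logits) for one feature vector."""
    w1, b1, w2, b2 = params
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w1, b1)
    ]
    logits = [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(w2, b2)
    ]
    return hidden, logits


def saliency(params, x, target):
    """Absolute gradient of the target-class logit w.r.t. each input feature.

    For a tanh hidden layer:
    d logit_t / d x_i = sum_j w2[t][j] * (1 - h_j^2) * w1[j][i]
    """
    w1, _, w2, _ = params
    hidden, _ = forward(params, x)
    grads = [
        sum(
            w2[target][j] * (1.0 - hidden[j] ** 2) * w1[j][i]
            for j in range(len(hidden))
        )
        for i in range(len(x))
    ]
    return [abs(g) for g in grads]


def top_features(scores, k):
    """Indices of the k features with the largest saliency, highest first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


if __name__ == "__main__":
    # 143 structured features and 4 classes, as in the abstract; values are random.
    rng = random.Random(1)
    params = make_mlp(n_in=143, n_hidden=16, n_out=4)
    x = [rng.uniform(-1.0, 1.0) for _ in range(143)]
    scores = saliency(params, x, target=2)
    print("most salient feature indices:", top_features(scores, 5))
```

With a trained model, the same ranking would point to the features that most influence a given class score; for the STFT and MFCC models the gradient would instead be taken with respect to each time-frequency bin.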
Pages: 16