Toward Real-World Voice Disorder Classification

Authors
Kuo, Heng-Cheng [1]
Hsieh, Yu-Peng [1]
Tseng, Huan-Hsin [1]
Wang, Chi-Te [2, 3]
Fang, Shih-Hau [2, 4]
Tsao, Yu [5, 6]
Affiliations
[1] Acad Sinica, Res Ctr Informat Technol Innovat, Taipei, Taiwan
[2] Yuan Ze Univ, Dept Elect Engn, Taoyuan City, Taiwan
[3] Far Eastern Mem Hosp, Dept Otolaryngol Head & Neck Surg, New Taipei City, Taiwan
[4] Yuan Ze Univ, AI Res Ctr, Taoyuan City, Taiwan
[5] Acad Sinica, Res Ctr Informat Technol Innovat, Taipei 11529, Taiwan
[6] Chung Yuan Christian Univ, Dept Elect Engn, Taoyuan 32023, Taiwan
Keywords
Voice disorder classification; model compression; domain adaptation; real-world application; AUTOMATIC DETECTION; PATHOLOGICAL VOICE; DISCRIMINATION; POPULATION; PREVALENCE; PARAMETERS; DYSPHONIA; CLOUD; FOLD;
DOI
10.1109/TBME.2023.3270532
Chinese Library Classification (CLC) number
R318 [Biomedical Engineering];
Discipline classification code
0831;
Abstract
Objective: Voice disorders significantly compromise individuals' ability to speak in their daily lives. Without early diagnosis and treatment, these disorders may deteriorate drastically. Automatic classification systems for home use are therefore desirable for people who lack access to clinical assessment. However, the performance of such systems may be degraded by constrained resources and by the domain mismatch between clinical data and noisy real-world data. Methods: This study develops a compact and domain-robust voice disorder classification system that identifies utterances as healthy, neoplasm, or benign structural disease. The proposed system uses a feature extractor composed of factorized convolutional neural networks and then applies domain adversarial training to reconcile the domain mismatch by extracting domain-invariant features. Results: The unweighted average recall in the noisy real-world domain improved by 13% and remained at 80% in the clinic domain with only slight degradation. The domain mismatch was effectively eliminated. Moreover, the proposed system reduced both memory and computation usage by over 73.9%. Conclusion: By deploying factorized convolutional neural networks and domain adversarial training, domain-invariant features can be derived for voice disorder classification with limited resources. The promising results confirm that the proposed system can significantly reduce resource consumption and improve classification accuracy by accounting for the domain mismatch. Significance: To the best of our knowledge, this is the first study that jointly considers real-world model compression and noise-robustness issues in voice disorder classification. The proposed system is intended for deployment on embedded systems with limited resources.
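The abstract reports results as unweighted average recall (UAR), which is the mean of the per-class recalls, so each class counts equally regardless of how many samples it has. The following is a minimal sketch of that metric only. The function name, class labels, and toy predictions are hypothetical illustrations and are not taken from the paper or its data.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; every class has equal weight."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy labels (not results from the study).
y_true = ["healthy"] * 6 + ["neoplasm"] * 2 + ["benign"] * 2
y_pred = ["healthy"] * 6 + ["neoplasm", "healthy"] + ["benign", "healthy"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

In this toy case the majority class is predicted perfectly while each minority class is recalled only half the time, so plain accuracy looks higher than UAR. That gap is the usual reason for preferring UAR when class sizes differ.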
Pages: 2922-2932
Number of pages: 11