Automatic speaker verification system for dysarthric speakers using prosodic features and out-of-domain data augmentation

Cited by: 3
Authors
Salim, Shinimol [1]
Shahnawazuddin, Syed [2]
Ahmad, Waquar [1]
Affiliations
[1] Natl Inst Technol, Elect & Commun Dept, Calicut 673601, India
[2] Natl Inst Technol, Elect & Commun Dept, Patna 800005, India
Keywords
Automatic speaker verification system; Dysarthria; Duration modification based data augmentation; MFCC; Prosody; i-vector; x-vector; SPEECH; LOUDNESS;
DOI
10.1016/j.apacoust.2023.109412
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
A communication disorder is an impairment of a person's ability to talk or communicate appropriately. Dysarthria is a common neuro-motor speech disorder that can be caused by neurological damage, and it may affect a speaker's articulation, phonation, and prosody. Dysarthric patients often have poor neuromotor coordination and other physical impairments that make it difficult to use a keyboard or other interactive user interfaces. An automatic speaker verification (ASV) system can make biometric applications more accessible to dysarthric speakers by eliminating the need to remember cumbersome, unique authentication numbers and passwords. In this paper, we present a study on developing an ASV system for dysarthric patients with varying speech intelligibility, to assist them in remote access control and voice-based biometric applications. In the first part of our proposed approach, we include a duration-modification-based data augmentation module in the front end of the ASV system. Since prosody deficits are among the early indicators of dysarthria, we also investigate the role of prosodic variables in combination with traditional Mel-frequency cepstral coefficients (MFCC). The prosodic variables explored in this study are pitch, loudness, and voicing probability. Separate i-vector and x-vector models are trained and compared using MFCC alone, prosodic variables alone, and their combinations. The experimental results show that the proposed approach, which combines MFCC and prosodic features with duration-modification-based data augmentation, produces promising results. © 2023 Elsevier Ltd. All rights reserved.
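As a rough illustration of what "combining MFCC and prosodic features" can mean at the frame level, the sketch below appends pitch, loudness, and voicing probability to each MFCC vector. The abstract does not state how the authors combine the features, so simple per-frame concatenation is an assumption here, and the function name `combine_features` and the toy values are hypothetical.

```python
from typing import List, Sequence


def combine_features(
    mfcc: Sequence[Sequence[float]],
    pitch: Sequence[float],
    loudness: Sequence[float],
    voicing: Sequence[float],
) -> List[List[float]]:
    """Append three prosodic values to each frame's MFCC vector.

    Assumption: all streams are already aligned frame by frame.
    This is an illustrative sketch, not the method from the paper.
    """
    n_frames = len(mfcc)
    if not (len(pitch) == len(loudness) == len(voicing) == n_frames):
        raise ValueError("all feature streams must have the same number of frames")
    # Each output frame is the MFCC vector followed by pitch, loudness, voicing.
    return [
        list(m) + [p, l, v]
        for m, p, l, v in zip(mfcc, pitch, loudness, voicing)
    ]


# Toy input: 2 frames with 2 MFCC coefficients each (real systems use more).
toy_mfcc = [[1.0, 2.0], [3.0, 4.0]]
toy_pitch = [100.0, 0.0]      # Hz; 0.0 stands in for an unvoiced frame
toy_loudness = [0.5, 0.1]
toy_voicing = [0.9, 0.05]     # voicing probability in [0, 1]

combined = combine_features(toy_mfcc, toy_pitch, toy_loudness, toy_voicing)
```

Each combined frame has three more dimensions than the MFCC vector alone; the resulting frames would then feed whichever embedding model (i-vector or x-vector) is being trained.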
Pages: 16