Robust model for speaker verification against session-dependent utterance variation

Authors
Matsui, T [1]
Aikawa, K
Affiliations
[1] Inst Stat Math, Tokyo 1068569, Japan
[2] NTT Corp, NTT Commun Sci Labs, Tokyo 1008116, Japan
Keywords
speaker verification; speaker model; session dependent; utterance variation; handset dependent distortion
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This paper investigates a new method for creating robust speaker models that cope with a speaker's inter-session variation in a continuous HMM-based speaker verification system. The method estimates session-independent parameters by decomposing inter-session variation into two distinct parts: session-dependent and session-independent. The parameters of the speaker models are estimated using the speaker adaptive training algorithm in conjunction with equalization of the session-dependent variation. The resulting models capture session-independent speaker characteristics more reliably than conventional models, and their discriminative power improves accordingly. Moreover, we have made our models more invariant to handset variation in a public switched telephone network (PSTN) by treating session-dependent variation and handset-dependent distortion separately. Text-independent speech data recorded by 20 speakers in seven sessions over 16 months were used to evaluate the new approach. The proposed method reduces the error rate by 15% in relative terms. Compared with the popular cepstral mean normalization, it reduces the error rate by 24% in relative terms when the speaker models are re-created using speech data recorded in four or more sessions.
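For orientation, the cepstral mean normalization baseline named in the abstract can be sketched as below, together with the arithmetic behind a "relative" error reduction. This is a minimal illustration and not the authors' implementation: the function names, the toy feature values, and the error rates are all hypothetical, since the abstract reports only relative figures. The sketch relies on the standard observation that a fixed channel or handset filter appears as an additive offset in the cepstral domain, so subtracting the per-utterance mean removes it.

```python
from statistics import fmean


def cepstral_mean_normalize(frames):
    """Subtract the per-utterance mean from every cepstral coefficient.

    frames: list of equal-length lists, one cepstral vector per frame.
    """
    if not frames:
        return []
    dim = len(frames[0])
    # Mean of each coefficient over all frames of the utterance.
    means = [fmean(frame[d] for frame in frames) for d in range(dim)]
    return [[frame[d] - means[d] for d in range(dim)] for frame in frames]


def relative_error_reduction(baseline_error, new_error):
    """Fraction by which the error falls relative to the baseline."""
    return (baseline_error - new_error) / baseline_error


# Hypothetical toy utterance: 3 frames, 2 cepstral coefficients each.
utterance = [[1.0, 4.0], [2.0, 6.0], [3.0, 8.0]]
normalized = cepstral_mean_normalize(utterance)

# Hypothetical error rates chosen only to illustrate a 15% relative reduction.
reduction = relative_error_reduction(10.0, 8.5)
```

After normalization every coefficient has zero mean over the utterance, which is why a constant offset added to all frames leaves the normalized features unchanged.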
Pages: 712-718
Number of pages: 7