Robust model for speaker verification against session-dependent utterance variation

Cited by: 0

Authors:

Matsui, T [1]

Aikawa, K

Affiliations:

[1] Inst Stat Math, Tokyo 1068569, Japan

[2] NTT Corp, NTT Commun Sci Labs, Tokyo 1008116, Japan

Source:

IEICE Transactions on Information and Systems | 2003 / Vol. E86D / No. 04

Keywords:

speaker verification; speaker model; session dependent; utterance variation; handset dependent distortion

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

This paper investigates a new method for creating robust speaker models that cope with the inter-session variation of a speaker in a continuous-HMM-based speaker verification system. The method decomposes inter-session variation into two distinct parts, session-dependent and session-independent, and estimates the session-independent parameters. The speaker model parameters are estimated using the speaker adaptive training algorithm in conjunction with equalization of the session-dependent variation. The resulting models capture session-independent speaker characteristics more reliably than conventional models, and their discriminative power improves accordingly. Moreover, the models are made more invariant to handset variation in a public switched telephone network (PSTN) by treating session-dependent variation and handset-dependent distortion separately. The approach was evaluated on text-independent speech data recorded by 20 speakers in seven sessions over 16 months. The proposed method reduces the error rate by 15% in relative terms. Compared with the popular cepstral mean normalization, it reduces the error rate by 24% in relative terms when the speaker models are re-created using speech data recorded in four or more sessions.
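The abstract names cepstral mean normalization (CMN) as its comparison baseline but does not describe it. As a minimal sketch of the standard textbook form of CMN (not the paper's proposed method, and not necessarily the exact variant the authors used), the per-utterance mean is subtracted from every cepstral frame. This removes any offset that is constant over the utterance, which is how a stationary channel or handset appears in the cepstral domain. The function name and the toy two-dimensional "sessions" below are hypothetical, chosen only for illustration.

```python
from typing import List

Frame = List[float]  # one cepstral feature vector


def cepstral_mean_normalize(frames: List[Frame]) -> List[Frame]:
    """Subtract the per-utterance mean from every cepstral frame."""
    if not frames:
        return []
    dim = len(frames[0])
    count = len(frames)
    # Mean of each cepstral coefficient over the whole utterance.
    mean = [sum(frame[d] for frame in frames) / count for d in range(dim)]
    return [[frame[d] - mean[d] for d in range(dim)] for frame in frames]


# Hypothetical toy data: session_b is session_a plus a constant
# offset (0.5, -1.0), standing in for a different channel.
session_a = [[1.0, 2.0], [3.0, 4.0]]
session_b = [[1.5, 1.0], [3.5, 3.0]]

normalized_a = cepstral_mean_normalize(session_a)
normalized_b = cepstral_mean_normalize(session_b)
```

After normalization the two toy sessions coincide, because they differ only by a constant offset. Variation that is not a constant offset survives CMN, which is the kind of session-dependent variation the abstract says the proposed models address.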


Pages: 712-718

Number of pages: 7
