Robust model for speaker verification against session-dependent utterance variation

Cited by: 0
Authors
Matsui, T [1 ]
Aikawa, K
Affiliations
[1] Inst Stat Math, Tokyo 1068569, Japan
[2] NTT Corp, NTT Commun Sci Labs, Tokyo 1008116, Japan
Source
Keywords
speaker verification; speaker model; session dependent; utterance variation; handset dependent distortion;
DOI
Not available
CLC Number
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
This paper investigates a new method for creating speaker models that are robust against inter-session variation in a continuous-HMM-based speaker verification system. The method estimates session-independent parameters by decomposing inter-session variation into two distinct parts: session-dependent and session-independent. The parameters of the speaker models are estimated with the speaker adaptive training algorithm combined with equalization of the session-dependent variation. The resulting models capture session-independent speaker characteristics more reliably than conventional models, and their discriminative power improves accordingly. Moreover, the models are made more invariant to handset variation in the public switched telephone network (PSTN) by treating session-dependent variation and handset-dependent distortion separately. Text-independent speech data recorded by 20 speakers in seven sessions over 16 months was used to evaluate the new approach. The proposed method reduces the error rate by a relative 15%. Compared with the popular cepstral mean normalization, the error rate is reduced by a relative 24% when the speaker models are created using speech data recorded in four or more sessions.
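The cepstral mean normalization (CMN) baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration of the standard CMN technique, not the paper's proposed method; the array shapes and toy values are assumptions for demonstration only.

```python
import numpy as np

def cepstral_mean_normalization(cepstra):
    """Subtract the per-utterance mean from each cepstral dimension.

    cepstra: array of shape (n_frames, n_coeffs).
    A stationary convolutional channel effect (e.g. a fixed handset
    transfer function) appears as an additive constant in the cepstral
    domain, so subtracting the utterance mean removes it.
    """
    return cepstra - cepstra.mean(axis=0, keepdims=True)

# Toy example (hypothetical values): a fixed channel offset added to
# every frame is removed by CMN.
clean = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
channel_offset = np.array([0.5, -0.2])
observed = clean + channel_offset
normalized = cepstral_mean_normalization(observed)
# normalized has zero mean in each cepstral dimension, and equals the
# clean features up to their own per-utterance mean.
```

CMN removes only channel effects that are constant over an utterance, which is why the paper's session-level decomposition can outperform it when recordings span many sessions and handsets.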
Pages: 712-718 (7 pages)