Device Features Based on Linear Transformation With Parallel Training Data for Replay Speech Detection

Cited by: 3
Authors
Xu, Longting [1]
Yang, Jichen [2, 3]
You, Chang Huai [4]
Qian, Xinyuan [5]
Huang, Daiyu [1]
Affiliations
[1] Donghua Univ, Coll Informat Sci & Technol, Shanghai 200051, Peoples R China
[2] Guangdong Polytech Normal Univ, Sch Cyber Secur, Guangzhou 510665, Peoples R China
[3] South China Normal Univ, Sch Elect & Informat Engn, Foshan 510631, Peoples R China
[4] ASTAR, Inst Infocomm Res, Singapore 138632, Singapore
[5] Univ Sci & Technol Beijing, Sch Comp & Commun Engn, Beijing 100083, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Feature extraction; Data mining; Mel frequency cepstral coefficient; Recording; Voice activity detection; Transforms; Cepstral analysis; Device feature; Linear transformation; Replay speech detection; Speaker verification; Instantaneous frequency; Extraction; System
DOI
10.1109/TASLP.2023.3267610
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Replay speech poses a growing threat to speaker verification systems, so detecting it is becoming increasingly important. A critical factor differentiating replay speech from genuine speech is the representation of device information. Replay speech carries physical device information that originates from the recording device, the playback device, and environmental noise. In this work, a device-related linear transformation strategy is proposed to disentangle non-device information from replay speech. First, we conduct factor analysis by introducing a common vector shared by each replay utterance and its corresponding genuine utterance in parallel training data; then, we derive an expectation-maximization formula to obtain the parameters of the device-related linear transformation; finally, three device feature extraction methods are developed based on this transformation. The resulting device features are evaluated on the ASVspoof 2017 version 2.0 and ASVspoof 2021 physical access corpora. The experimental results demonstrate that the proposed linear transformation strategy is effective for replay spoofing detection and that the resulting device features outperform many typical features. Moreover, our spoofing detection systems outperform several competitive state-of-the-art systems.
Pages: 1574-1586
Page count: 13
Related papers
50 in total
  • [1] Device Feature Extraction Based on Parallel Neural Network Training for Replay Spoofing Detection
    You, Chang Huai
    Yang, Jichen
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 2308 - 2318
  • [2] Effectiveness of Speech Demodulation-Based Features for Replay Detection
    Kamble, Madhu R.
    Tak, Hemlata
    Patil, Hemant A.
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 641 - 645
  • [4] Amplitude and Frequency Modulation-based features for detection of replay Spoof Speech
    Kamble, Madhu R.
    Tak, Hemlata
    Patil, Hemant A.
    SPEECH COMMUNICATION, 2020, 125 : 114 - 127
  • [5] Evolutionary fusion of classifiers trained on linear prediction based features for replay attack detection
    Nasersharif, Babak
    Yazdani, Morteza
    EXPERT SYSTEMS, 2021, 38 (03)
  • [6] Replay Attack Detection Using Linear Prediction Analysis-Based Relative Phase Features
    Phapatanaburi, Khomdet
    Wang, Longbiao
    Nakagawa, Seiichi
    Iwahashi, Masahiro
    IEEE ACCESS, 2019, 7 : 183614 - 183625
  • [7] Linear Prediction Residual based Short-term Cepstral Features for Replay Attacks Detection
    Singh, Madhusudan
    Pati, Debadatta
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 751 - 755
  • [8] Speech Replay Detection with x-Vector Attack Embeddings and Spectral Features
    Williams, Jennifer
    Rownicka, Joanna
    INTERSPEECH 2019, 2019, : 1053 - 1057
  • [9] A multi-branch ResNet with discriminative features for detection of replay speech signals
    Cheng, Xingliang
    Xu, Mingxing
    Zheng, Thomas Fang
    APSIPA TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING, 2020, 9
  • [10] ANALYSIS OF REVERBERATION VIA TEAGER ENERGY FEATURES FOR REPLAY SPOOF SPEECH DETECTION
    Kamble, Madhu R.
    Patil, Hemant A.
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 2607 - 2611