SV-DeiT: Speaker Verification with DeiTCap Spoofing Detection

Cited by: 0
Authors
Ranjan, Rishabh [1]
Vatsa, Mayank [1]
Singh, Richa [1]
Affiliation
[1] Indian Institute of Technology, Jodhpur, Rajasthan, India
DOI
10.1109/IJCB57857.2023.10449121
CLC number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
As automatic speech generation advances, distinguishing real from fake speech samples has become increasingly difficult. In addition, current spoofing detection algorithms struggle to perform well on new and unseen test distributions. To address these challenges, this paper presents two contributions. First, inspired by the strong representation capabilities of transformer and capsule networks, we propose the DeiTCap spoof detection network, which operates on spectrogram audio features. This framework uses multi-head attention, sub-entities (capsules) in the audio domain, and a modified routing algorithm to identify capsule agreement. The proposed spoof detection algorithm is integrated into the spoofing-aware speaker recognition framework SV-DeiT. Second, we introduce TRADIF, a novel text-to-speech dataset created with cutting-edge transformer and diffusion models, to evaluate the generalizability of countermeasure systems. The proposed DeiTCap achieves an equal error rate (EER) of 1.08% on the evaluation set of the ASVspoof 2019 LA dataset. Moreover, the proposed network performs strongly in cross-domain training and testing with two different datasets, highlighting its robustness and versatility.
Pages: 10