Efficient Attention Branch Network with Combined Loss Function for Automatic Speaker Verification Spoof Detection

Authors
Amir Mohammad Rostami
Mohammad Mehdi Homayounpour
Ahmad Nickabadi
Affiliations
[1] Amirkabir University of Technology, Department of Computer Engineering
Keywords
Automatic speaker verification; Spoof detection; ASVspoof; Efficient attention branch network; Combined loss function; EfficientNet-A0;
DOI
Not available
Abstract
Many endeavors have sought to develop countermeasure techniques as enhancements to Automatic Speaker Verification (ASV) systems, in order to make them more robust against spoof attacks. As evidenced by the latest ASVspoof 2019 countermeasure challenge, models currently deployed for the task of ASV, even at their best, lack a suitable degree of generalization to unseen attacks. A joint improvement of the components of ASV spoof detection systems, including the classifier, the feature extraction phase, and the model loss function, may lead to better detection of attacks by these systems. Accordingly, the present study proposes the Efficient Attention Branch Network (EABN) architecture with a combined loss function to address model generalization to unseen attacks. The EABN is based on attention and perception branches. The attention branch provides an attention mask that improves classification performance and, at the same time, is interpretable from a human point of view. The perception branch is used for the main purpose, which is spoof detection. The new EfficientNet-A0 architecture was optimized and employed for the perception branch, with nearly ten times fewer parameters and approximately seven times fewer floating-point operations than SE-Res2Net50, the best existing network. On the ASVspoof 2019 dataset, the proposed method achieved EER = 0.86% and t-DCF = 0.0239 in the Physical Access (PA) scenario using logPowSpec as the input feature extraction method. Furthermore, using the LFCC feature and SE-Res2Net50 for the perception branch, the proposed model achieved EER = 1.89% and t-DCF = 0.507 in the Logical Access (LA) scenario, which, to the best of our knowledge, is the best single-system ASV spoofing countermeasure method.
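The abstract reports results as an equal error rate (EER), the operating point at which the false acceptance rate and the false rejection rate coincide. The following is a minimal sketch of how an EER could be approximated from countermeasure scores. It is not the authors' evaluation code: the function name `equal_error_rate`, the threshold sweep, and the convention that a higher score means "more likely bona fide" are all assumptions made for illustration.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Assumed convention (hypothetical, not taken from the paper):
    a higher score means the trial is more likely bona fide.
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection rate: bona fide trials scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance rate: spoof trials scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores for illustration only; they are not results from the paper.
bonafide = [0.9, 0.8, 0.7, 0.35]
spoof = [0.1, 0.2, 0.3, 0.4]
print(equal_error_rate(bonafide, spoof))  # 0.25
```

With only a handful of trials the two error rates may never be exactly equal, so this sketch returns their average at the closest threshold. Evaluation toolkits typically interpolate between thresholds instead.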
Pages: 4252 - 4270
Page count: 18