S-Adapter: Generalizing Vision Transformer for Face Anti-Spoofing With Statistical Tokens

Cited by: 2
Authors
Cai, Rizhao [1 ]
Yu, Zitong [2 ]
Kong, Chenqi [1 ]
Li, Haoliang [3 ]
Chen, Changsheng [4 ,5 ]
Hu, Yongjian [6 ,7 ]
Kot, Alex C. [1 ]
Affiliations
[1] Nanyang Technol Univ, Sch EEE, ROSE Lab, Singapore 639798, Singapore
[2] Great Bay Univ, Sch Comp & Informat Technol, Shantou 523000, Peoples R China
[3] City Univ Hong Kong, Dept Elect Engn, Hong Kong, Peoples R China
[4] Shenzhen Univ, Guangdong Key Lab Intelligent Informat Proc, Shenzhen Key Lab Media Secur, State Key Lab Radiofrequency Heterogeneous Integra, Shenzhen 518060, Peoples R China
[5] Shenzhen Univ, Coll Elect & Informat Engn, Shenzhen 518060, Peoples R China
[6] South China Univ Technol, Sch Elect & Informat Engn, Guangzhou 511442, Peoples R China
[7] China Singapore Int Joint Res Inst, Guangzhou 510555, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Adaptation models; Face recognition; Training; Histograms; Data models; Feature extraction; Faces; Vision transformer (ViT); adapter; histogram; face anti-spoofing; face presentation attack detection; domain generalization; PRESENTATION ATTACK DETECTION; ADAPTATION;
DOI
10.1109/TIFS.2024.3420699
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Face Anti-Spoofing (FAS) aims to detect malicious attempts to invade a face recognition system by presenting spoofed faces. State-of-the-art FAS techniques predominantly rely on deep learning models, but their cross-domain generalization is often hindered by domain shift, which arises when the training and testing data follow different distributions. In this study, we develop a generalized FAS method under the Efficient Parameter Transfer Learning (EPTL) paradigm, in which we adapt pre-trained Vision Transformer (ViT) models to the FAS task. During training, adapter modules are inserted into the pre-trained ViT model, and only the adapters are updated while the other pre-trained parameters remain fixed. We find that previous vanilla adapters are limited because they are built on linear layers, which lack a spoofing-aware inductive bias and thus restrict cross-domain generalization. To address this limitation and achieve cross-domain generalized FAS, we propose a novel Statistical Adapter (S-Adapter) that gathers local discriminative and statistical information from localized token histograms. To further improve the generalization of the statistical tokens, we propose a novel Token Style Regularization (TSR), which aims to reduce domain style variance by regularizing Gram matrices extracted from tokens across different domains. Our experimental results demonstrate that the proposed S-Adapter and TSR provide significant benefits in both zero-shot and few-shot cross-domain testing, outperforming state-of-the-art methods on several benchmark tests. We will release the source code upon acceptance.
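The two ingredients named in the abstract, token histograms and Gram-matrix style regularization, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the Gaussian soft-binning, and the mean-squared-error form of the style loss are assumptions chosen for illustration, since the abstract does not give the exact formulation.

```python
import numpy as np


def soft_token_histogram(tokens, centers, width):
    """Soft per-channel histogram over a group of tokens.

    tokens:  (num_tokens, dim) array, e.g. the tokens of one local window
    centers: (num_bins,) array of bin centers (hypothetical choice)
    width:   bandwidth of the Gaussian soft-binning kernel (hypothetical)
    Returns a (dim, num_bins) array in which every row sums to about 1.
    """
    # Gaussian vote of every token value for every bin: (num_tokens, dim, num_bins)
    votes = np.exp(-((tokens[:, :, None] - centers[None, None, :]) / width) ** 2)
    # Normalize so each token value distributes one unit of mass over the bins
    votes = votes / (votes.sum(axis=2, keepdims=True) + 1e-12)
    # Average over tokens to obtain the histogram for each channel
    return votes.mean(axis=0)


def gram_matrix(tokens):
    """Channel-by-channel Gram matrix of a (num_tokens, dim) token array."""
    num_tokens = tokens.shape[0]
    return tokens.T @ tokens / num_tokens


def token_style_loss(tokens_a, tokens_b):
    """Assumed style penalty: mean squared difference of two Gram matrices."""
    diff = gram_matrix(tokens_a) - gram_matrix(tokens_b)
    return float(np.mean(diff ** 2))
```

In this sketch, `tokens_a` and `tokens_b` would hold tokens from samples of two different training domains. Driving `token_style_loss` toward zero pushes their second-order channel statistics, a common proxy for style, to agree.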
Pages: 8385-8397
Number of pages: 13