Dual-STI: Dual-path spatial-temporal interaction learning for dynamic facial expression recognition

Cited by: 1
Authors
Li, Min [1 ]
Zhang, Xiaoqin [1 ]
Fan, Chenxiang [1 ]
Liao, Tangfei [1 ]
Xiao, Guobao [2 ]
Affiliations
[1] Wenzhou Univ, Coll Comp & Artificial Intelligence, Wenzhou 325035, Peoples R China
[2] Tongji Univ, Sch Elect & Informat Engn, Shanghai 201804, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Dynamic facial expression recognition; Spatial-temporal feature; Spatial-temporal interaction; Comparative learning; NETWORK; AWARE;
DOI
10.1016/j.ins.2024.120953
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Learning facial evolution is crucial for dynamic facial expression recognition. Current recognition methods typically extract temporal features after spatial features to achieve low computational complexity. However, these methods struggle to model complex facial evolutions due to a lack of interaction between spatial and temporal features. This paper proposes a novel Dual-path Spatial-Temporal Interaction (Dual-STI) framework that concurrently extracts spatial and temporal features through two efficient paths. Specifically, Dual-STI comprises a spatial path and a temporal path. The spatial path contains several spatial transformers to capture robust facial features from each sampled frame, while the temporal path includes several temporal transformers to learn rich contextual facial features from the sequence of frames. To facilitate spatial-temporal interaction, Dual-STI features a distinct dual-path interaction module that adaptively fuses spatial and temporal features by combining spatial and temporal attention mechanisms. Additionally, comparative learning is introduced into the loss function to enhance this interaction. To evaluate the proposed method, extensive experiments are conducted on three popular benchmarks, namely DFEW, AFEW, and FERV39k. The experimental results demonstrate that Dual-STI achieves state-of-the-art performance with low computational complexity across all datasets. Notably, Dual-STI shows significant improvements in the "disgust" and "fear" categories, with precision increases of 3.45% and 2.1% on the DFEW dataset, respectively.
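The abstract's dual-path design can be pictured with a minimal, hypothetical PyTorch sketch: a spatial transformer path attending over patch tokens within each frame, a temporal transformer path attending across the sampled frames, and a gated fusion standing in for the dual-path interaction module. The class name DualPathSTI, the make_encoder helper, the gating rule, and all token shapes and dimensions are illustrative assumptions, not the authors' released implementation; the comparative-learning term in the loss is omitted here.

```python
import torch
import torch.nn as nn


def make_encoder(dim: int, num_heads: int, depth: int) -> nn.TransformerEncoder:
    """Build a small transformer encoder over (batch, tokens, dim) inputs."""
    layer = nn.TransformerEncoderLayer(d_model=dim, nhead=num_heads, batch_first=True)
    return nn.TransformerEncoder(layer, num_layers=depth)


class DualPathSTI(nn.Module):
    """Illustrative dual-path spatial-temporal block (names and fusion rule assumed)."""

    def __init__(self, dim: int = 256, num_heads: int = 4, depth: int = 2, num_classes: int = 7):
        super().__init__()
        self.spatial_path = make_encoder(dim, num_heads, depth)   # attends within each frame
        self.temporal_path = make_encoder(dim, num_heads, depth)  # attends across frames
        # Simple adaptive gates standing in for the paper's interaction module.
        self.spatial_gate = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())
        self.temporal_gate = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())
        self.classifier = nn.Linear(dim, num_classes)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (B, T, N, C) = clips, sampled frames, patch tokens per frame, channels.
        B, T, N, C = tokens.shape

        # Spatial path: process each frame independently, then pool its patch tokens.
        s = self.spatial_path(tokens.reshape(B * T, N, C)).mean(dim=1).reshape(B, T, C)

        # Temporal path: pool patch tokens first, then attend over the T frame tokens.
        t = self.temporal_path(tokens.mean(dim=2))  # (B, T, C)

        # Interaction (assumed form): each stream adaptively gates the other before fusion.
        fused = self.spatial_gate(t) * s + self.temporal_gate(s) * t  # (B, T, C)

        # Clip-level prediction from the fused per-frame features.
        return self.classifier(fused.mean(dim=1))  # (B, num_classes)


if __name__ == "__main__":
    clip_tokens = torch.randn(2, 8, 49, 256)  # 2 clips, 8 frames, 7x7 patch tokens, 256-dim
    logits = DualPathSTI()(clip_tokens)
    print(logits.shape)                       # torch.Size([2, 7])
```

Under these assumptions, both paths run concurrently on the same frame tokens rather than sequentially, which is the property the abstract highlights; the seven-way output simply mirrors the seven expression categories used by DFEW.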
Pages: 15
Related Papers
50 records in total
  • [41] A joint local spatial and global temporal CNN-Transformer for dynamic facial expression recognition
    Wang, Linhuang
    Kang, Xin
    Ding, Fei
    Nakagawa, Satoshi
    Ren, Fuji
    APPLIED SOFT COMPUTING, 2024, 161
  • [42] DP-DWA: DUAL-PATH DYNAMIC WEIGHT ATTENTION NETWORK WITH STREAMING DFSMN-SAN FOR AUTOMATIC SPEECH RECOGNITION
    Ma, Dongpeng
    Wang, Yiwen
    He, Liqiang
    Jin, Mingjie
    Su, Dan
    Yu, Dong
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 7692 - 7696
  • [43] Dual subspace manifold learning based on GCN for intensity-invariant facial expression recognition
    Chen, Jingying
    Shi, Jinxin
    Xu, Ruyi
    PATTERN RECOGNITION, 2024, 148
  • [44] Remaining Useful Life Prediction Method Based on Dual-Path Interaction Network with Multiscale Feature Fusion and Dynamic Weight Adaptation
    Lu, Zhe
    Li, Bing
    Fu, Changyu
    Wu, Junbao
    Xu, Liang
    Jia, Siye
    Zhang, Hao
    ACTUATORS, 2024, 13 (10)
  • [45] DPCNet: Dual Path Multi-Excitation Collaborative Network for Facial Expression Representation Learning in Videos
    Wang, Yan
    Sun, Yixuan
    Song, Wei
    Gao, Shuyong
    Huang, Yiwen
    Chen, Zhaoyu
    Ge, Weifeng
    Zhang, Wenqiang
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022,
  • [46] Facial micro-expression recognition using stochastic graph convolutional network and dual transferred learning
    Tang, Hui
    Chai, Li
    NEURAL NETWORKS, 2024, 178
  • [47] Phase Space Reconstruction Driven Spatio-Temporal Feature Learning for Dynamic Facial Expression Recognition
    Wang, Shanmin
    Shuai, Hui
    Liu, Qingshan
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2022, 13 (03) : 1466 - 1476
  • [48] Fault pattern recognition of rolling bearing based on smoothness prior approach and dual-input depth spatial-temporal fusion
    Zhang, M.
    Li, X. J.
    Xu, S. H.
    Meng, X. Y.
    MEASUREMENT SCIENCE AND TECHNOLOGY, 2022, 33 (08)
  • [49] Spatial-temporal hypergraph based on dual-stage attention network for multi-view data lightweight action recognition
    Wu, Zhixuan
    Ma, Nan
    Wang, Cheng
    Xu, Cheng
    Xu, Genbao
    Li, Mingxing
    PATTERN RECOGNITION, 2024, 151
  • [50] A dual stream spatio-temporal deep network for micro-expression recognition using upper facial features
    Matharaarachchi, Nikin
    Pasha, Muhammad Fermi
    Neural Computing and Applications, 2025, 37 (3) : 1271 - 1287