Learning facial action units with spatiotemporal cues and multi-label sampling

Cited by: 12
Authors
Chu, Wen-Sheng [1 ]
De la Torre, Fernando [1 ]
Cohn, Jeffrey F. [1 ,2 ]
Affiliations
[1] Carnegie Mellon Univ, Inst Robot, Pittsburgh, PA 15213 USA
[2] Univ Pittsburgh, Dept Psychol, Pittsburgh, PA 15260 USA
Funding
U.S. National Science Foundation (NSF); U.S. National Institutes of Health (NIH)
Keywords
Multi-label learning; Deep learning; Spatio-temporal learning; Multi-label sampling; Facial action unit detection; Video analysis; Expression; Recognition; Emotion
DOI
10.1016/j.imavis.2018.10.002
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Facial action units (AUs) can be represented spatially, temporally, and in terms of their correlation. Previous research has focused on one or another of these aspects or addressed them disjointly. We propose a hybrid network architecture that jointly models spatial and temporal representations and their correlation. In particular, we use a Convolutional Neural Network (CNN) to learn spatial representations and a Long Short-Term Memory (LSTM) network to model temporal dependencies among them. The outputs of the CNN and LSTM are aggregated into a fusion network to produce per-frame predictions of multiple AUs. The hybrid network was compared with previous state-of-the-art approaches on two large FACS-coded video databases, GFT and BP4D, comprising over 400,000 AU-coded frames of spontaneous facial behavior in varied social contexts. Relative to a standard multi-label CNN and feature-based state-of-the-art approaches, the hybrid system reduced person-specific biases and achieved higher accuracy for AU detection. To address class imbalance within and between batches during network training, we introduce multi-label sampling strategies that further increase accuracy when AUs are relatively sparse. Finally, we provide visualizations of the learned AU models, which, to the best of our knowledge, reveal for the first time how machines see AUs. (C) 2018 Elsevier B.V. All rights reserved.
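The abstract describes the hybrid architecture only at a high level: a CNN extracts per-frame spatial features, an LSTM models temporal dependencies over those features, and a fusion network combines the two streams into per-frame multi-label AU scores. A minimal PyTorch sketch of that pipeline follows; it is an illustrative assumption, not the authors' implementation, and the class name HybridAUNet, the toy CNN, and all layer sizes are hypothetical.

import torch
import torch.nn as nn

class HybridAUNet(nn.Module):  # hypothetical name; all sizes are illustrative
    def __init__(self, num_aus=12, feat_dim=256, hidden_dim=256):
        super().__init__()
        # Spatial stream: a small stand-in CNN producing one feature
        # vector per frame.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        # Temporal stream: an LSTM over the sequence of CNN features.
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        # Fusion network: concatenates spatial and temporal features and
        # emits one sigmoid score per AU per frame (multi-label output).
        self.fusion = nn.Sequential(
            nn.Linear(feat_dim + hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, num_aus),
        )

    def forward(self, frames):
        # frames: (batch, time, 3, H, W)
        b, t = frames.shape[:2]
        spatial = self.cnn(frames.flatten(0, 1)).view(b, t, -1)
        temporal, _ = self.lstm(spatial)
        logits = self.fusion(torch.cat([spatial, temporal], dim=-1))
        return torch.sigmoid(logits)  # per-frame AU probabilities

model = HybridAUNet()
scores = model(torch.randn(2, 8, 3, 64, 64))  # -> shape (2, 8, 12)

Under this reading, training would use a per-AU binary cross-entropy loss, and the multi-label sampling strategies the abstract mentions would govern which frames and label combinations populate each batch so that sparse AUs are not drowned out.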
Pages: 1-14 (14 pages)
Related Papers
50 items in total (items 41-50 shown below)
  • [41] Multi-label Software Behavior Learning
    Feng, Yang
    Chen, Zhenyu
    2012 34TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE), 2012, : 1305 - 1308
  • [42] Robust Extreme Multi-label Learning
    Xu, Chang
    Tao, Dacheng
    Xu, Chao
    KDD'16: PROCEEDINGS OF THE 22ND ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2016, : 1275 - 1284
  • [43] Multi-Label Learning for Activity Recognition
    Kumar, R.
    Qamar, I.
    Virdi, J. S.
    Krishnan, N. C.
    2015 INTERNATIONAL CONFERENCE ON INTELLIGENT ENVIRONMENTS IE 2015, 2015, : 152 - 155
  • [44] Metric Learning for Multi-label Classification
    Brighi, Marco
    Franco, Annalisa
    Maio, Dario
    STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, S+SSPR 2020, 2021, 12644 : 24 - 33
  • [45] Collaboration Based Multi-Label Learning
    Feng, Lei
    An, Bo
    He, Shuo
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 3550 - 3557
  • [46] Hyperspherical Learning in Multi-Label Classification
    Ke, Bo
    Zhu, Yunquan
    Li, Mengtian
    Shu, Xiujun
    Qiao, Ruizhi
    Ren, Bo
    COMPUTER VISION, ECCV 2022, PT XXV, 2022, 13685 : 38 - 55
  • [47] Compact learning for multi-label classification
    Lv, Jiaqi
    Wu, Tianran
    Peng, Chenglun
    Liu, Yunpeng
    Xu, Ning
    Geng, Xin
    PATTERN RECOGNITION, 2021, 113
  • [48] Feature Selection for Multi-Label Learning
    Spolaor, Newton
    Monard, Maria Carolina
    Lee, Huei Diana
    PROCEEDINGS OF THE TWENTY-FOURTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI), 2015, : 4401 - 4402
  • [49] Multi-Label Learning from Crowds
    Li, Shao-Yuan
    Jiang, Yuan
    Chawla, Nitesh V.
    Zhou, Zhi-Hua
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2019, 31 (07) : 1369 - 1382
  • [50] On active learning in multi-label classification
    Brinker, K
    FROM DATA AND INFORMATION ANALYSIS TO KNOWLEDGE ENGINEERING, 2006, : 206 - 213