Learning facial action units with spatiotemporal cues and multi-label sampling

Cited by: 12

Authors:

Chu, Wen-Sheng [1]

De la Torre, Fernando [1]

Cohn, Jeffrey F. [1, 2]

Affiliations:

[1] Carnegie Mellon Univ, Inst Robot, Pittsburgh, PA 15213 USA

[2] Univ Pittsburgh, Dept Psychol, Pittsburgh, PA 15260 USA

Source:

IMAGE AND VISION COMPUTING | 2019 / Vol. 81

Funding:

US National Science Foundation; US National Institutes of Health;

Keywords:

Multi-label learning; Deep learning; Spatio-temporal learning; Multi-label sampling; Facial action unit detection; Video analysis; EXPRESSION; RECOGNITION; EMOTION;

DOI:

10.1016/j.imavis.2018.10.002

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Facial action units (AUs) can be represented spatially, temporally, and in terms of their correlation. Previous research focuses on one or another of these aspects or addresses them disjointly. We propose a hybrid network architecture that jointly models spatial and temporal representations and their correlation. In particular, we use a Convolutional Neural Network (CNN) to learn spatial representations, and a Long Short-Term Memory (LSTM) to model temporal dependencies among them. The outputs of CNNs and LSTMs are aggregated into a fusion network to produce per-frame prediction of multiple AUs. The hybrid network was compared to previous state-of-the-art approaches in two large FACS-coded video databases, GFT and BP4D, with over 400,000 AU-coded frames of spontaneous facial behavior in varied social contexts. Relative to standard multi-label CNN and feature-based state-of-the-art approaches, the hybrid system reduced person-specific biases and obtained increased accuracy for AU detection. To address class imbalance within and between batches during network training, we introduce multi-label sampling strategies that further increase accuracy when AUs are relatively sparse. Finally, we provide visualization of the learned AU models, which, to the best of our knowledge, reveal for the first time how machines see AUs. (C) 2018 Elsevier B.V. All rights reserved.
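
The abstract mentions sampling strategies for class imbalance when some AUs are sparse, but this record does not describe them. As a rough illustration of the general idea only, the sketch below builds a training batch by cycling over AUs and drawing one positive frame per turn, so rare AUs are oversampled. This is a generic heuristic assumed for illustration, not the authors' strategy; the function name `balanced_multilabel_batch` and the toy labels are hypothetical.

```python
import random
from collections import defaultdict


def balanced_multilabel_batch(labels, batch_size, rng):
    """Pick frame indices by cycling over AUs and drawing one positive per turn.

    labels[i][k] == 1 means AU k is active in frame i.
    Hypothetical helper: a generic oversampling heuristic, not the paper's method.
    """
    # Index the frames in which each AU is active.
    positives = defaultdict(list)
    for i, row in enumerate(labels):
        for k, active in enumerate(row):
            if active:
                positives[k].append(i)

    usable_aus = sorted(positives)  # AUs with at least one positive frame
    if not usable_aus:
        raise ValueError("no AU has a positive frame")

    batch = []
    while len(batch) < batch_size:
        # Round-robin over AUs; sample with replacement from that AU's positives.
        k = usable_aus[len(batch) % len(usable_aus)]
        batch.append(rng.choice(positives[k]))
    return batch


# Toy data: 100 frames, AU 0 active in every even frame, AU 1 only in frames 3 and 7.
labels = [[1 if i % 2 == 0 else 0, 1 if i in (3, 7) else 0] for i in range(100)]
batch = balanced_multilabel_batch(labels, 8, random.Random(0))
au1_positives = sum(labels[i][1] for i in batch)
print(batch, au1_positives)
```

In this toy set AU 1 is active in 2% of frames, yet half of the batch slots are drawn from its positives. In real data a frame drawn for one AU often carries other active AUs too, so per-AU counts within a batch are only approximately balanced.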


Pages: 1-14

Page count: 14
