Deep Multi-Input Multi-Stream Ordinal Model for age estimation: Based on spatial attention learning

Cited: 4
Authors
Kong, Chang [1 ,2 ]
Wang, Haitao [1 ]
Luo, Qiuming [1 ]
Mao, Rui [1 ]
Chen, Guoliang [1 ]
Affiliations
[1] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen, Peoples R China
[2] SUNIQUECo Ltd, Shenzhen, Peoples R China
Keywords
D2MO; Spatial attention; Multi-hot vector; Age estimation; Multi-input; Multi-stream;
DOI
10.1016/j.future.2022.10.009
CLC Number
TP301 [Theory, Methods];
Subject Classification Code
081202 ;
Abstract
The face aging process is non-stationary since humans mature in different ways. This property makes age estimation an attractive and challenging research topic in the computer vision community. Most previous works conventionally estimate age from the center area of the aligned face image. However, these methods ignore spatial context information and cannot attend to particular domain features due to the uncertainty in deep learning. In this work, we propose a novel Deep Multi-Input Multi-Stream Ordinal (D2MO) model for facial age estimation, which learns a deep fusion feature through a specific spatial attention mechanism. Our approach is motivated by the observation that there are universal changes for individuals during the aging process, such as hair turning white and wrinkles increasing. To focus on these spatial features, our D2MO uses four scales of receptive fields for global and contextual feature learning, while four cropped face patches are utilized for local and detailed feature extraction. Benefiting from a multi-stream CNN architecture, differentiated feature maps are learned separately through each branch and then aggregated by a concatenation layer. We also introduce a novel representation for the age label using a multi-hot vector, from which the final predicted age can be calculated by summing the vector. This representation casts the age estimation task as a series of binary classification subproblems, which is easier to learn and more consistent with human cognition than regressing a single age value directly. Finally, we employ a joint training loss to supervise our model to learn the ordinal ranking, label distribution and regression information simultaneously. Extensive experiments show that our D2MO model significantly outperforms other state-of-the-art age estimation methods on the MORPH II, FG-NET and UAGD datasets. (c) 2022 Elsevier B.V. All rights reserved.
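The multi-hot ordinal representation described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the maximum age `max_age` and the decoding threshold are assumptions, and the actual D2MO model couples this encoding with a joint ranking/distribution/regression loss not shown here.

```python
import numpy as np

def age_to_multihot(age, max_age=100):
    """Encode an age as an ordinal multi-hot vector.

    Entry k is 1 iff age > k, so each entry is a binary
    "older than k?" subproblem and the vector sums to the age.
    """
    v = np.zeros(max_age, dtype=np.float32)
    v[:age] = 1.0
    return v

def multihot_to_age(probs, threshold=0.5):
    """Decode predicted per-entry probabilities back to an age.

    Thresholding each binary subproblem and summing the decisions
    recovers the predicted age (threshold value is an assumption).
    """
    return int((np.asarray(probs) > threshold).sum())

# Round trip: encoding age 35 yields a vector whose sum is 35,
# and decoding that vector returns 35.
label = age_to_multihot(35)
predicted = multihot_to_age(label)
```

Summing the vector rather than taking an argmax is what makes the representation ordinal: an error in one binary subproblem shifts the prediction by only one year.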
Pages: 173-184
Page count: 12
Related Papers
50 records in total
  • [41] A Multi-Stream Approach to Mixed-Traffic Accident Recognition Using Deep Learning
    Fu, Swee Tee
    Theng, Lau Bee
    Shiong, Brian Loh Chung
    Mccarthy, Chris
    Tsun, Mark Tee Kit
    IEEE ACCESS, 2024, 12 : 185232 - 185249
  • [42] A Multi-Group Multi-Stream attribute Attention network for fine-grained zero-shot learning
    Song, Lingyun
    Shang, Xuequn
    Zhou, Ruizhi
    Liu, Jun
    Ma, Jie
    Li, Zhanhuai
    Sun, Mingxuan
    NEURAL NETWORKS, 2024, 179
  • [43] Multi-stream Gaussian Mixture Model based Facial Feature Localization
    Kumatani, Kenichi
    Ekenel, Hazim K.
    Gao, Hua
    Stiefelhagen, Rainer
    Ercil, Aytuel
    2008 IEEE 16TH SIGNAL PROCESSING, COMMUNICATION AND APPLICATIONS CONFERENCE, VOLS 1 AND 2, 2008, : 869 - +
  • [44] Multi-stream adaptive spatial-temporal attention graph convolutional network for skeleton-based action recognition
    Yu, Lubin
    Tian, Lianfang
    Du, Qiliang
    Bhutto, Jameel Ahmed
    IET COMPUTER VISION, 2022, 16 (02) : 143 - 158
  • [45] Multi-stream GCN for Sign Language Recognition Based on Asymmetric Convolution Channel Attention
    Liu, Yuhong
    Lu, Fei
    Cheng, Xianpeng
    Yuan, Ying
    Tian, Guohui
    2022 IEEE 17TH CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS (ICIEA), 2022, : 614 - 619
  • [46] Multi-stream Attention-based BLSTM with Feature Segmentation for Speech Emotion Recognition
    Chiba, Yuya
    Nose, Takashi
    Ito, Akinori
    INTERSPEECH 2020, 2020, : 3301 - 3305
  • [47] Multi-task multi-scale attention learning-based facial age estimation
    Shi, Chaojun
    Zhao, Shiwei
    Zhang, Ke
    Feng, Xiaohan
    IET SIGNAL PROCESSING, 2023, 17 (02)
  • [48] A Comparative Study of Multiple Deep Learning Models Based on Multi-Input Resolution for Breast Ultrasound Images
    Wu, Huaiyu
    Ye, Xiuqin
    Jiang, Yitao
    Tian, Hongtian
    Yang, Keen
    Cui, Chen
    Shi, Siyuan
    Liu, Yan
    Huang, Sijing
    Chen, Jing
    Xu, Jinfeng
    Dong, Fajin
    FRONTIERS IN ONCOLOGY, 2022, 12
  • [49] Multi-Input CNN-LSTM deep learning model for fear level classification based on EEG and peripheral physiological signals
    Masuda, Nagisa
    Yairi, Ikuko Eguchi
    FRONTIERS IN PSYCHOLOGY, 2023, 14
  • [50] Age Estimation From Facial Parts Using Compact Multi-Stream Convolutional Neural Networks
    Angeloni, Marcus de Assis
    Pereira, Rodrigo de Freitas
    Pedrini, Helio
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 3039 - 3045