Deep Multi-Input Multi-Stream Ordinal Model for age estimation: Based on spatial attention learning

Cited: 4
Authors
Kong, Chang [1 ,2 ]
Wang, Haitao [1 ]
Luo, Qiuming [1 ]
Mao, Rui [1 ]
Chen, Guoliang [1 ]
Affiliations
[1] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen, Peoples R China
[2] SUNIQUECo Ltd, Shenzhen, Peoples R China
Keywords
D2MO; Spatial attention; Multi-hot vector; Age estimation; Multi-input; Multi-stream;
DOI
10.1016/j.future.2022.10.009
CLC Number
TP301 [Theory, Methods];
Subject Classification Code
081202 ;
Abstract
The face aging process is non-stationary since humans mature in different ways. This property makes age estimation an attractive and challenging research topic in the computer vision community. Most previous works conventionally estimate age from the center area of the aligned face image. However, these methods ignore spatial context information and cannot attend to particular domain features due to the uncertainty in deep learning. In this work, we propose a novel Deep Multi-Input Multi-Stream Ordinal (D2MO) model for facial age estimation, which learns a deep fusion feature through a specific spatial attention mechanism. Our approach is motivated by the observation that there are universal changes for individuals during the aging process, such as hair turning white and wrinkles increasing. To focus on these spatial features, our D2MO uses four scales of receptive fields for global and contextual feature learning, while four cropped face patches are utilized for local and detailed feature extraction. Benefiting from a multi-stream CNN architecture, differentiated feature maps are learned separately through each branch and then aggregated by a concatenation layer. We also introduce a novel representation for the age label using a multi-hot vector, from which the final predicted age can be calculated by summing the vector. This representation casts the age estimation task as a series of binary classification subproblems, which is easier to learn and more consistent with human cognition than regressing a single age value directly. Finally, we employ a joint training loss to supervise our model to learn the ordinal ranking, label distribution and regression information simultaneously. Extensive experiments show that our D2MO model significantly outperforms other state-of-the-art age estimation methods on the MORPH II, FG-NET and UAGD datasets. (c) 2022 Elsevier B.V. All rights reserved.
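The multi-hot ordinal representation described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the maximum age `max_age` and the decoding threshold are assumptions, and the actual D2MO model couples this encoding with a joint ranking/distribution/regression loss not shown here.

```python
import numpy as np

def age_to_multihot(age, max_age=100):
    """Encode an age as an ordinal multi-hot vector.

    Entry k is 1 iff age > k, so each entry is a binary
    "older than k?" subproblem and the vector sums to the age.
    """
    v = np.zeros(max_age, dtype=np.float32)
    v[:age] = 1.0
    return v

def multihot_to_age(probs, threshold=0.5):
    """Decode predicted per-entry probabilities back to an age.

    Thresholding each binary subproblem and summing the decisions
    recovers the predicted age (threshold value is an assumption).
    """
    return int((np.asarray(probs) > threshold).sum())

# Round trip: encoding age 35 yields a vector whose sum is 35,
# and decoding that vector returns 35.
label = age_to_multihot(35)
predicted = multihot_to_age(label)
```

Summing the vector rather than taking an argmax is what makes the representation ordinal: an error in one binary subproblem shifts the prediction by only one year.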
Pages: 173-184
Page count: 12
Related Papers
50 records in total
  • [41] A Multi-Stream Approach to Mixed-Traffic Accident Recognition Using Deep Learning
    Fu, Swee Tee
    Theng, Lau Bee
    Shiong, Brian Loh Chung
    Mccarthy, Chris
    Tsun, Mark Tee Kit
    IEEE ACCESS, 2024, 12 : 185232 - 185249
  • [42] A Multi-Group Multi-Stream attribute Attention network for fine-grained zero-shot learning
    Song, Lingyun
    Shang, Xuequn
    Zhou, Ruizhi
    Liu, Jun
    Ma, Jie
    Li, Zhanhuai
    Sun, Mingxuan
    NEURAL NETWORKS, 2024, 179
  • [43] Multi-stream Gaussian Mixture Model based Facial Feature Localization
    Kumatani, Kenichi
    Ekenel, Hazim K.
    Gao, Hua
    Stiefelhagen, Rainer
    Ercil, Aytuel
    2008 IEEE 16TH SIGNAL PROCESSING, COMMUNICATION AND APPLICATIONS CONFERENCE, VOLS 1 AND 2, 2008, : 869 - +
  • [44] Multi-stream adaptive spatial-temporal attention graph convolutional network for skeleton-based action recognition
    Yu, Lubin
    Tian, Lianfang
    Du, Qiliang
    Bhutto, Jameel Ahmed
    IET COMPUTER VISION, 2022, 16 (02) : 143 - 158
  • [45] Multi-stream GCN for Sign Language Recognition Based on Asymmetric Convolution Channel Attention
    Liu, Yuhong
    Lu, Fei
    Cheng, Xianpeng
    Yuan, Ying
    Tian, Guohui
    2022 IEEE 17TH CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS (ICIEA), 2022, : 614 - 619
  • [46] Multi-stream Attention-based BLSTM with Feature Segmentation for Speech Emotion Recognition
    Chiba, Yuya
    Nose, Takashi
    Ito, Akinori
    INTERSPEECH 2020, 2020, : 3301 - 3305
  • [47] Multi-task multi-scale attention learning-based facial age estimation
    Shi, Chaojun
    Zhao, Shiwei
    Zhang, Ke
    Feng, Xiaohan
    IET SIGNAL PROCESSING, 2023, 17 (02)
  • [48] A Comparative Study of Multiple Deep Learning Models Based on Multi-Input Resolution for Breast Ultrasound Images
    Wu, Huaiyu
    Ye, Xiuqin
    Jiang, Yitao
    Tian, Hongtian
    Yang, Keen
    Cui, Chen
    Shi, Siyuan
    Liu, Yan
    Huang, Sijing
    Chen, Jing
    Xu, Jinfeng
    Dong, Fajin
    FRONTIERS IN ONCOLOGY, 2022, 12
  • [49] Multi-Input CNN-LSTM deep learning model for fear level classification based on EEG and peripheral physiological signals
    Masuda, Nagisa
    Yairi, Ikuko Eguchi
    FRONTIERS IN PSYCHOLOGY, 2023, 14
  • [50] Age Estimation From Facial Parts Using Compact Multi-Stream Convolutional Neural Networks
    Angeloni, Marcus de Assis
    Pereira, Rodrigo de Freitas
    Pedrini, Helio
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 3039 - 3045