Deep Multi-Input Multi-Stream Ordinal Model for age estimation: Based on spatial attention learning

Cited by: 4
Authors
Kong, Chang [1 ,2 ]
Wang, Haitao [1 ]
Luo, Qiuming [1 ]
Mao, Rui [1 ]
Chen, Guoliang [1 ]
Affiliations
[1] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen, Peoples R China
[2] SUNIQUECo Ltd, Shenzhen, Peoples R China
Keywords
D2MO; Spatial attention; Multi-hot vector; Age estimation; Multi-input; Multi-stream;
DOI
10.1016/j.future.2022.10.009
Chinese Library Classification
TP301 [Theory and Methods];
Discipline Code
081202;
Abstract
The face aging process is non-stationary since humans mature in different ways. This property makes age estimation an attractive and challenging research topic in the computer vision community. Most previous work conventionally estimates age from the center area of the aligned face image. However, these methods ignore spatial context information and cannot attend to particular domain features due to the uncertainty in deep learning. In this work, we propose a novel Deep Multi-Input Multi-Stream Ordinal (D2MO) Model for facial age estimation, which learns a deep fusion feature through a specific spatial attention mechanism. Our approach is motivated by the observation that individuals undergo some universal changes during the aging process, such as hair turning white and wrinkles increasing. In order to focus on these spatial features, our D2MO uses four scales of receptive fields for global and contextual feature learning, while four cropped face patches are utilized for local and detailed feature extraction. Benefiting from a multi-stream CNN architecture, differentiated feature maps are learned separately through each branch and then aggregated by a concatenation layer. We also introduce a novel representation of the age label using a multi-hot vector, so the final predicted age can be calculated by summing the vector. This representation casts the age estimation task as a series of binary classification subproblems, which is easier to learn and more consistent with human cognition than regressing a single age value directly. Finally, we employ a joint training loss to supervise our model to learn ordinal ranking, label distribution and regression information simultaneously. Extensive experiments show that our D2MO model significantly outperforms other state-of-the-art age estimation methods on the MORPH II, FG-NET and UAGD datasets. (c) 2022 Elsevier B.V. All rights reserved.
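As a rough illustration of the multi-hot age representation described in the abstract, the NumPy sketch below encodes an age as a vector of binary "is the age greater than k?" targets and recovers the predicted age by summing the thresholded outputs. The age range of 0-100 and the 0.5 decision threshold are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch of a multi-hot ordinal age encoding: each entry k answers the
# binary subproblem "is the true age greater than k?", and the predicted age is
# obtained by summing the binary decisions. Assumed range and threshold only.
import numpy as np

MAX_AGE = 100  # assumed upper bound of the age range (not from the paper)

def age_to_multi_hot(age: int, max_age: int = MAX_AGE) -> np.ndarray:
    """Encode an age as a multi-hot vector of 'age > k' targets, k = 0..max_age-1."""
    ks = np.arange(max_age)
    return (age > ks).astype(np.float32)

def multi_hot_to_age(probs: np.ndarray, threshold: float = 0.5) -> int:
    """Decode per-threshold probabilities by summing the thresholded decisions."""
    return int((probs > threshold).sum())

# Example: age 37 -> the first 37 entries are 1, the rest 0; summing recovers 37.
vec = age_to_multi_hot(37)
assert multi_hot_to_age(vec) == 37
```

In this formulation, each output unit of the network solves one ordinal binary subproblem, which is what allows the final age to be read off as a simple sum of the vector.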
Pages: 173-184
Page count: 12
Related Papers
50 records in total
  • [1] Monocular Depth Estimation Using an Integrated Model with Multi-task Learning and Multi-stream
    Takamine, Michiru
    Endo, Satoshi
    Transactions of the Japanese Society for Artificial Intelligence, 2021, 36 (05): 1 - 9
  • [2] A Multi-stream Deep Learning Model for EEG-based Depression Identification
    Wu, Hao
    Liu, Jiyao
    Proceedings - 2022 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2022, 2022, : 2029 - 2034
  • [3] MUGGLE: MUlti-Stream Group Gaze Learning and Estimation
    Zhuang, Ning
    Ni, Bingbing
    Xu, Yi
    Yang, Xiaokang
    Zhang, Wenjun
    Li, Zefan
    Gao, Wen
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2020, 30 (10) : 3637 - 3650
  • [4] Multi-Input Deep Learning Based FMCW Radar Signal Classification
    Cha, Daewoong
    Jeong, Sohee
    Yoo, Minwoo
    Oh, Jiyong
    Han, Dongseog
    ELECTRONICS, 2021, 10 (10)
  • [5] Multi-Stream Single Network: Efficient Compressed Video Action Recognition With a Single Multi-Input Multi-Output Network
    Terao, Hayato
    Noguchi, Wataru
    Iizuka, Hiroyuki
    Yamamoto, Masahito
    IEEE ACCESS, 2024, 12 : 20983 - 20997
  • [6] Multi-stream Deep Learning Framework for Automated Presentation Assessment
    Li, Junnan
    Wong, Yongkang
    Kankanhalli, Mohan S.
    PROCEEDINGS OF 2016 IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA (ISM), 2016, : 222 - 225
  • [7] Multi-Stream Deep Similarity Learning Networks for Visual Tracking
    Li, Kunpeng
    Kong, Yu
    Fu, Yun
    PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 2166 - 2172
  • [8] Multimodal Multi-stream Deep Learning for Egocentric Activity Recognition
    Song, Sibo
    Chandrasekhar, Vijay
    Mandal, Bappaditya
    Li, Liyuan
    Lim, Joo-Hwee
    Babu, Giduthuri Sateesh
    San, Phyo Phyo
    Cheung, Ngai-Man
    PROCEEDINGS OF 29TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, (CVPRW 2016), 2016, : 378 - 385
  • [9] A spatiotemporal multi-stream learning framework based on attention mechanism for automatic modulation recognition
    Wang, Xu
    Liu, Dejun
    Zhang, Yuhao
    Li, Yang
    Wu, Shiwei
    DIGITAL SIGNAL PROCESSING, 2022, 130
  • [10] Multi-Input Deep Learning Model with RGB and Hyperspectral Imaging for Banana Grading
    Mesa, Armacheska Rivero
    Chiang, John Y.
    AGRICULTURE-BASEL, 2021, 11 (08):