End-to-end face parsing via interlinked convolutional neural networks

Cited by: 22
Authors
Yin, Zi [1]
Yiu, Valentin [2, 3]
Hu, Xiaolin [2]
Tang, Liang [1]
Affiliations
[1] Beijing Forestry Univ, Sch Technol, Beijing 100083, Peoples R China
[2] Tsinghua Univ, Beijing Natl Res Ctr Informat Sci & Technol, State Key Lab Intelligent Technol & Syst, Inst Artificial Intelligence,THBI,Dept Comp Sci &, Beijing 100084, Peoples R China
[3] Cent Supelec, F-91190 Gif Sur Yvette, France
Funding
National Natural Science Foundation of China
Keywords
STN-iCNN; Face parsing; End-to-end
DOI
10.1007/s11571-020-09615-4
Chinese Library Classification number
Q189 [Neuroscience]
Discipline classification code
071006
Abstract
Face parsing is an important computer vision task that requires accurate pixel-level segmentation of facial parts (such as the eyes, nose, and mouth), providing a basis for further face analysis, modification, and other applications. The interlinked convolutional neural network (iCNN) has been shown to be an effective two-stage model for face parsing. However, the two stages of the original iCNN were trained separately, which limits its performance. To solve this problem, we introduce a simple, end-to-end face parsing framework, STN-aided iCNN (STN-iCNN), which extends the iCNN by adding a Spatial Transformer Network (STN) between the two isolated stages. The STN provides a trainable connection within the original two-stage iCNN pipeline, making end-to-end joint training possible. Moreover, as a by-product, the STN also provides more precise cropped parts than the original cropper. Owing to these two advantages, our approach significantly improves the accuracy of the original model. Our model achieved competitive performance on the Helen dataset, the standard face parsing dataset. It also achieved superior performance on the CelebAMask-HQ dataset, indicating good generalization. Our code has been released at https://github.com/aod321/STN-iCNN.
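The "trainable connection" described in the abstract rests on the sampling step of a spatial transformer: a crop is expressed as an affine map plus bilinear interpolation, so the cropped patch varies smoothly with the crop parameters. The following is a minimal pure-Python sketch of that sampling step only. It is not taken from the released code; the function names `affine_crop` and `bilinear`, the normalized [-1, 1] coordinate convention, and the zero padding outside the image are all assumptions made for illustration.

```python
import math


def bilinear(image, px, py):
    """Bilinearly interpolate `image` at the real-valued pixel position (px, py).

    Positions outside the image contribute zero (assumed zero padding).
    """
    in_h, in_w = len(image), len(image[0])
    x0, y0 = math.floor(px), math.floor(py)
    wx, wy = px - x0, py - y0  # fractional offsets inside the pixel cell

    def at(y, x):
        if 0 <= y < in_h and 0 <= x < in_w:
            return image[y][x]
        return 0.0

    return ((1 - wx) * (1 - wy) * at(y0, x0)
            + wx * (1 - wy) * at(y0, x0 + 1)
            + (1 - wx) * wy * at(y0 + 1, x0)
            + wx * wy * at(y0 + 1, x0 + 1))


def affine_crop(image, theta, out_h, out_w):
    """Sample an out_h x out_w patch from `image` through a 2x3 affine matrix.

    `theta` maps normalized output coordinates in [-1, 1] to normalized
    input coordinates. Because every output value is a weighted sum of
    input pixels, the patch is a piecewise-smooth function of `theta`.
    """
    in_h, in_w = len(image), len(image[0])
    (a, b, tx), (c, d, ty) = theta
    patch = []
    for i in range(out_h):
        yn = -1.0 + (2 * i + 1) / out_h  # normalized y of the output pixel centre
        row = []
        for j in range(out_w):
            xn = -1.0 + (2 * j + 1) / out_w  # normalized x of the output pixel centre
            # Affine map into normalized input coordinates.
            xs = a * xn + b * yn + tx
            ys = c * xn + d * yn + ty
            # Convert normalized coordinates back to input pixel positions.
            px = ((xs + 1) * in_w - 1) / 2
            py = ((ys + 1) * in_h - 1) / 2
            row.append(bilinear(image, px, py))
        patch.append(row)
    return patch


# Toy 4x4 "image" whose pixel at row y, column x holds 4*y + x.
image = [[float(4 * y + x) for x in range(4)] for y in range(4)]

# Scale by 0.5 and shift by 0.5 on both axes: selects the bottom-right quadrant.
theta = [[0.5, 0.0, 0.5],
         [0.0, 0.5, 0.5]]
print(affine_crop(image, theta, 2, 2))  # prints [[10.0, 11.0], [14.0, 15.0]]
```

In this sketch the crop location is carried by the six numbers in `theta` rather than by integer indices, which is what would let a gradient from a second-stage loss reach whatever network predicts `theta`.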
Pages: 169-179
Page count: 11