Alpha matting for portraits using encoder-decoder models

Authors
Akshat Srivastava
Srivatsav Raghu
Abitha K Thyagarajan
Jayasri Vaidyaraman
Mohanaprasad Kothandaraman
Pavan Sudheendra
Avinav Goel
Affiliations
[1] VIT University, School of Electronics Engineering (SENSE)
[2] Samsung R&D Institute
Keywords
Alpha matting; Image segmentation; Deep learning; Encoder-decoder models
DOI
Not available
Abstract
Image matting is a technique for separating the foreground of an image from its background. In the past, classical algorithms based on sampling, propagation, or a combination of the two were used to perform image matting; however, most of these produce poor results on images with complex backgrounds. They are also unable to accurately extract foregrounds composed of thin objects. In this context, deep learning has become an increasingly popular approach to the image matting problem. In this paper, an encoder-decoder model for alpha matting of human portraits using deep learning is proposed. The model comprises two parts. The first is an encoder-decoder model, a deep convolutional network with 11 convolutional layers and 5 max-pooling layers in the encoder stage and 11 convolutional layers and 5 unpooling layers in the decoder stage. This portion of the model takes the image and trimap as input and produces a coarse alpha matte as output. The second part is a refinement stage with four convolutional layers, which refines the coarse alpha matte produced by the encoder-decoder stage to obtain a high-accuracy alpha matte. The model was trained on 43,100 images. When tested on the alphamatting.com dataset, our model's output was comparable to the industry standard, yielding an average MSE of 0.023 and an average SAD loss of 66.5.
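The two error measures quoted in the abstract, MSE and SAD, can be illustrated with a minimal NumPy sketch. This is not the authors' evaluation code. It assumes alpha values scaled to [0, 1], and the function name `matting_errors` and the optional `unknown_mask` parameter are hypothetical. The abstract does not state how the reported figures were normalized or whether they were restricted to the trimap's unknown region, so the numbers produced here are not directly comparable to 0.023 and 66.5.

```python
import numpy as np


def matting_errors(pred_alpha, true_alpha, unknown_mask=None):
    """Return (MSE, SAD) between two alpha mattes with values in [0, 1].

    unknown_mask is an optional boolean array (hypothetical parameter) that
    restricts the comparison to selected pixels, e.g. a trimap's unknown region.
    """
    pred = np.asarray(pred_alpha, dtype=np.float64)
    true = np.asarray(true_alpha, dtype=np.float64)
    if pred.shape != true.shape:
        raise ValueError("alpha mattes must have the same shape")
    if unknown_mask is None:
        mask = np.ones(pred.shape, dtype=bool)  # compare every pixel
    else:
        mask = np.asarray(unknown_mask, dtype=bool)
    diff = pred[mask] - true[mask]
    if diff.size == 0:
        raise ValueError("mask selects no pixels")
    mse = float(np.mean(diff ** 2))    # mean squared error per compared pixel
    sad = float(np.sum(np.abs(diff)))  # sum of absolute differences
    return mse, sad


# Toy 2x2 mattes (made-up values, not data from the paper).
pred = np.array([[0.0, 0.5], [1.0, 1.0]])
true = np.array([[0.0, 1.0], [1.0, 0.5]])
mse, sad = matting_errors(pred, true)
```

MSE averages over the compared pixels while SAD sums over them, so SAD grows with image size and MSE does not. That is one reason the two figures in an evaluation can differ by orders of magnitude.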
Pages: 14517-14528
Page count: 11