Alpha matting for portraits using encoder-decoder models

Authors
Akshat Srivastava
Srivatsav Raghu
Abitha K Thyagarajan
Jayasri Vaidyaraman
Mohanaprasad Kothandaraman
Pavan Sudheendra
Avinav Goel
Affiliations
[1] VIT University, School of Electronics Engineering (SENSE)
[2] Samsung R&D Institute
Keywords
Alpha matting; Image segmentation; Deep learning; Encoder-decoder models
DOI
Not available
Abstract
Image matting is a technique for separating the foreground of an image from its background. In the past, classical algorithms based on sampling, propagation, or a combination of the two were used to perform image matting; however, most of these produce poor results on images with complex backgrounds. They also fail to extract accurately foregrounds that are composed of thin objects. In this context, the use of deep learning to solve the image matting problem has gained increasing popularity. In this paper, an encoder-decoder model for alpha matting of human portraits using deep learning is proposed. The model comprises two parts. The first is an encoder-decoder model, a deep convolutional network with 11 convolutional layers and 5 max-pooling layers in the encoder stage and 11 convolutional layers and 5 unpooling layers in the decoder stage. This part of the model takes the image and trimap as input and produces a coarse alpha matte as output. The second part is a refinement stage with four convolutional layers, which refines the coarse alpha matte produced by the encoder-decoder stage to obtain an alpha matte of high accuracy. The model was trained using 43,100 images. When tested on the alphamatting.com dataset, our model’s output was comparable to the industry standard, yielding an average MSE of 0.023 and an average SAD loss of 66.5.
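The two error measures quoted in the abstract, MSE and SAD, can be illustrated with a minimal sketch. It assumes alpha mattes stored as nested lists of floats in [0, 1] and computes both measures over every pixel. The abstract does not say whether the reported figures were restricted to the unknown region of the trimap or rescaled, so the function names `mse` and `sad` and the toy values below are hypothetical and are not expected to reproduce the reported numbers.

```python
def _pixel_pairs(pred, truth):
    """Yield (predicted, ground-truth) alpha values for two same-sized mattes."""
    if len(pred) != len(truth):
        raise ValueError("mattes must have the same number of rows")
    for pred_row, truth_row in zip(pred, truth):
        if len(pred_row) != len(truth_row):
            raise ValueError("mattes must have the same number of columns")
        yield from zip(pred_row, truth_row)


def mse(pred, truth):
    """Mean squared error between two alpha mattes, averaged over all pixels."""
    squared = [(p - t) ** 2 for p, t in _pixel_pairs(pred, truth)]
    return sum(squared) / len(squared)


def sad(pred, truth):
    """Sum of absolute differences between two alpha mattes over all pixels."""
    return sum(abs(p - t) for p, t in _pixel_pairs(pred, truth))


# Toy 2x2 mattes (illustrative values only, not data from the paper).
predicted = [[0.0, 0.5],
             [1.0, 1.0]]
ground_truth = [[0.0, 1.0],
                [1.0, 0.5]]

print("MSE:", mse(predicted, ground_truth))
print("SAD:", sad(predicted, ground_truth))
```

MSE is normalized by the pixel count while SAD is not, so SAD grows with image size. This is one reason the two values in the abstract sit on very different scales.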
Pages: 14517-14528
Page count: 11