Alpha matting for portraits using encoder-decoder models

Cited by: 0

Authors:

Akshat Srivastava

Srivatsav Raghu

Abitha K Thyagarajan

Jayasri Vaidyaraman

Mohanaprasad Kothandaraman

Pavan Sudheendra

Avinav Goel

Affiliations:

[1] VIT University, School of Electronics Engineering (SENSE)

[2] Samsung R&D Institute

Source:

Multimedia Tools and Applications | 2022 / Vol. 81

Keywords:

Alpha matting; Image segmentation; Deep learning; Encoder-decoder models

DOI:

Not available

Abstract:

Image matting is a technique used to separate the foreground from the background of a given image. In the past, classical algorithms based on sampling, propagation, or a combination of the two were used to perform image matting; however, most of these produce poor results on images with complex backgrounds. They also cannot accurately extract foregrounds that consist of thin objects. In this context, deep learning has become an increasingly popular approach to the image matting problem. In this paper, an encoder-decoder model for alpha matting of human portraits using deep learning is proposed. The model comprises two parts. The first is an encoder-decoder stage, a deep convolutional network with 11 convolutional layers and 5 max-pooling layers in the encoder and 11 convolutional layers and 5 unpooling layers in the decoder. This stage takes the image and trimap as input and produces a coarse alpha matte as output. The second part is a refinement stage with four convolutional layers, which refines the coarse alpha matte from the encoder-decoder stage to obtain an alpha matte of high accuracy. The model was trained using 43,100 images. When tested on the alphamatting.com dataset, our model's output was comparable to the industry standard, yielding an average MSE of 0.023 and an average SAD loss of 66.5.
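
The abstract reports two error measures, MSE and SAD, without giving their definitions. The following is a minimal sketch of how such measures are commonly computed for alpha mattes, not the authors' evaluation code. It assumes mattes are stored as nested lists of alpha values in [0, 1]; the function names `sad` and `mse` and the optional `mask` argument (for example, to restrict scoring to the unknown region of a trimap) are hypothetical choices made for illustration. Published SAD figures often use a different scaling, so values from this sketch are not directly comparable to the 66.5 quoted above.

```python
from typing import Iterator, Optional, Sequence, Tuple

# A matte is a 2D grid (rows of pixels) of alpha values in [0, 1].
# This representation is an assumption made for the sketch.
Grid = Sequence[Sequence[float]]


def _pairs(pred: Grid, truth: Grid,
           mask: Optional[Grid] = None) -> Iterator[Tuple[float, float]]:
    """Yield (predicted, true) alpha pairs, skipping pixels where mask is falsy."""
    for r, (p_row, t_row) in enumerate(zip(pred, truth)):
        for c, (p, t) in enumerate(zip(p_row, t_row)):
            if mask is None or mask[r][c]:
                yield p, t


def sad(pred: Grid, truth: Grid, mask: Optional[Grid] = None) -> float:
    """Sum of absolute differences between predicted and true alpha."""
    return sum(abs(p - t) for p, t in _pairs(pred, truth, mask))


def mse(pred: Grid, truth: Grid, mask: Optional[Grid] = None) -> float:
    """Mean squared error between predicted and true alpha."""
    errors = [(p - t) ** 2 for p, t in _pairs(pred, truth, mask)]
    if not errors:
        raise ValueError("no pixels selected for evaluation")
    return sum(errors) / len(errors)


# Toy 2x2 example (made-up values, not data from the paper).
pred = [[0.0, 0.5], [1.0, 0.75]]
truth = [[0.0, 1.0], [1.0, 0.25]]
unknown = [[0, 1], [0, 1]]  # hypothetical mask: score the right column only

print(sad(pred, truth))           # 1.0
print(mse(pred, truth))           # 0.125
print(mse(pred, truth, unknown))  # 0.25
```

Restricting the score to a mask changes the MSE (fewer pixels in the denominator) but not necessarily the SAD, which is one reason the two measures are usually reported together.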

Pages: 14517-14528

Page count: 11