Multitask Deep Neural Networks for Tele-Wide Stereo Matching

Cited by: 2
Authors
El-Khamy, Mostafa [1,2]
Ren, Haoyu [1]
Du, Xianzhi [1]
Lee, Jungwon [1]
Affiliations
[1] Samsung Semicond Inc SSI, DSA SOC Res & Dev, San Diego, CA 92121 USA
[2] Alexandria Univ, Fac Engn, Alexandria 21544, Egypt
Keywords
Estimation; Cameras; Optical imaging; Neural networks; Feature extraction; Optical sensors; Lenses; Stereo disparity; single-image depth estimation; stereo matching; tele-wide disparity; deep network fusion; CLASSIFIER FUSION;
DOI
10.1109/ACCESS.2020.3029085
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In this article, we propose deep learning solutions for estimating the real-world depth of elements in a scene captured by two cameras with different fields of view. We consider a realistic smartphone scenario, where the first field of view (FOV) is a wide FOV with 1x the optical zoom, and the second FOV is contained in the first FOV and is captured by a tele zoom lens with 2x the optical zoom. We refer to the problem of estimating the depth for all elements in the union of the FOVs, which corresponds to the Wide FOV, as 'tele-wide stereo matching'. Traditional approaches can only estimate the disparity or depth in the overlapped FOV, corresponding to the Tele FOV, using stereo matching algorithms. To benchmark this novel problem, we introduce a single-image inverse-depth estimation (SIDE) solution to estimate the disparity from the image corresponding to the union Wide FOV only. We also design a novel multitask tele-wide stereo matching deep neural network (MT-TW-SMNet), which is the first to combine the stereo matching and the single-image depth tasks in one network. Moreover, we propose multiple methods for fusing the above networks. For example, we use input feature fusion, which takes the disparity estimated by stereo matching as an additional input feature for SIDE. We also design networks for decision fusion, which deploy a stacked hourglass (SHG) network for fusion and refinement of the disparity maps from both the SIDE network and the MT-TW-SMNet. These fusion schemes significantly improve the accuracy. Experimental results on the KITTI and SceneFlow datasets demonstrate that our proposed approaches provide a reasonable solution to the tele-wide stereo matching problem. We demonstrate the effectiveness of our solutions in generating the Bokeh effect on the full Wide FOV.
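As a rough illustration of the geometry described in the abstract, the sketch below computes which part of the Wide image a 2x Tele lens would cover, and then combines two disparity maps with a hard switch: stereo-matching disparity inside the Tele region, single-image (SIDE) disparity elsewhere. This is not the paper's method; the learned input-feature and decision fusion networks are replaced here by a naive baseline. The function names `tele_region_in_wide` and `naive_fuse` are hypothetical, and the code assumes idealised cameras that share an optical centre and axis, with both disparity maps already resampled to the Wide image grid.

```python
def tele_region_in_wide(width, height, zoom=2.0):
    """Return (left, top, right, bottom) of the central crop of the Wide image
    that a Tele lens with `zoom` times the optical zoom would cover.

    Assumption: both cameras share an optical centre and axis, so the Tele
    FOV is a centred crop of size (width / zoom, height / zoom).
    """
    crop_w = int(round(width / zoom))
    crop_h = int(round(height / zoom))
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return left, top, left + crop_w, top + crop_h


def naive_fuse(stereo_disp, side_disp, box):
    """Hard-switch fusion (a naive baseline, not the paper's learned fusion).

    Inside `box` the stereo-matching disparity is used; outside it, where no
    stereo correspondence exists, the single-image estimate is kept.
    Both inputs are lists of rows on the Wide image grid.
    """
    left, top, right, bottom = box
    fused = [row[:] for row in side_disp]  # copy so inputs stay untouched
    for y in range(top, bottom):
        for x in range(left, right):
            fused[y][x] = stereo_disp[y][x]
    return fused


if __name__ == "__main__":
    # Toy 8x4 maps with constant values, purely for illustration.
    side = [[1.0] * 8 for _ in range(4)]
    stereo = [[5.0] * 8 for _ in range(4)]
    box = tele_region_in_wide(8, 4, zoom=2.0)
    for row in naive_fuse(stereo, side, box):
        print(row)
```

A hard switch like this leaves a visible seam at the Tele boundary, which is one reason a learned fusion and refinement stage, as the abstract describes, would be preferable in practice.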
Pages: 184383-184398
Number of pages: 16
Related papers
50 in total
  • [31] Quantitative Toxicity Prediction Using Topology Based Multitask Deep Neural Networks
    Wu, Kedi
    Wei, Guo-Wei
    JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2018, 58 (02) : 520 - 531
  • [32] SNR-Invariant Multitask Deep Neural Networks for Robust Speaker Verification
    Yao, Qi
    Mak, Man-Wai
    IEEE SIGNAL PROCESSING LETTERS, 2018, 25 (11) : 1670 - 1674
  • [33] Multitask Learning of Deep Neural Networks for Low-Resource Speech Recognition
    Chen, Dongpeng
    Mak, Brian Kan-Wing
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, 23 (07) : 1172 - 1183
  • [34] Demystifying Multitask Deep Neural Networks for Quantitative Structure-Activity Relationships
    Xu, Yuting
    Ma, Junshui
    Liaw, Andy
    Sheridan, Robert P.
    Svetnik, Vladimir
    JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2017, 57 (10) : 2490 - 2504
  • [35] Multitask learning deep neural networks to combine revealed and stated preference data
    Wang, Shenhao
    Wang, Qingyi
    Zhao, Jinhua
    JOURNAL OF CHOICE MODELLING, 2020, 37
  • [36] Convolutional neural network based deep conditional random fields for stereo matching
    Wang, Zhi
    Zhu, Shiqiang
    Li, Yuehua
    Cui, Zhengzhe
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2016, 40 : 739 - 750
  • [37] Positioning with Map Matching using Deep Neural Networks
    Bergkvist, Hannes
    Davidsson, Paul
    Exner, Peter
    PROCEEDINGS OF THE 17TH EAI INTERNATIONAL CONFERENCE ON MOBILE AND UBIQUITOUS SYSTEMS: COMPUTING, NETWORKING AND SERVICES (MOBIQUITOUS 2020), 2021, : 177 - 183
  • [38] A Survey on Deep Stereo Matching in the Twenties
    Tosi, Fabio
    Bartolomei, Luca
    Poggi, Matteo
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2025,
  • [39] Efficient Deep Learning for Stereo Matching
    Luo, Wenjie
    Schwing, Alexander G.
    Urtasun, Raquel
    2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 5695 - 5703
  • [40] Deep Laparoscopic Stereo Matching with Transformers
    Cheng, Xuelian
    Zhong, Yiran
    Harandi, Mehrtash
    Drummond, Tom
    Wang, Zhiyong
    Ge, Zongyuan
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2022, PT VII, 2022, 13437 : 464 - 474