SA-MVSNet: Self-attention-based multi-view stereo network for 3D reconstruction of images with weak texture

Cited by: 3
Authors
Yang, Ronghao [1]
Miao, Wang [1]
Zhang, Zhenxin [2, 3]
Liu, Zhenlong [1]
Li, Mubai [2, 3]
Lin, Bin [1]
Affiliations
[1] Chengdu Univ Technol, Coll Earth Sci, Chengdu 610059, Sichuan, Peoples R China
[2] Capital Normal Univ, Key Lab 3D Informat Acquisit & Applicat, MOE, Beijing 100048, Peoples R China
[3] Capital Normal Univ, Coll Resource Environm & Tourism, Beijing 100048, Peoples R China
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China
Keywords
Multi-view stereo; Depth estimation; Self-attention; Transformer; Weak texture; Adaptive propagation;
DOI
10.1016/j.engappai.2023.107800
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Multi-view stereo (MVS) reconstruction is a key task in image-based 3D reconstruction, and deep learning-based methods can achieve better results than traditional algorithms. However, most current deep learning-based MVS methods extract image features with convolutional neural networks (CNNs), which cannot aggregate long-range context or capture robust global information. In addition, when depth maps are fused into point clouds, the confidence filters discard low-confidence depth values in weak texture areas. Together, these problems lead to low completeness in the 3D reconstruction of weak-texture and texture-less areas. To address them, this paper proposes SA-MVSNet, which builds on PatchmatchNet and adds a self-attention mechanism. First, we design a coarse-to-fine network framework for depth map estimation. In the feature extraction network, a pyramid-structured module based on the Swin Transformer block replaces the original Feature Pyramid Network (FPN), and a global self-attention mechanism enhances the self-correlation between weak texture areas. Second, we propose a self-attention-based adaptive propagation module (SA-AP), which applies self-attention within the depth propagation window to obtain the relative weights between the current pixel and the other pixels in the window, and then adaptively samples the depth values of neighbors on the same surface for propagation. Experiments show that SA-MVSNet significantly improves the completeness of 3D reconstruction for images with weak texture on the DTU (provided by the Technical University of Denmark), BlendedMVS, and Tanks and Temples datasets. Our code is available at https://github.com/miaowang525/SA-MVSNet.
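The idea behind attention-weighted depth propagation can be illustrated with a minimal NumPy sketch. It is not the authors' SA-AP implementation (see the repository linked above for that): it assumes plain single-head dot-product attention with no learned projections, a square window, and top-k neighbor selection, and the function names `window_attention_weights` and `propagate_candidates` are hypothetical.

```python
import numpy as np

def window_attention_weights(features, y, x, radius=1):
    """Softmax dot-product attention between pixel (y, x) and every pixel
    in its (2*radius+1)^2 window. features has shape (H, W, C).
    Assumption: no learned query/key projections, unlike a real network."""
    H, W, C = features.shape
    y0, y1 = max(0, y - radius), min(H, y + radius + 1)
    x0, x1 = max(0, x - radius), min(W, x + radius + 1)
    query = features[y, x]                             # (C,)
    keys = features[y0:y1, x0:x1].reshape(-1, C)       # (N, C), row-major
    scores = keys @ query / np.sqrt(C)                 # scaled dot product
    scores -= scores.max()                             # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum()
    coords = [(yy, xx) for yy in range(y0, y1) for xx in range(x0, x1)]
    return coords, weights

def propagate_candidates(depth, features, y, x, radius=1, k=4):
    """Return the depths of the k neighbors with the highest attention
    weight, i.e. those whose features most resemble the current pixel.
    Assumption: top-k selection stands in for adaptive sampling."""
    coords, weights = window_attention_weights(features, y, x, radius)
    order = np.argsort(-weights, kind="stable")
    picked = [coords[i] for i in order if coords[i] != (y, x)][:k]
    return [float(depth[p]) for p in picked]

# Toy scene: two surfaces with distinct features and depths.
features = np.zeros((3, 4, 2))
features[:, :2, 0] = 4.0   # left surface
features[:, 2:, 1] = 4.0   # right surface
depth = np.tile(np.array([1.0, 1.0, 2.0, 2.0]), (3, 1))
candidates = propagate_candidates(depth, features, y=1, x=1, k=4)
```

In this toy scene the pixel at (1, 1) lies on the left surface, so the selected candidates all come from left-surface neighbors, even though the window also covers pixels on the right surface.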
Pages: 15