3D-C2FT: Coarse-to-Fine Transformer for Multi-view 3D Reconstruction

Cited by: 4
Authors
Tiong, Leslie Ching Ow [1]
Sigmund, Dick [2]
Teoh, Andrew Beng Jin [3]
Affiliations
[1] Korea Inst Sci & Technol, Computat Sci Res Ctr, 5 Hwarang Ro 14 Gil, Seoul 02792, South Korea
[2] AIDOT Inc, 128 Beobwon Ro, Seoul 05854, South Korea
[3] Yonsei Univ, Sch Elect & Elect Engn, Seoul 120749, South Korea
Keywords
Multi-view 3D reconstruction; Coarse-to-fine transformer; Multi-scale attention
DOI
10.1007/978-3-031-26319-4_13
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Recently, the transformer model has been successfully employed for the multi-view 3D reconstruction problem. However, challenges remain in designing an attention mechanism that explores the multi-view features and exploits their relations to reinforce the encoding-decoding modules. This paper proposes a new model, the 3D coarse-to-fine transformer (3D-C2FT), which introduces a novel coarse-to-fine (C2F) attention mechanism for encoding multi-view features and rectifying defective voxel-based 3D objects. The C2F attention mechanism enables the model to learn multi-view information flow and synthesize 3D surface correction in a coarse to fine-grained manner. The proposed model is evaluated on the ShapeNet and Multi-view Real-life voxel-based datasets. Experimental results show that 3D-C2FT achieves notable results and outperforms several competing models on these datasets.
Pages: 211-227
Number of pages: 17
Related papers
50 in total
  • [41] CofiFab: Coarse-to-Fine Fabrication of Large 3D Objects
    Song, Peng
    Deng, Bailin
    Wang, Ziqi
    Dong, Zhichao
    Li, Wei
    Fu, Chi-Wing
    Liu, Ligang
    ACM TRANSACTIONS ON GRAPHICS, 2016, 35 (04):
  • [42] Multi-view 3D reconstruction and modeling of the unknown 3D scenes using genetic algorithms
    Mostafa Merras
    Abderrahim Saaidi
    Nabil El Akkad
    Khalid Satori
    Soft Computing, 2018, 22 : 6271 - 6289
  • [43] Multi-view 3D reconstruction and modeling of the unknown 3D scenes using genetic algorithms
    Merras, Mostafa
    Saaidi, Abderrahim
    El Akkad, Nabil
    Satori, Khalid
    SOFT COMPUTING, 2018, 22 (19) : 6271 - 6289
  • [44] TransPillars: Coarse-to-Fine Aggregation for Multi-Frame 3D Object Detection
    Luo, Zhipeng
    Zhang, Gongjie
    Zhou, Changqing
    Liu, Tianrui
    Lu, Shijian
    Pan, Liang
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 4219 - 4228
  • [45] A pseudo-3D coarse-to-fine architecture for 3D medical landmark detection
    Cui, Li
    Liu, Boyan
    Xu, Guikun
    Guo, Jixiang
    Tang, Wei
    He, Tao
    NEUROCOMPUTING, 2025, 614
  • [46] A Coarse-to-Fine Registration on 3D Multi-Phase Abdominal CT Images
    Yang, Shao-Di
    Zhang, Fan
    Yang, Zhen
    Yang, Xiao-Yu
    Li, Shu-Zhou
    NANOSCIENCE AND NANOTECHNOLOGY LETTERS, 2020, 12 (07) : 909 - 914
  • [47] Multi-Head Attention Refiner for Multi-View 3D Reconstruction
    Lee, Kyunghee
    Cho, Ihjoon
    Yang, Boseung
    Park, Unsang
    JOURNAL OF IMAGING, 2024, 10 (11)
  • [48] Prior-Guided Multi-View 3D Head Reconstruction
    Wang, Xueying
    Guo, Yudong
    Yang, Zhongqi
    Zhang, Juyong
    IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 4028 - 4040
  • [49] Research on Multi-View 3D Reconstruction Technology Based on SFM
    Gao, Lei
    Zhao, Yingbao
    Han, Jingchang
    Liu, Huixian
    SENSORS, 2022, 22 (12)
  • [50] Combining Photometric Normals and Multi-View Stereo for 3D Reconstruction
    Grochulla, Martin
    Thormaehlen, Thorsten
    CVMP 2015: PROCEEDINGS OF THE 12TH EUROPEAN CONFERENCE ON VISUAL MEDIA PRODUCTION, 2015,