Know Your Neighbors: Improving Single-View Reconstruction via Spatial Vision-Language Reasoning

Cited by: 1
Authors
Li, Rui [1]
Fischer, Tobias [1]
Segu, Mattia [1]
Pollefeys, Marc [1]
Van Gool, Luc [1]
Tombari, Federico [2, 3]
Affiliations
[1] Swiss Fed Inst Technol, Zurich, Switzerland
[2] Google, Mountain View, CA 94043 USA
[3] Tech Univ Munich, Munich, Germany
Keywords
DOI
10.1109/CVPR52733.2024.00940
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recovering the 3D scene geometry from a single view is a fundamental yet ill-posed problem in computer vision. While classical depth estimation methods infer only a 2.5D scene representation limited to the image plane, recent approaches based on radiance fields reconstruct a full 3D representation. However, these methods still struggle with occluded regions since inferring geometry without visual observation requires (i) semantic knowledge of the surroundings, and (ii) reasoning about spatial context. We propose KYN, a novel method for single-view scene reconstruction that reasons about semantic and spatial context to predict each point's density. We introduce a vision-language modulation module to enrich point features with fine-grained semantic information. We aggregate point representations across the scene through a language-guided spatial attention mechanism to yield per-point density predictions aware of the 3D semantic context. We show that KYN improves 3D shape recovery compared to predicting density for each 3D point in isolation. We achieve state-of-the-art results in scene and object reconstruction on KITTI-360, and show improved zero-shot generalization compared to prior work. Project page: https://ruili3.github.io/kyn.
Pages: 9848-9858
Page count: 11