Robust Depth Estimation Based on Parallax Attention for Aerial Scene Perception

Cited by: 3
Authors
Tong, Wei [1, 2]
Zhang, Miaomiao [3 ]
Zhu, Guangyu [4 ]
Xu, Xin [5 ]
Wu, Edmond Q. [3 ]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Coll Automat, Nanjing 210023, Peoples R China
[2] Nanjing Univ Posts & Telecommun, Coll Artificial Intelligence, Nanjing 210023, Peoples R China
[3] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China
[4] Beijing Jiaotong Univ China, Sch Traff & Transportat, Beijing 100044, Peoples R China
[5] Natl Univ Def Technol, Coll Intelligence Sci & Technol, Changsha 410005, Peoples R China
Keywords
Costs; Feature extraction; Estimation; Transformers; Task analysis; Convolution; Training; Disparity estimation; parallax attention; stereo matching; transformer
DOI
10.1109/TII.2024.3392270
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Given precalibrated image pairs, stereo matching aims to infer scene depth in real time, which has important research value for high-precision 3-D reconstruction of the Earth's surface, autonomous driving, and unmanned aerial vehicle (UAV) navigation. Cost-volume-based stereo matching methods construct a cascaded cost volume in a coarse-to-fine manner and apply 3-D convolution to capture feature-matching correspondences and infer the disparity map, achieving competitive performance. However, existing methods have difficulty handling jitter regions where the disparity changes, and direct disparity regression easily leads to overfitting in cost volume regularization. To alleviate these two problems, this work proposes an end-to-end disparity estimation network based on the Transformer. Its specific improvements are as follows. 1) A Transformer-based cross-view feature interaction module is introduced to realize feature interaction over global context information. 2) A parallax attention mechanism is designed to impose global geometric constraints along the epipolar line and improve the reliability of feature matching. 3) Focal loss is applied when training the disparity classification model to emphasize one-hot supervision in ambiguous regions. Comprehensive experiments on the public Sceneflow, KITTI2015, and ETH3D datasets and the aerial WHU dataset validate that the proposed work effectively enhances disparity estimation performance.
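The abstract names two ingredients, attention along the epipolar line and focal loss on a disparity classification, without giving their formulas. The following is a minimal sketch of the general idea only, not the authors' network: it assumes a rectified pair (so the epipolar line is an image row), scaled dot-product attention, and the standard focal loss form -(1 - p_t)^gamma * log(p_t). The function names, the toy features, and gamma = 2.0 are all illustrative assumptions.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def parallax_attention_row(left_row, right_row):
    # For each left pixel, attend over every right pixel on the same row
    # (assumed to be the epipolar line of a rectified pair).
    dim = len(left_row[0])
    attn = []
    for q in left_row:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in right_row]
        attn.append(softmax(scores))
    return attn

def expected_disparity(attn):
    # Disparity is x_left - x_right; take its expectation under the attention.
    return [sum(p * (x_left - x_right) for x_right, p in enumerate(row))
            for x_left, row in enumerate(attn)]

def focal_loss(probs, target_index, gamma=2.0):
    # Standard focal loss for a one-hot target; gamma is an assumed value.
    p_t = max(probs[target_index], 1e-12)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

# Toy row: the left view is the right view shifted by one pixel (disparity 1).
right_row = [[8.0, 0, 0, 0], [0, 8.0, 0, 0], [0, 0, 8.0, 0], [0, 0, 0, 8.0]]
left_row = [[0.0, 0, 0, 0]] + right_row[:3]  # left pixel 0 has no match

attn = parallax_attention_row(left_row, right_row)
disparity = expected_disparity(attn)
confident = focal_loss(attn[2], target_index=1)  # well-matched pixel
ambiguous = focal_loss(attn[0], target_index=0)  # uniform attention
print(disparity[2], confident, ambiguous)
```

In this toy case the well-matched pixel contributes almost nothing to the loss, while the pixel with uniform (ambiguous) attention dominates it, which is the behavior the abstract attributes to focal loss in ambiguous regions.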
Pages: 10761-10769
Page count: 9