Robust Depth Estimation Based on Parallax Attention for Aerial Scene Perception

Cited by: 3
Authors
Tong, Wei [1, 2]
Zhang, Miaomiao [3 ]
Zhu, Guangyu [4 ]
Xu, Xin [5 ]
Wu, Edmond Q. [3 ]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Coll Automat, Nanjing 210023, Peoples R China
[2] Nanjing Univ Posts & Telecommun, Coll Artificial Intelligence, Nanjing 210023, Peoples R China
[3] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China
[4] Beijing Jiaotong Univ China, Sch Traff & Transportat, Beijing 100044, Peoples R China
[5] Natl Univ Def Technol, Coll Intelligence Sci & Technol, Changsha 410005, Peoples R China
Keywords
Costs; Feature extraction; Estimation; Transformers; Task analysis; Convolution; Training; Disparity estimation; parallax attention; stereo matching; transformer
DOI
10.1109/TII.2024.3392270
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Given precalibrated image pairs, stereo matching aims to infer scene depth in real time, which has important research value for high-precision 3-D reconstruction of the Earth's surface, autonomous driving, and unmanned aerial vehicle (UAV) navigation. Cost-volume-based stereo matching methods construct cascaded cost volumes in a coarse-to-fine manner and apply 3-D convolution to capture feature-matching correspondences and infer the disparity map, achieving competitive performance. However, existing methods have difficulty handling jitter regions where the disparity changes, and direct disparity regression easily leads to overfitting in cost volume regularization. To alleviate these two problems, this work proposes an end-to-end disparity estimation network based on the Transformer. Its specific improvements are as follows. 1) A Transformer-based cross-view feature interaction module is introduced to exchange global context information between the two views. 2) A parallax attention mechanism is designed to impose global geometric constraints along the epipolar line, improving the reliability of feature matching. 3) Focal loss is applied to train the disparity classification model, emphasizing one-hot supervision in ambiguous regions. Comprehensive experiments on the public Sceneflow, KITTI2015, ETH3D, and aerial WHU datasets validate that the proposed method effectively improves disparity estimation performance.
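The abstract names two mechanisms without giving their form: attention restricted to the epipolar line, and focal loss over disparity classes. The sketch below is one plausible reading, not the authors' implementation. It assumes rectified image pairs (so the epipolar line of a left pixel is the same row in the right view), plain scaled dot-product scores, and the standard focal-loss form with a focusing parameter `gamma`. The function names `parallax_attention` and `focal_loss`, the tensor shapes, and the default `gamma=2.0` are all hypothetical choices made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def parallax_attention(feat_left, feat_right):
    """Row-wise attention between rectified views (illustrative assumption).

    feat_left, feat_right: arrays of shape (H, W, C).
    Returns an array of shape (H, W, W): for each left pixel (h, w),
    a distribution over the W columns of row h in the right view.
    """
    channels = feat_left.shape[-1]
    # Scores are computed only between pixels that share a row,
    # i.e. along the epipolar line of a rectified pair.
    scores = np.einsum("hwc,hvc->hwv", feat_left, feat_right)
    return softmax(scores / np.sqrt(channels), axis=-1)


def focal_loss(probs, target, gamma=2.0, eps=1e-8):
    """Focal loss for disparity classification (standard form, assumed).

    probs:  (N, D) predicted distribution over D disparity classes.
    target: (N,) integer index of the ground-truth disparity class.
    gamma:  focusing parameter; gamma = 0 reduces to cross-entropy.
    """
    p_true = probs[np.arange(len(target)), target]
    # Well-classified pixels (p_true near 1) are down-weighted.
    weight = (1.0 - p_true) ** gamma
    return float(np.mean(-weight * np.log(p_true + eps)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    left = rng.standard_normal((4, 8, 16))
    right = rng.standard_normal((4, 8, 16))
    attn = parallax_attention(left, right)
    print(attn.shape)

    # Treat each attention row as a distribution over candidate matches.
    probs = attn.reshape(-1, attn.shape[-1])
    target = rng.integers(0, probs.shape[1], size=probs.shape[0])
    print(focal_loss(probs, target))
```

Restricting the scores to a single row keeps the attention map at H x W x W rather than (HW) x (HW), which is the usual motivation for epipolar-constrained attention. How the paper maps attention columns to disparity classes is not stated in the abstract, so the reshaping in the usage example is only a placeholder.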
Pages: 10761-10769
Number of pages: 9