Self-Supervised Pretraining With Monocular Height Estimation for Semantic Segmentation

Cited by: 1

Authors:

Xiong, Zhitong [1]

Chen, Sining [1]

Shi, Yilei [2]

Zhu, Xiao Xiang [1,3]

Affiliations:

[1] Tech Univ Munich TUM, Chair Data Sci Earth Observat, D-80333 Munich, Germany

[2] Tech Univ Munich TUM, Sch Engn & Design, D-80333 Munich, Germany

[3] Munich Ctr Machine Learning, Chair Data Sci Earth Observat, D-80333 Munich, Germany

Source:

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2024 / Vol. 62

Keywords:

Semantics; Task analysis; Estimation; Neurons; Semantic segmentation; Data models; Buildings; Foundation models; interpretable deep learning; monocular height estimation (MHE); self-supervised pretraining;

DOI:

10.1109/TGRS.2024.3412629

Chinese Library Classification (CLC):

P3 [Geophysics]; P59 [Geochemistry];

Discipline classification codes:

0708; 070902;

Abstract:

Monocular height estimation (MHE) is key for generating 3-D city models, which are essential for swift disaster response. Moving beyond the traditional focus on performance enhancement, our study probes the interpretability of MHE networks. We discover that neurons within MHE models demonstrate selectivity for both height and semantic classes. This insight sheds light on the complex inner workings of MHE models and inspires strategies for leveraging elevation data more effectively. Informed by this insight, we propose a framework that employs MHE as a self-supervised pretraining method for remote sensing (RS) imagery. This approach significantly enhances the performance of semantic segmentation tasks. Furthermore, we develop a disentangled latent transformer (DLT) module that leverages explainable deep representations from pretrained MHE networks for unsupervised semantic segmentation. Our method demonstrates the significant potential of MHE tasks in developing foundation models for sophisticated pixel-level semantic analyses. Additionally, we present a new dataset designed to benchmark the performance of both semantic segmentation and height estimation tasks. The dataset and code will be publicly available at https://github.com/zhu-xlab/DLT-MHE.pytorch.

Pages: 12

50 items in total

[21] Semantically guided self-supervised monocular depth estimation
Lu, Xiao
Sun, Haoran
Wang, Xiuling
Zhang, Zhiguo
Wang, Haixia
IET IMAGE PROCESSING, 2022, 16 (05) : 1293 - 1304
[22] Self-Supervised Monocular Scene Decomposition and Depth Estimation
Safadoust, Sadra
Guney, Fatma
2021 INTERNATIONAL CONFERENCE ON 3D VISION (3DV 2021), 2021, : 627 - 636
[23] Self-supervised pretraining for transferable quantitative phase image cell segmentation
Vicar, Tomas
Chmelik, Jiri
Jakubicek, Roman
Chmelikova, Larisa
Gumulec, Jaromir
Balvan, Jan
Provaznik, Ivo
Kolar, Radim
BIOMEDICAL OPTICS EXPRESS, 2021, 12 (10) : 6514 - 6528
[24] Joint Self-Supervised Monocular Depth Estimation and SLAM
Xing, Xiaoxia
Cai, Yinghao
Lu, Tao
Yang, Yiping
Wen, Dayong
2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 4030 - 4036
[25] Learn to Adapt for Self-Supervised Monocular Depth Estimation
Sun, Qiyu
Yen, Gary G.
Tang, Yang
Zhao, Chaoqiang
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (11) : 15647 - 15659
[26] Self-supervised monocular depth estimation for gastrointestinal endoscopy
Liu, Yuying
Zuo, Siyang
COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2023, 238
[27] Self-supervised monocular depth estimation with direct methods
Wang, Haixia
Sun, Yehao
Wu, Q. M. Jonathan
Lu, Xiao
Wang, Xiuling
Zhang, Zhiguo
NEUROCOMPUTING, 2021, 421 : 340 - 348
[29] Adaptive Self-supervised Depth Estimation in Monocular Videos
Mendoza, Julio
Pedrini, Helio
IMAGE AND GRAPHICS (ICIG 2021), PT III, 2021, 12890 : 687 - 699