Self-Supervised Pretraining With Monocular Height Estimation for Semantic Segmentation

Cited by: 1
Authors
Xiong, Zhitong [1]
Chen, Sining [1]
Shi, Yilei [2]
Zhu, Xiao Xiang [1, 3]
Affiliations
[1] Tech Univ Munich TUM, Chair Data Sci Earth Observat, D-80333 Munich, Germany
[2] Tech Univ Munich TUM, Sch Engn & Design, D-80333 Munich, Germany
[3] Munich Ctr Machine Learning, Chair Data Sci Earth Observat, D-80333 Munich, Germany
Keywords
Semantics; Task analysis; Estimation; Neurons; Semantic segmentation; Data models; Buildings; Foundation models; interpretable deep learning; monocular height estimation (MHE); self-supervised pretraining;
DOI
10.1109/TGRS.2024.3412629
Chinese Library Classification (CLC) number
P3 [Geophysics]; P59 [Geochemistry]
Discipline classification code
0708; 070902
Abstract
Monocular height estimation (MHE) is key to generating 3-D city models, which are essential for swift disaster response. Moving beyond the traditional focus on performance enhancement, our study probes the interpretability of MHE networks. We find that neurons within MHE models demonstrate selectivity for both height and semantic classes. This insight sheds light on the inner workings of MHE models and suggests strategies for leveraging elevation data more effectively. Informed by this insight, we propose a framework that employs MHE as a self-supervised pretraining method for remote sensing (RS) imagery. This approach significantly enhances the performance of semantic segmentation tasks. Furthermore, we develop a disentangled latent transformer (DLT) module that leverages explainable deep representations from pretrained MHE networks for unsupervised semantic segmentation. Our method demonstrates the significant potential of MHE tasks in developing foundation models for sophisticated pixel-level semantic analyses. Additionally, we present a new dataset designed to benchmark the performance of both semantic segmentation and height estimation tasks. The dataset and code will be publicly available at https://github.com/zhu-xlab/DLT-MHE.pytorch.
Pages: 12
Related papers
50 in total
  • [31] Self-Supervised Monocular Depth Estimation With Multiscale Perception
    Zhang, Yourun
    Gong, Maoguo
    Li, Jianzhao
    Zhang, Mingyang
    Jiang, Fenlong
    Zhao, Hongyu
    IEEE Transactions on Image Processing, 2022, 31 : 3251 - 3266
  • [33] Semantic and Optical Flow Guided Self-supervised Monocular Depth and Ego-Motion Estimation
    Fang, Jiaojiao
    Liu, Guizhong
    IMAGE AND GRAPHICS (ICIG 2021), PT III, 2021, 12890 : 465 - 477
  • [34] Depth estimation of supervised monocular images based on semantic segmentation
    Wang, Qi
    Piao, Yan
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2023, 90
  • [35] Self-supervised Monocular Depth Estimation Based on Semantic Assistance and Depth Temporal Consistency Constraints
    Ling, Chuanwu
    Chen, Hua
    Xu, Dayong
    Zhang, Xiaogang
    Hunan Daxue Xuebao/Journal of Hunan University Natural Sciences, 2024, 51 (08): : 1 - 12
  • [36] Dual-attention-based semantic-aware self-supervised monocular depth estimation
    Xu, Jinze
    Ye, Feng
    Lai, Yizong
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (24) : 65579 - 65601
  • [37] Self-Supervised Model Adaptation for Multimodal Semantic Segmentation
    Valada, Abhinav
    Mohan, Rohit
    Burgard, Wolfram
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2020, 128 (05) : 1239 - 1285
  • [38] Self-supervised Semantic Segmentation: Consistency over Transformation
    Karimijafarbigloo, Sanaz
    Azad, Reza
    Kazerouni, Amirhossein
    Velichko, Yury
    Bagci, Ulas
    Merhof, Dorit
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS, ICCVW, 2023, : 2646 - 2655
  • [39] Self-supervised Augmentation Consistency for Adapting Semantic Segmentation
    Araslanov, Nikita
    Roth, Stefan
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 15379 - 15389
  • [40] Self-supervised contrastive representation learning for semantic segmentation
    Liu, B.
    Cai, H.
    Wang, Y.
    Chen, X.
    Xi'an Dianzi Keji Daxue Xuebao/Journal of Xidian University, 2024, 51 (01): : 125 - 134