WildScenes: A benchmark for 2D and 3D semantic segmentation in large-scale natural environments

Cited by: 0
|
Authors
Vidanapathirana, Kavisha [1 ,2 ]
Knights, Joshua [1 ,2 ]
Hausler, Stephen [1 ]
Cox, Mark [1 ]
Ramezani, Milad [1 ]
Jooste, Jason [1 ]
Griffiths, Ethan [1 ,2 ]
Mohamed, Shaheer [1 ,2 ]
Sridharan, Sridha [2 ]
Fookes, Clinton [2 ]
Moghadam, Peyman [1 ,2 ]
Affiliations
[1] CSIRO, CSIRO Robot, Data61, 1 Technology Ct, Pullenvale, Qld 4069, Australia
[2] Queensland Univ Technol, Brisbane, Qld, Australia
Keywords
Semantic scene understanding; performance evaluation and benchmarking; data sets for robotic vision; data sets for robot learning; DATASET;
DOI
10.1177/02783649241278369
CLC Number
TP24 [Robotics];
Subject Classification Codes
080202; 1405;
Abstract
Recent progress in semantic scene understanding has primarily been enabled by the availability of semantically annotated bi-modal (camera and LiDAR) datasets in urban environments. However, such annotated datasets are also needed for natural, unstructured environments to enable semantic perception for applications including conservation, search and rescue, environment monitoring, and agricultural automation. We therefore introduce WildScenes, a bi-modal benchmark dataset consisting of multiple large-scale, sequential traversals in natural environments, including semantic annotations in high-resolution 2D images and dense 3D LiDAR point clouds, together with accurate 6-DoF pose information. The data is (1) trajectory-centric, with accurate localization and globally aligned point clouds, (2) calibrated and synchronized to support bi-modal training and inference, and (3) collected in different natural environments over six months to support research on domain adaptation. Our 3D semantic labels are obtained via an efficient, automated process that transfers human-annotated 2D labels from multiple views into 3D point cloud sequences, thus circumventing the need for expensive and time-consuming human annotation in 3D. We introduce benchmarks for 2D and 3D semantic segmentation and evaluate a variety of recent deep-learning techniques to demonstrate the challenges of semantic segmentation in natural environments. We propose train-val-test splits for standard benchmarks as well as domain adaptation benchmarks, and use an automated split generation technique to ensure balanced class label distributions. The WildScenes benchmark webpage is https://csiro-robotics.github.io/WildScenes, and the data is publicly available at https://data.csiro.au/collection/csiro:61541.
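The multi-view 2D-to-3D label transfer summarized in the abstract can be illustrated with a minimal sketch: project each LiDAR point into every labeled camera view with a pinhole model, read the 2D semantic label at the projected pixel, and take a majority vote across views. This is not the authors' implementation; the function names, the simple round-to-nearest-pixel lookup, the absence of occlusion handling, and the voting scheme are all assumptions made for illustration.

```python
import numpy as np

def project_points(points_lidar, T_cam_lidar, K):
    """Project (N, 3) LiDAR points into pixel coordinates using a 4x4
    LiDAR-to-camera extrinsic and a 3x3 pinhole intrinsic matrix.
    Returns (N, 2) pixel coords and a mask of points in front of the camera."""
    pts_h = np.hstack([points_lidar, np.ones((points_lidar.shape[0], 1))])
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]          # camera-frame coords
    in_front = pts_cam[:, 2] > 0                        # positive depth only
    uv = (K @ pts_cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]                         # perspective divide
    return uv, in_front

def transfer_labels(points_lidar, views, num_classes, unlabeled=-1):
    """Vote 2D semantic labels into 3D. Each view is a tuple
    (label_image, T_cam_lidar, K); each point receives the majority label
    over all views that observe it, or `unlabeled` if never observed."""
    votes = np.zeros((points_lidar.shape[0], num_classes), dtype=np.int64)
    for label_img, T_cam_lidar, K in views:
        h, w = label_img.shape
        uv, in_front = project_points(points_lidar, T_cam_lidar, K)
        u = np.round(uv[:, 0]).astype(int)
        v = np.round(uv[:, 1]).astype(int)
        valid = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        labels = label_img[v[valid], u[valid]]          # 2D label per hit
        votes[np.flatnonzero(valid), labels] += 1
    out = votes.argmax(axis=1)
    out[votes.sum(axis=1) == 0] = unlabeled             # points never observed
    return out
```

A production pipeline would additionally need occlusion reasoning (a point visible to the LiDAR may be hidden in a given camera view) and would aggregate over full sequences rather than a handful of frames; the voting idea stays the same.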
Pages: 532-549
Number of pages: 18