WildScenes: A benchmark for 2D and 3D semantic segmentation in large-scale natural environments

Times Cited: 0
Authors
Vidanapathirana, Kavisha [1 ,2 ]
Knights, Joshua [1 ,2 ]
Hausler, Stephen [1 ]
Cox, Mark [1 ]
Ramezani, Milad [1 ]
Jooste, Jason [1 ]
Griffiths, Ethan [1 ,2 ]
Mohamed, Shaheer [1 ,2 ]
Sridharan, Sridha [2 ]
Fookes, Clinton [2 ]
Moghadam, Peyman [1 ,2 ]
Affiliations
[1] CSIRO Robotics, Data61, CSIRO, 1 Technology Ct, Pullenvale, Qld 4069, Australia
[2] Queensland University of Technology, Brisbane, Qld, Australia
Source
The International Journal of Robotics Research
Keywords
Semantic scene understanding; performance evaluation and benchmarking; data sets for robotic vision; data sets for robot learning; dataset
DOI
10.1177/02783649241278369
Chinese Library Classification (CLC)
TP24 [Robotics]
Subject Classification Codes
080202; 1405
Abstract
Recent progress in semantic scene understanding has primarily been enabled by the availability of semantically annotated bi-modal (camera and LiDAR) datasets in urban environments. However, such annotated datasets are also needed for natural, unstructured environments to enable semantic perception in applications including conservation, search and rescue, environment monitoring, and agricultural automation. We therefore introduce WildScenes, a bi-modal benchmark dataset consisting of multiple large-scale, sequential traversals in natural environments, with semantic annotations in high-resolution 2D images and dense 3D LiDAR point clouds, and accurate 6-DoF pose information. The data are (1) trajectory-centric, with accurate localization and globally aligned point clouds; (2) calibrated and synchronized to support bi-modal training and inference; and (3) collected across different natural environments over a 6-month period to support research on domain adaptation. Our 3D semantic labels are obtained via an efficient, automated process that transfers human-annotated 2D labels from multiple views into 3D point cloud sequences, circumventing the need for expensive and time-consuming human annotation in 3D. We introduce benchmarks for 2D and 3D semantic segmentation and evaluate a variety of recent deep-learning techniques to demonstrate the challenges of semantic segmentation in natural environments. We propose train-val-test splits for standard benchmarks as well as domain adaptation benchmarks, and use an automated split generation technique to ensure balanced class label distributions. The WildScenes benchmark webpage is https://csiro-robotics.github.io/WildScenes, and the data is publicly available at https://data.csiro.au/collection/csiro:61541.
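The multi-view 2D-to-3D label transfer described in the abstract can be illustrated with a minimal Python/NumPy sketch, assuming known camera intrinsics, 6-DoF camera poses from the trajectory, and globally aligned LiDAR points: each point is projected into every annotated image in which it falls, accumulates class votes across views, and receives the majority class as its 3D label. The function and parameter names below (project_points, transfer_labels, T_world_cam, K) are illustrative assumptions, not the paper's actual pipeline or API, and occlusion handling is omitted.

import numpy as np

def project_points(points_world, T_world_cam, K):
    # points_world: (N, 3) LiDAR points in the global frame (assumption).
    # T_world_cam:  (4, 4) camera-to-world pose (6-DoF) for one view (assumption).
    # K:            (3, 3) camera intrinsic matrix (assumption).
    T_cam_world = np.linalg.inv(T_world_cam)                  # world -> camera transform
    pts_h = np.hstack([points_world, np.ones((len(points_world), 1))])
    pts_cam = (T_cam_world @ pts_h.T).T[:, :3]
    depth = pts_cam[:, 2]
    in_front = depth > 0.1                                    # keep points in front of the camera
    uv = (K @ pts_cam.T).T
    uv = uv[:, :2] / np.clip(depth[:, None], 1e-6, None)      # perspective divide to pixel coords
    return uv, in_front

def transfer_labels(points_world, views, num_classes, ignore_label=255):
    # views: iterable of (label_image, T_world_cam, K), where label_image is an
    # (H, W) integer array of human-annotated 2D class IDs (assumption).
    votes = np.zeros((len(points_world), num_classes), dtype=np.int64)
    for label_img, T_world_cam, K in views:
        h, w = label_img.shape
        uv, in_front = project_points(points_world, T_world_cam, K)
        u = np.round(uv[:, 0]).astype(int)
        v = np.round(uv[:, 1]).astype(int)
        visible = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        labels = label_img[v[visible], u[visible]]
        keep = labels != ignore_label
        idx = np.flatnonzero(visible)[keep]
        np.add.at(votes, (idx, labels[keep]), 1)              # one vote per view per point
    point_labels = np.full(len(points_world), ignore_label, dtype=np.int64)
    has_votes = votes.sum(axis=1) > 0
    point_labels[has_votes] = votes[has_votes].argmax(axis=1) # majority class wins
    return point_labels

A production pipeline would additionally reject occluded points (e.g., via a depth buffer or hidden-point removal) and might weight votes by viewing distance or label confidence; the paper's automated process should be consulted for the exact procedure.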
Pages: 532-549
Page count: 18
Related papers (50 in total)
  • [41] 3D Semantic Segmentation of Large-Scale Point-Clouds in Urban Areas Using Deep Learning
    Lowphansirikul, Chakri
    Kim, Kyoung-Sook
    Vinayaraj, Poliyapram
    Tuarob, Suppawong
    2019 11TH INTERNATIONAL CONFERENCE ON KNOWLEDGE AND SMART TECHNOLOGY (KST), 2019, : 238 - 243
  • [42] Multi-Scale Classification and Contrastive Regularization: Weakly Supervised Large-Scale 3D Point Cloud Semantic Segmentation
    Wang, Jingyi
    He, Jingyang
    Liu, Yu
    Chen, Chen
    Zhang, Maojun
    Tan, Hanlin
    REMOTE SENSING, 2024, 16 (17)
  • [43] 3D Weakly Supervised Semantic Segmentation with 2D Vision-Language Guidance
    Xu, Xiaoxu
    Yuan, Yitian
    Li, Jinlong
    Zhang, Qiudan
    Jie, Zequn
    Ma, Lin
    Tang, Hao
    Sebe, Nicu
    Wang, Xu
    COMPUTER VISION - ECCV 2024, PT LXXIII, 2025, 15131 : 87 - 104
  • [44] SnapNet: 3D point cloud semantic labeling with 2D deep segmentation networks
    Boulch, Alexandre
    Guerry, Joris
    Le Saux, Bertrand
    Audebert, Nicolas
    COMPUTERS & GRAPHICS-UK, 2018, 71 : 189 - 198
  • [45] 2D TO 3D LABEL PROPAGATION FOR THE SEMANTIC SEGMENTATION OF HERITAGE BUILDING POINT CLOUDS
    Pellis, E.
    Murtiyoso, A.
    Masiero, A.
    Tucci, G.
    Betti, M.
    Grussenmeyer, P.
    XXIV ISPRS CONGRESS IMAGING TODAY, FORESEEING TOMORROW, COMMISSION II, 2022, 43-B2 : 861 - 867
  • [46] A Benchmark for 3D Mesh Segmentation
    Chen, Xiaobai
    Golovinskiy, Aleksey
    Funkhouser, Thomas
    ACM TRANSACTIONS ON GRAPHICS, 2009, 28 (03):
  • [47] SignAvatars: A Large-Scale 3D Sign Language Holistic Motion Dataset and Benchmark
    Yu, Zhengdi
    Huang, Shaoli
    Cheng, Yongkang
    Birdal, Tolga
    COMPUTER VISION - ECCV 2024, PT V, 2025, 15063 : 1 - 19
  • [48] Multimodal interaction for 2D and 3D environments
    Cohen, P
    McGee, D
    Oviatt, S
    Wu, LZ
    Clow, J
    King, R
    Julier, S
    Rosenblum, L
    IEEE COMPUTER GRAPHICS AND APPLICATIONS, 1999, 19 (04) : 10 - 13
  • [49] Current Progress and Challenges in Large-Scale 3D Mitochondria Instance Segmentation
    Franco-Barranco, Daniel
    Lin, Zudi
    Jang, Won-Dong
    Wang, Xueying
    Shen, Qijia
    Yin, Wenjie
    Fan, Yutian
    Li, Mingxing
    Chen, Chang
    Xiong, Zhiwei
    Xin, Rui
    Liu, Hao
    Chen, Huai
    Li, Zhili
    Zhao, Jie
    Chen, Xuejin
    Pape, Constantin
    Conrad, Ryan
    Nightingale, Luke
    de Folter, Joost
    Jones, Martin L.
    Liu, Yanling
    Ziaei, Dorsa
    Huschauer, Stephan
    Arganda-Carreras, Ignacio
    Pfister, Hanspeter
    Wei, Donglai
    IEEE TRANSACTIONS ON MEDICAL IMAGING, 2023, 42 (12) : 3956 - 3971
  • [50] On Prioritization Mechanisms for Large-Scale 3D Streaming in Distributed Virtual Environments
    Jia, Jinyuan
    Wang, Mingfei
    Wang, Wei
    Hei, Xiaojun
    2016 INTERNATIONAL CONFERENCE ON VIRTUAL REALITY AND VISUALIZATION (ICVRV 2016), 2016, : 465 - 472