WildScenes: A benchmark for 2D and 3D semantic segmentation in large-scale natural environments

Cited: 0
Authors
Vidanapathirana, Kavisha [1 ,2 ]
Knights, Joshua [1 ,2 ]
Hausler, Stephen [1 ]
Cox, Mark [1 ]
Ramezani, Milad [1 ]
Jooste, Jason [1 ]
Griffiths, Ethan [1 ,2 ]
Mohamed, Shaheer [1 ,2 ]
Sridharan, Sridha [2 ]
Fookes, Clinton [2 ]
Moghadam, Peyman [1 ,2 ]
Affiliations
[1] CSIRO, CSIRO Robot, Data61, 1 Technology Ct, Pullenvale, Qld 4069, Australia
[2] Queensland Univ Technol, Brisbane, Qld, Australia
Keywords
Semantic scene understanding; performance evaluation and benchmarking; data sets for robotic vision; data sets for robot learning; DATASET;
DOI
10.1177/02783649241278369
CLC Number
TP24 [Robotics];
Subject Classification Code
080202 ; 1405 ;
Abstract
Recent progress in semantic scene understanding has primarily been enabled by the availability of semantically annotated bi-modal (camera and LiDAR) datasets in urban environments. However, such annotated datasets are also needed for natural, unstructured environments to enable semantic perception in applications including conservation, search and rescue, environment monitoring, and agricultural automation. Therefore, we introduce WildScenes, a bi-modal benchmark dataset consisting of multiple large-scale, sequential traversals in natural environments, including semantic annotations in high-resolution 2D images and dense 3D LiDAR point clouds, and accurate 6-DoF pose information. The data is (1) trajectory-centric, with accurate localization and globally aligned point clouds, (2) calibrated and synchronized, to support bi-modal training and inference, and (3) collected in different natural environments over 6 months, to support research on domain adaptation. Our 3D semantic labels are obtained via an efficient, automated process that transfers the human-annotated 2D labels from multiple views into 3D point cloud sequences, thus circumventing the need for expensive and time-consuming human annotation in 3D. We introduce benchmarks on 2D and 3D semantic segmentation and evaluate a variety of recent deep-learning techniques to demonstrate the challenges of semantic segmentation in natural environments. We propose train-val-test splits for standard benchmarks as well as domain adaptation benchmarks, and utilize an automated split generation technique to ensure balanced class label distributions. The WildScenes benchmark webpage is https://csiro-robotics.github.io/WildScenes, and the data is publicly available at https://data.csiro.au/collection/csiro:61541.
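The core idea behind the automated 2D-to-3D label transfer described in the abstract is geometric: with known camera intrinsics, sensor calibration, and 6-DoF poses, each LiDAR point can be projected into a human-annotated 2D label image and assigned the class of the pixel it lands on. The sketch below is illustrative only, not the authors' pipeline (which aggregates labels over multiple views); all function and variable names are hypothetical.

```python
import numpy as np

def transfer_labels_2d_to_3d(points_lidar, label_image, K, T_cam_from_lidar,
                             ignore_label=255):
    """Project LiDAR points into a semantically labelled image and copy
    each visible point's per-pixel class id back onto the point cloud.

    points_lidar     : (N, 3) float array of points in the LiDAR frame.
    label_image      : (H, W) integer array of per-pixel class ids.
    K                : (3, 3) camera intrinsic matrix.
    T_cam_from_lidar : (4, 4) rigid transform, LiDAR frame -> camera frame.
    Returns an (N,) array of class ids; points behind the camera or
    projecting outside the image receive `ignore_label`.
    """
    n = points_lidar.shape[0]
    labels = np.full(n, ignore_label, dtype=np.int64)

    # Transform points into the camera frame (homogeneous coordinates).
    pts_h = np.hstack([points_lidar, np.ones((n, 1))])
    pts_cam = (T_cam_from_lidar @ pts_h.T).T[:, :3]

    # Keep only points in front of the camera, then project with K.
    in_front = pts_cam[:, 2] > 0.1
    uvw = (K @ pts_cam[in_front].T).T
    uv = np.rint(uvw[:, :2] / uvw[:, 2:3]).astype(int)

    # Discard projections that fall outside the image bounds.
    h, w = label_image.shape
    valid = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)

    # Look up the class id at each valid pixel (row = v, column = u).
    idx = np.flatnonzero(in_front)[valid]
    labels[idx] = label_image[uv[valid, 1], uv[valid, 0]]
    return labels
```

A full pipeline would additionally handle occlusion (a point visible in one view may be hidden in another) and fuse the per-view labels across the sequence, e.g. by majority vote over all frames in which a point is observed.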
Pages: 532 - 549
Number of pages: 18
Related Papers
50 items in total
  • [21] Occ3D: A Large-Scale 3D Occupancy Prediction Benchmark for Autonomous Driving
    Tian, Xiaoyu
    Jiang, Tao
    Yun, Longfei
    Mao, Yucheng
    Yang, Huitong
    Wang, Yue
    Wang, Yilun
    Zhao, Hang
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [22] PointCartesian-Net: enhancing 3D coordinates for semantic segmentation of large-scale point clouds
    Zhou, Yuan
    Sun, Qi
    Meng, Jin
    Hu, Qinglong
    Lyu, Jiahang
    Wang, Zhiwei
    Wang, Shifeng
    JOURNAL OF THE OPTICAL SOCIETY OF AMERICA A-OPTICS IMAGE SCIENCE AND VISION, 2021, 38 (08) : 1194 - 1200
  • [23] Semantic segmentation of large-scale segmental lining point clouds using 3D deep learning
    Lin, Wei
    Sheil, Brian
    Xie, Xiongyao
    Zhang, Yangbin
    Cao, Yuyang
    GEOSHANGHAI INTERNATIONAL CONFERENCE 2024, VOL 8, 2024, 1337
  • [24] 3D point cloud semantic segmentation toward large-scale unstructured agricultural scene classification
    Chen, Yi
    Xiong, Yingjun
    Zhang, Baohua
    Zhou, Jun
    Zhang, Qian
    COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2021, 190
  • [25] A NEW REAL-TIME 3D DENSE SEMANTIC MAPPING SYSTEM FOR LARGE-SCALE ENVIRONMENTS
    Xing, Zhiwei
    Zhu, Xiaorui
    Wu, Yudong
    International Journal of Robotics and Automation, 2024, 39 (01) : 12 - 23
  • [26] Large-scale 3D imaging of insects with natural color
    Qian, Jia
    Dang, Shipei
    Wang, Zhaojun
    Zhou, Xing
    Dan, Dan
    Yao, Baoli
    Tong, Yijie
    Yang, Haidong
    Lu, Yuanyuan
    Chen, Yandong
    Yang, Xingke
    Bai, Ming
    Lei, Ming
    OPTICS EXPRESS, 2019, 27 (04): : 4845 - 4857
  • [27] Large-scale 3D Semantic Mapping Using Stereo Vision
    Yang, Yi
    Qiu, Fan
    Li, Hao
    Zhang, Lu
    Wang, Mei-Ling
    Fu, Meng-Yin
    INTERNATIONAL JOURNAL OF AUTOMATION AND COMPUTING, 2018, 15 (02) : 194 - 206
  • [29] Fully 3D and joint 2D/3D segmentation of InSAR data over urban environments
    Dell'Acqua, F
    Gamba, P
    Soergel, U
    Thoennessen, U
    IEEE/ISPRS JOINT WORKSHOP ON REMOTE SENSING AND DATA FUSION OVER URBAN AREAS, 2001, : 328 - 331
  • [30] Semantic 3D Reconstruction with Learning MVS and 2D Segmentation of Aerial Images
    Wei, Zizhuang
    Wang, Yao
    Yi, Hongwei
    Chen, Yisong
    Wang, Guoping
    APPLIED SCIENCES-BASEL, 2020, 10 (04):