Multispectral Video Semantic Segmentation: A Benchmark Dataset and Baseline

Cited by: 10
Authors
Ji, Wei [1]
Li, Jingjing [1]
Bian, Cheng [2]
Zhou, Zongwei [3]
Zhao, Jiaying [2]
Yuille, Alan [3]
Cheng, Li [1]
Affiliations
[1] University of Alberta, Edmonton, AB T6G 2M7, Canada
[2] ByteDance, Beijing, People's Republic of China
[3] Johns Hopkins University, Baltimore, MD, USA
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
ATTENTION; NETWORK; FUSION;
DOI
10.1109/CVPR52729.2023.00112
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Robust and reliable semantic segmentation in complex scenes is crucial for many real-life applications such as safe autonomous driving and nighttime rescue. Most approaches take RGB images as input. However, they work well only in favorable weather conditions; under adverse conditions such as rain, overexposure, or low light, they often fail to deliver satisfactory results. This has led to the recent investigation of multispectral semantic segmentation, where both RGB and thermal infrared (RGBT) images are used as input, giving rise to significantly more robust segmentation of image objects in complex scenes and under adverse conditions. Nevertheless, the current focus on single RGBT image input prevents existing methods from adequately addressing dynamic real-world scenes. Motivated by these observations, in this paper we set out to address a relatively new task, semantic segmentation of multispectral video input, which we refer to as Multispectral Video Semantic Segmentation, or MVSS for short. To this end, we curate an in-house MVSeg dataset consisting of 738 calibrated RGB and thermal videos, accompanied by 3,545 fine-grained pixel-level semantic annotations of 26 categories. Our dataset covers a wide range of challenging urban scenes in both daytime and nighttime. Moreover, we propose an effective MVSS baseline, dubbed MVNet, which is to our knowledge the first model to jointly learn semantic representations from multispectral and temporal contexts. Comprehensive experiments are conducted using various semantic segmentation models on the MVSeg dataset. Empirically, the use of multispectral video input is shown to lead to significant improvement in semantic segmentation; the effectiveness of our MVNet baseline is also verified.
Pages: 1094-1104
Page count: 11