DaTaSeg: Taming a Universal Multi-Dataset Multi-Task Segmentation Model

Authors
Gu, Xiuye [1]
Cui, Yin [1, 2]
Huang, Jonathan [1]
Rashwan, Abdullah [1]
Yang, Xuan [1]
Zhou, Xingyi [1]
Ghiasi, Golnaz [1]
Kuo, Weicheng [1]
Chen, Huizhong [1]
Chen, Liang-Chieh [1, 3]
Ross, David [1]
Affiliations
[1] Google Research, Mountain View, CA 94043, USA
[2] NVIDIA, Santa Clara, CA, USA
[3] ByteDance, Beijing, People's Republic of China
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Observing the close relationship among panoptic, semantic, and instance segmentation tasks, we propose to train a universal multi-dataset multi-task segmentation model: DaTaSeg. We use a shared representation (mask proposals with class predictions) for all tasks. To tackle task discrepancy, we adopt different merge operations and post-processing for different tasks. We also leverage weak supervision, allowing our segmentation model to benefit from cheaper bounding box annotations. To share knowledge across datasets, we use text embeddings from the same semantic embedding space as classifiers and share all network parameters among datasets. We train DaTaSeg on the ADE semantic, COCO panoptic, and Objects365 detection datasets. DaTaSeg improves performance on all datasets, especially small-scale datasets, achieving 54.0 mIoU on ADE semantic and 53.5 PQ on COCO panoptic. DaTaSeg also enables weakly-supervised knowledge transfer on ADE panoptic and Objects365 instance segmentation. Experiments show that DaTaSeg scales with the number of training datasets and enables open-vocabulary segmentation through direct transfer. In addition, we annotate an Objects365 instance segmentation set of 1,000 images and release it as a public evaluation benchmark at https://laoreja.github.io/dataseg.
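The abstract's idea of using text embeddings as classifiers can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the temperature value, and the toy 3-dimensional vectors are all assumptions made for illustration, and a real system would obtain the embeddings from a trained text encoder and a mask-proposal decoder. The sketch only shows the general pattern of scoring one mask proposal against class-name embeddings from a shared space by cosine similarity followed by a softmax.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def class_probabilities(proposal_embedding, text_embeddings, temperature=0.1):
    """Softmax over cosine similarities between one proposal and each class text embedding.

    `temperature` is a hypothetical hyperparameter chosen for this example.
    """
    p = l2_normalize(proposal_embedding)
    logits = []
    for text_vec in text_embeddings:
        t = l2_normalize(text_vec)
        cosine = sum(a * b for a, b in zip(p, t))
        logits.append(cosine / temperature)
    # Subtract the max logit for numerical stability before exponentiating.
    max_logit = max(logits)
    exps = [math.exp(l - max_logit) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy vocabulary and embeddings (made up; a real model would use a text encoder).
class_names = ["cat", "dog", "sky"]
text_embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

# Embedding of a single mask proposal (also made up).
proposal = [0.2, 0.9, 0.1]

probs = class_probabilities(proposal, text_embeddings)
predicted = class_names[probs.index(max(probs))]
print(predicted)
```

Because the classifier weights are just embeddings of class names, changing the vocabulary only requires changing the list of text embeddings. That is one plausible reading of how a shared embedding space lets datasets with different label sets share a single set of network parameters.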
Pages: 26