DaTaSeg: Taming a Universal Multi-Dataset Multi-Task Segmentation Model

Citations: 0
Authors
Gu, Xiuye [1]
Cui, Yin [1, 2]
Huang, Jonathan [1]
Rashwan, Abdullah [1]
Yang, Xuan [1]
Zhou, Xingyi [1]
Ghiasi, Golnaz [1]
Kuo, Weicheng [1]
Chen, Huizhong [1]
Chen, Liang-Chieh [1, 3]
Ross, David [1]
Affiliations
[1] Google Research, Mountain View, CA 94043 USA
[2] NVIDIA, Santa Clara, CA USA
[3] ByteDance, Beijing, People's Republic of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Observing the close relationship among panoptic, semantic, and instance segmentation tasks, we propose to train a universal multi-dataset multi-task segmentation model: DaTaSeg. We use a shared representation (mask proposals with class predictions) for all tasks. To tackle task discrepancy, we adopt different merge operations and post-processing for different tasks. We also leverage weak supervision, allowing our segmentation model to benefit from cheaper bounding box annotations. To share knowledge across datasets, we use text embeddings from the same semantic embedding space as classifiers and share all network parameters among datasets. We train DaTaSeg on the ADE semantic, COCO panoptic, and Objects365 detection datasets. DaTaSeg improves performance on all datasets, especially small-scale datasets, achieving 54.0 mIoU on ADE semantic and 53.5 PQ on COCO panoptic. DaTaSeg also enables weakly-supervised knowledge transfer on ADE panoptic and Objects365 instance segmentation. Experiments show that DaTaSeg scales with the number of training datasets and enables open-vocabulary segmentation through direct transfer. In addition, we annotate an Objects365 instance segmentation set of 1,000 images and release it as a public evaluation benchmark at https://laoreja.github.io/dataseg.
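As a rough illustration of the shared representation described in the abstract, the sketch below classifies mask proposals against text embeddings and then merges them into a semantic segmentation map. It is not the authors' implementation: the cosine-similarity classifier, the `temperature` value, the sum-then-argmax merge rule, and all function names and toy values are assumptions chosen for illustration, since the abstract does not specify these details.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_proposals(proposal_embs, text_embs, temperature=0.1):
    """Class probabilities per proposal (assumed: cosine similarity to
    class-name text embeddings, scaled by a hypothetical temperature)."""
    text_unit = [normalize(t) for t in text_embs]
    probs = []
    for emb in proposal_embs:
        emb_unit = normalize(emb)
        logits = [
            sum(a * b for a, b in zip(emb_unit, t)) / temperature
            for t in text_unit
        ]
        probs.append(softmax(logits))
    return probs


def merge_semantic(class_probs, mask_probs):
    """Assumed semantic merge: per pixel, sum class_prob * mask_prob over
    all proposals, then pick the highest-scoring class."""
    num_proposals = len(mask_probs)
    num_classes = len(class_probs[0])
    height, width = len(mask_probs[0]), len(mask_probs[0][0])
    seg = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            scores = [
                sum(class_probs[q][c] * mask_probs[q][y][x]
                    for q in range(num_proposals))
                for c in range(num_classes)
            ]
            seg[y][x] = max(range(num_classes), key=scores.__getitem__)
    return seg


# Toy inputs (all values hypothetical): two classes, two proposals, 1x2 image.
text_embs = [[1.0, 0.0], [0.0, 1.0]]      # stand-ins for class-name text embeddings
proposal_embs = [[0.9, 0.1], [0.2, 0.8]]  # one embedding per mask proposal
mask_probs = [
    [[0.9, 0.1]],  # proposal 0 mostly covers the left pixel
    [[0.1, 0.9]],  # proposal 1 mostly covers the right pixel
]

class_probs = classify_proposals(proposal_embs, text_embs)
semantic_map = merge_semantic(class_probs, mask_probs)
```

Because the classifier is only a similarity against text embeddings, swapping in the text embeddings of a different label set changes the vocabulary without changing any other parameter, which is one way to read the abstract's claim about sharing knowledge across datasets.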
Pages: 26