DaTaSeg: Taming a Universal Multi-Dataset Multi-Task Segmentation Model

Cited by: 0
Authors
Gu, Xiuye [1]
Cui, Yin [1, 2]
Huang, Jonathan [1]
Rashwan, Abdullah [1]
Yang, Xuan [1]
Zhou, Xingyi [1]
Ghiasi, Golnaz [1]
Kuo, Weicheng [1]
Chen, Huizhong [1]
Chen, Liang-Chieh [1, 3]
Ross, David [1]
Affiliations
[1] Google Res, Mountain View, CA 94043 USA
[2] NVIDIA, Santa Clara, CA USA
[3] ByteDance, Beijing, Peoples R China
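Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract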
Observing the close relationship among panoptic, semantic and instance segmentation tasks, we propose to train a universal multi-dataset multi-task segmentation model: DaTaSeg. We use a shared representation (mask proposals with class predictions) for all tasks. To tackle task discrepancy, we adopt different merge operations and post-processing for different tasks. We also leverage weak supervision, allowing our segmentation model to benefit from cheaper bounding box annotations. To share knowledge across datasets, we use text embeddings from the same semantic embedding space as classifiers and share all network parameters among datasets. We train DaTaSeg on ADE semantic, COCO panoptic, and Objects365 detection datasets. DaTaSeg improves performance on all datasets, especially small-scale datasets, achieving 54.0 mIoU on ADE semantic and 53.5 PQ on COCO panoptic. DaTaSeg also enables weakly-supervised knowledge transfer on ADE panoptic and Objects365 instance segmentation. Experiments show DaTaSeg scales with the number of training datasets and enables open-vocabulary segmentation through direct transfer. In addition, we annotate an Objects365 instance segmentation set of 1,000 images and release it as a public evaluation benchmark at https://laoreja.github.io/dataseg.
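The two ideas in the abstract, text embeddings used as classifiers and a task-specific merge over shared mask proposals, can be illustrated with a small sketch. This is not the authors' implementation: the function names, the temperature value, the toy embeddings, and the particular semantic merge rule (sum over proposals of class probability times mask probability, followed by a per-pixel argmax) are all assumptions chosen for illustration, and the abstract does not specify any of them.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def classify_proposals(proposal_embs, text_embs, temperature=0.1):
    """Score each mask proposal against every class-name text embedding.

    Assumption: classification is a softmax over temperature-scaled
    cosine similarities; the temperature is a made-up value.
    """
    return [
        softmax([cosine(p, t) / temperature for t in text_embs])
        for p in proposal_embs
    ]


def merge_semantic(class_probs, mask_probs):
    """Hypothetical semantic merge: for each pixel, sum class_prob * mask_prob
    over all proposals, then take the highest-scoring class."""
    num_classes = len(class_probs[0])
    num_pixels = len(mask_probs[0])
    labels = []
    for px in range(num_pixels):
        scores = [
            sum(class_probs[n][c] * mask_probs[n][px]
                for n in range(len(mask_probs)))
            for c in range(num_classes)
        ]
        labels.append(max(range(num_classes), key=scores.__getitem__))
    return labels


# Toy data: two classes, two mask proposals, an "image" of four pixels.
text_embs = [[1.0, 0.0], [0.0, 1.0]]        # hypothetical class-name embeddings
proposal_embs = [[0.9, 0.1], [0.2, 0.8]]    # hypothetical proposal embeddings
mask_probs = [[0.9, 0.8, 0.1, 0.2],         # proposal 0 covers pixels 0 and 1
              [0.1, 0.2, 0.9, 0.7]]         # proposal 1 covers pixels 2 and 3

class_probs = classify_proposals(proposal_embs, text_embs)
labels = merge_semantic(class_probs, mask_probs)
print(labels)  # [0, 0, 1, 1]
```

Because the classifier is just a set of text embeddings, swapping in the class names of another dataset changes only `text_embs`, which is the property that makes sharing all other parameters across datasets possible. Panoptic and instance outputs would need different merge rules over the same proposals, which this sketch does not cover.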
Pages: 26