COCO_TS Dataset: Pixel-Level Annotations Based on Weak Supervision for Scene Text Segmentation

Cited by: 18
Authors
Bonechi, Simone [1]
Andreini, Paolo [1]
Bianchini, Monica [1]
Scarselli, Franco [1]
Affiliations
[1] Univ Siena, DIISM, Via Roma 56, Siena, Italy
Keywords
Scene text segmentation; Weakly supervised learning; Bounding-box supervision; Convolutional Neural Networks; COMPETITION;
DOI
10.1007/978-3-030-30508-6_20
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The absence of large-scale datasets with pixel-level supervision is a significant obstacle to training deep convolutional networks for scene text segmentation. For this reason, synthetic data generation is normally employed to enlarge the training dataset. Nonetheless, synthetic data cannot reproduce the complexity and variability of natural images. In this paper, a weakly supervised learning approach is used to reduce the shift between training on real and synthetic data. Pixel-level supervision is generated for a text detection dataset (i.e., one where only bounding-box annotations are available). In particular, the COCO-Text-Segmentation (COCO_TS) dataset, which provides pixel-level supervision for the COCO-Text dataset, is created and released. The generated annotations are used to train a deep convolutional neural network for semantic segmentation. Experiments show that the proposed dataset can be used instead of synthetic data, requiring only a fraction of the training samples while significantly improving performance.
Pages: 238-250
Page count: 13