Reducing Training Data Using Pre-Trained Foundation Models: A Case Study on Traffic Sign Segmentation Using the Segment Anything Model

Cited: 0
Authors
Henninger, Sofia [1 ]
Kellner, Maximilian [1 ,2 ]
Rombach, Benedikt [1 ]
Reiterer, Alexander [1 ,2 ]
Affiliations
[1] Fraunhofer Inst Phys Measurement Tech IPM, D-79110 Freiburg, Germany
[2] Albert Ludwigs Univ Freiburg, Dept Sustainable Syst Engn INATECH, D-79110 Freiburg, Germany
Keywords
semantic segmentation; segment anything model; Mask R-CNN; training data reduction; traffic signs;
DOI
10.3390/jimaging10090220
Chinese Library Classification
TB8 [Photographic technology]
Discipline Code
0804
Abstract
The utilization of robust, pre-trained foundation models enables simple adaptation to specific downstream tasks. In particular, the recently developed Segment Anything Model (SAM) has demonstrated impressive results for semantic segmentation. Since data collection is generally time-consuming and costly, this research aims to determine whether the use of such foundation models can reduce the need for training data. To assess the models' behavior under reduced training data, five test datasets for semantic segmentation are utilized. The study concentrates on traffic sign segmentation and compares the results against Mask R-CNN, the leading model in this field. The findings indicate that SAM does not surpass the leading model for this specific task, regardless of the quantity of training data. Nevertheless, a knowledge-distilled student architecture derived from SAM exhibits no reduction in accuracy when trained on data reduced by 95%.
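The abstract's key result relies on knowledge distillation: a small student network is trained to imitate the SAM teacher, so labeled data can be cut sharply. The paper does not give its loss formulation here, so the following is only a minimal illustrative sketch of a common distillation objective for binary segmentation masks (supervised cross-entropy plus an imitation term on the teacher's logits); the function name, the MSE imitation term, and the weighting `alpha` are assumptions for illustration, not the authors' method.

```python
import numpy as np

def distillation_loss(student_logits, teacher_logits, labels, alpha=0.5):
    """Illustrative distillation objective for binary segmentation:
    alpha * BCE(student, ground truth) + (1 - alpha) * MSE(student, teacher).
    All inputs are per-pixel arrays of the same shape."""
    # sigmoid turns student logits into per-pixel foreground probabilities
    p = 1.0 / (1.0 + np.exp(-student_logits))
    eps = 1e-7  # guard against log(0)
    bce = -np.mean(labels * np.log(p + eps) + (1 - labels) * np.log(1 - p + eps))
    # imitation term: match the teacher's raw logits
    mse = np.mean((student_logits - teacher_logits) ** 2)
    return alpha * bce + (1 - alpha) * mse

# toy 4x4 "mask": teacher logits, a slightly perturbed student, hard labels
rng = np.random.default_rng(0)
teacher = rng.normal(size=(4, 4))
student = teacher + 0.1 * rng.normal(size=(4, 4))
labels = (teacher > 0).astype(float)
loss = distillation_loss(student, teacher, labels)
```

The imitation term is what lets the student learn from unlabeled images once the teacher's outputs are available, which is why far fewer ground-truth annotations are needed.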
Pages: 16