Zero-shot monitoring of construction workers' personal protective equipment based on image captioning

Cited by: 4
Authors
Gil, Daeyoung [1]
Lee, Ghang [1]
Affiliations
[1] Yonsei Univ, Dept Architecture & Architectural Engn, Seoul, South Korea
Keywords
Zero-shot detection; Computer vision; Site monitoring; Image captioning; Human-pose estimation; Safety
DOI
10.1016/j.autcon.2024.105470
Chinese Library Classification number
TU [Building Science]
Subject classification code
0813
Abstract
Previous studies have deployed object-detection-based approaches to automate personal protective equipment (PPE) safety monitoring. However, these methods require large amounts of labeled data for each PPE item. To overcome this problem, this study proposes a zero-shot PPE monitoring method that requires no training process. The proposed method comprises three steps. First, it detects workers onsite in images and crops body parts using human-body key points. Next, the cropped body images are described in text using image captioning. Finally, the generated text is compared with prompts describing body parts wearing PPE, and safety is determined based on cosine similarity. For hardhat monitoring, the proposed zero-shot approach achieves an F1-score of 82.6%, compared with 73.5% for traditional object detection approaches trained on 50 images. It also surpasses the previous zero-shot monitoring performance (an accuracy of 53%).
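The final step described in the abstract, comparing a caption with a PPE prompt by cosine similarity, can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `embed` function is a toy stand-in for whatever text encoder the paper uses, and the function names, the example prompt, and the 0.5 threshold are all hypothetical choices made only for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector (assumption: stands in for a real text encoder)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_wearing_ppe(caption: str, ppe_prompt: str, threshold: float = 0.5) -> bool:
    """Judge a cropped body part as safe when its caption is close to the PPE prompt.

    The threshold value is hypothetical; the abstract does not state one.
    """
    return cosine_similarity(embed(caption), embed(ppe_prompt)) >= threshold


# Hypothetical prompt and captions for a cropped head region.
HARDHAT_PROMPT = "a head wearing a hard hat"
safe_caption = "a man wearing a yellow hard hat"
unsafe_caption = "a man with short black hair"

print(is_wearing_ppe(safe_caption, HARDHAT_PROMPT))
print(is_wearing_ppe(unsafe_caption, HARDHAT_PROMPT))
```

In a real pipeline the captions would come from an image-captioning model applied to each cropped body part, and the embeddings from a learned sentence encoder; only the similarity-and-threshold decision is shown here.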
Pages: 10