D4: Text-guided diffusion model-based domain adaptive data augmentation for vineyard shoot detection

Authors
Hirahara, Kentaro [1]
Nakane, Chikahito [2]
Ebisawa, Hajime [2]
Kuroda, Tsuyoshi [3]
Iwaki, Yohei [3]
Utsumi, Tomoyoshi
Nomura, Yuichiro [4]
Koike, Makoto [5]
Mineno, Hiroshi [1, 2, 4, 5]
Affiliations
[1] Shizuoka Univ, Grad Sch Integrated Sci & Technol, 3-5-1 Johoku,Chuo Ku, Hamamatsu, Shizuoka 4328011, Japan
[2] Shizuoka Univ, Fac Informat, 3-5-1 Johoku,Chuo Ku, Hamamatsu, Shizuoka 4328011, Japan
[3] Yamaha Motor Co Ltd, Tech Res & Dev Ctr, 2500 Shingai, Iwata, Shizuoka 4388501, Japan
[4] Shizuoka Univ, Res Inst Green Sci & Technol, 836 Ohya,Suruga Ku, Shizuoka, Shizuoka 4228529, Japan
[5] Shizuoka Univ, Grad Sch Sci & Technol, 3-5-1 Johoku,Chuo Ku, Hamamatsu, Shizuoka 4328011, Japan
Funding
Japan Science and Technology Agency;
Keywords
Agriculture; Generative data augmentation; Domain adaptation; Image-based phenotyping; Image generation; IMAGE QUALITY ASSESSMENT; INDEX;
DOI
10.1016/j.compag.2024.109849
Chinese Library Classification
S [Agricultural Sciences];
Discipline classification code
09;
Abstract
In agricultural practice, plant phenotyping using object detection models is gaining attention. Plant phenotyping is a technology that accurately measures the quality and condition of cultivated crops from images, contributing to improved crop yield and quality as well as reduced environmental impact. However, collecting the training data necessary to create generic, high-precision models is extremely challenging because of the difficulty of annotation and the diversity of domains. These difficulties arise from the unique shapes and backgrounds of plants, as well as the significant changes in appearance caused by environmental conditions and growth stages. Furthermore, it is difficult to transfer training data across different crops, and although machine learning models effective for specific environments, conditions, and crops have been developed, they cannot be widely applied in real-world conditions. Therefore, in this study, we propose a generative artificial intelligence data augmentation method (D4) and investigate its application to a shoot detection task in a vineyard. D4 uses a pre-trained text-guided diffusion model based on a large number of original images culled from video data collected by unmanned ground vehicles or other means, and a small number of annotated datasets. The proposed method generates new annotated images with background information adapted to the target domain while retaining the annotation information necessary for object detection. In addition, D4 overcomes the lack of training data in agriculture, including the difficulty of annotation and the diversity of domains. We confirmed that this generative data augmentation method improved the mean average precision by up to 28.65% for the bounding-box (BBox) detection task and the average precision by up to 13.73% for the keypoint detection task in vineyard shoot detection.
D4 generative data augmentation is expected to simultaneously solve the cost and domain diversity issues of training data generation for agricultural applications and improve the generalization performance of detection models.
Pages: 20