Enhancing video temporal grounding with large language model-based data augmentation

Cited by: 0
Authors
Tian, Yun [1]
Guo, Xiaobo [1]
Wang, Jinsong [1]
Li, Bin [2]
Affiliations
[1] Changchun Univ Sci & Technol, Sch Optoelect Engn, Changchun 130022, Peoples R China
[2] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen 518055, Peoples R China
Source
JOURNAL OF SUPERCOMPUTING | 2025 / Vol. 81 / Issue 05
Keywords
Video temporal grounding; Large language model; Data augmentation; Video description; Semantic enrichment; ANNOTATION; QUALITY;
DOI
10.1007/s11227-025-07159-0
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Given an untrimmed video and a natural language query, the task of video temporal grounding (VTG) aims to precisely identify the temporal segment in the video that semantically matches the query. Existing datasets for this task often provide manually annotated natural language queries that are overly simplistic and lack the semantic richness needed to fully capture the video's content. This limitation hinders a model's ability to comprehend complex semantic scenarios and degrades its overall performance. To address these challenges, we introduce a novel, low-cost, large language model-based data augmentation method that enriches the original samples and expands the dataset without requiring external data. We propose a fine-grained image captioning module with a noise filter to extract unexploited information from videos. Additionally, we design a hierarchical semantic prompting framework to guide GPT-3.5 in producing semantically rich and contextually coherent natural language queries. When combined with 2D-TAN and VSLNet, our method outperforms the state-of-the-art (SOTA) method MRTNet across three public VTG datasets, particularly excelling in complex semantics and long-duration segment localization.
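The abstract describes the augmentation pipeline only at a high level: caption frames, filter noisy captions, then prompt a language model for a richer query. The following is a minimal sketch of that flow under stated assumptions, not the authors' implementation. The `Caption` type, the `filter_captions` and `build_prompt` helpers, the confidence threshold, and the prompt wording are all hypothetical, and the sketch stops at building the prompt text rather than calling any model.

```python
from dataclasses import dataclass


@dataclass
class Caption:
    """Hypothetical frame-level caption: timestamp in seconds, text, confidence."""
    time: float
    text: str
    score: float


def filter_captions(captions, min_score=0.5):
    """Assumed noise filter: drop low-confidence captions and consecutive repeats."""
    kept = []
    for cap in sorted(captions, key=lambda c: c.time):
        if cap.score < min_score:
            continue  # treat low-confidence captions as noise
        if kept and kept[-1].text == cap.text:
            continue  # skip a caption identical to the previous kept one
        kept.append(cap)
    return kept


def build_prompt(original_query, start, end, captions):
    """Assemble a layered prompt: task, segment context, frame details, instruction."""
    in_segment = [c for c in captions if start <= c.time <= end]
    lines = [
        "Task: rewrite a video query so that it is semantically richer.",
        f"Segment: {start:.1f}s to {end:.1f}s",
        f"Original query: {original_query}",
        "Frame descriptions:",
    ]
    lines += [f"- [{c.time:.1f}s] {c.text}" for c in in_segment]
    lines.append("Write one coherent query that describes only this segment.")
    return "\n".join(lines)


# Illustrative data only; none of it comes from the paper or its datasets.
captions = [
    Caption(1.0, "a person opens a fridge", 0.9),
    Caption(2.0, "a person opens a fridge", 0.8),
    Caption(3.0, "blurry frame", 0.2),
    Caption(4.0, "a person pours milk into a glass", 0.85),
    Caption(9.0, "a person leaves the kitchen", 0.9),
]
clean = filter_captions(captions)
prompt = build_prompt("person pours milk", 0.0, 5.0, clean)
print(prompt)
```

Restricting the prompt to captions that fall inside the annotated segment keeps the generated query tied to the original temporal label, so the augmented sample can reuse the existing start and end times.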
Pages: 31