Fuzzy Multimodal Graph Reasoning for Human-Centric Instructional Video Grounding

Cited by: 0
Authors
Li, Yujie [1]
Jiang, Xun [2, 3]
Xu, Xing [3, 4, 5]
Lu, Huimin [6]
Shen, Heng Tao [3, 4, 5]
Affiliations
[1] Kyushu Inst Technol, Fukuoka 8048550, Japan
[2] Univ Elect Sci & Technol China, Ctr Future Multimedia, Chengdu 611731, Peoples R China
[3] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Peoples R China
[4] Univ Elect Sci & Technol China, Ctr Future Media, Chengdu 611731, Peoples R China
[5] Tongji Univ, Coll Elect & Informat Engn, Shanghai 201804, Peoples R China
[6] Southeast Univ, Sch Automat, Nanjing 210096, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Grounding; Feature extraction; Cognition; Task analysis; Visualization; Fuzzy systems; Education; Fuzzy logic; graph learning; human-centric video understanding; temporal grounding; NETWORK;
DOI
10.1109/TFUZZ.2024.3436030
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Human-centric instructional videos let users learn real-world multistep tasks, such as cooking, applying makeup, and using professional tools. However, these lengthy videos often make for a tedious learning experience, making it hard for learners to locate specific guidance efficiently. In this article, we present a novel approach, named fuzzy multimodal graph reasoning (FMGR), to extract target events from long untrimmed human-centric instructional videos using natural language queries. Specifically, we devise fuzzy multimodal graph learning layers, which encompass: (1) contextual graph reasoning, which transforms individual features into contextualized features; (2) a cross-modal relation fuzzifier, which models the fine-grained matching relationships between the two modalities; and (3) fuzzy graph reasoning, which conducts message passing among cross-modal matching node pairs. In particular, we integrate fuzzy theory into the cross-modal relation fuzzifier to amplify potential matching pairs while mitigating the interference from ambiguous matches. To validate our method, we conducted evaluations on two human-centric instructional video datasets, i.e., MedVidQA and YouMakeUp. We also further analyze the impact of interrogative and declarative queries. Extensive experimental results and further analysis demonstrate the effectiveness of the proposed FMGR method.
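The abstract gives no formulas, so the following is only an illustrative sketch of the general idea behind a relation fuzzifier followed by membership-weighted message passing. It is not the authors' FMGR implementation. The cosine similarity, the S-shaped membership function, its `low`/`high` thresholds, and all function names are assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def s_membership(x, low=0.2, high=0.8):
    """S-shaped fuzzy membership (assumed choice, not from the paper).

    Returns 0 at or below `low`, 1 at or above `high`, and a smooth
    quadratic ramp in between, so strong similarities are pushed
    toward 1 and weak ones toward 0.
    """
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    mid = (low + high) / 2
    if x <= mid:
        return 2 * ((x - low) / (high - low)) ** 2
    return 1 - 2 * ((x - high) / (high - low)) ** 2


def fuzzify_relations(video_nodes, text_nodes):
    """Membership degree for every (video node, text node) pair."""
    return [[s_membership(cosine(v, t)) for t in text_nodes]
            for v in video_nodes]


def fuzzy_message_passing(video_nodes, text_nodes, membership):
    """Add the membership-weighted mean of text nodes to each video node."""
    updated = []
    for v, row in zip(video_nodes, membership):
        total = sum(row)
        if total == 0:
            # No text node matches this video node: leave it unchanged.
            updated.append(list(v))
            continue
        msg = [sum(w * t[d] for w, t in zip(row, text_nodes)) / total
               for d in range(len(v))]
        updated.append([a + m for a, m in zip(v, msg)])
    return updated


if __name__ == "__main__":
    video = [[1.0, 0.0], [0.6, 0.8]]   # toy clip features
    text = [[1.0, 0.0], [0.0, 1.0]]    # toy word features
    mu = fuzzify_relations(video, text)
    print(mu)
    print(fuzzy_message_passing(video, text, mu))
```

With these assumed thresholds, a similarity of 0.7 maps to a membership above 0.9, while a similarity of 0.3 maps to below 0.1, which mirrors the stated goal of amplifying likely matches and damping ambiguous ones.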
Pages: 5046-5059
Page count: 14