Sports action recognition algorithm based on multi-modal data recognition

Authors
Zhang, Lin [1]
Affiliations
[1] Jilin Province Economic Management Cadre College, Changchun, China
Keywords
Data fusion; Musculoskeletal system; Spatio-temporal data
DOI
10.3233/IDT-230372
Abstract
Sports action recognition is an important research topic that can help athletes improve their performance. To improve the accuracy of action recognition on multi-modal data, this study builds on the Transformer module, introduces a multi-head attention mechanism, fuses multi-modal data, and constructs a multi-stream structured object relationship inference network. In addition, based on the PointNet++ network and five different data fusion frameworks, it constructs an action recognition model that integrates RGB data and 3D skeleton point clouds. The results show that the multi-stream structured object relationship inference network reached Top-1 accuracies of 42.5% and 42.7%, outperforming the other algorithms compared. The multi-modal fusion model improved accuracy by 15.6% and 5.1% over the single-modality models and by 5.4% and 2.6% over the dual-modality models, which indicates that fusing multi-modal data provides richer information and thereby improves recognition accuracy. The model combining RGB data and 3D skeleton point clouds reached accuracies of 84.3%, 87.5%, 90.2%, 90.6% and 91.2% under the different fusion strategies, compensating for the information missing from the 3D skeleton point cloud and significantly improving recognition accuracy. With a small amount of data, the Top-1 accuracy of the multi-stream structured object relationship inference network was also higher than that of the other algorithms, showing its advantage on complex action recognition tasks, and the model fusing RGB data and 3D skeleton point clouds likewise achieved higher accuracy than the other algorithms. The study can serve motion recognition needs in different scenarios and has reference value. © 2024 - IOS Press. All rights reserved.
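The abstract reports Top-1 accuracy and gains from fusing modalities but does not describe the five fusion frameworks. As an illustration only, the sketch below shows how Top-1 accuracy is computed and how a simple weighted score average (late fusion) can combine two modalities. The function names, the fusion rule, and the toy scores are all assumptions made for this example; they are not the paper's method or data.

```python
def top1_accuracy(scores, labels):
    """Fraction of samples whose highest-scoring class equals the true label."""
    correct = 0
    for row, label in zip(scores, labels):
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += int(predicted == label)
    return correct / len(labels)


def late_fusion(rgb_scores, skeleton_scores, rgb_weight=0.5):
    """Weighted average of per-class scores from two modalities (hypothetical rule)."""
    return [
        [rgb_weight * r + (1.0 - rgb_weight) * s for r, s in zip(rgb_row, skel_row)]
        for rgb_row, skel_row in zip(rgb_scores, skeleton_scores)
    ]


# Made-up per-class scores for 3 samples and 3 action classes.
labels = [0, 1, 2]
rgb = [[0.6, 0.3, 0.1], [0.1, 0.7, 0.2], [0.5, 0.4, 0.1]]   # wrong on sample 3
skel = [[0.5, 0.4, 0.1], [0.5, 0.4, 0.1], [0.1, 0.2, 0.7]]  # wrong on sample 2
fused = late_fusion(rgb, skel)

print("RGB only:     ", top1_accuracy(rgb, labels))
print("Skeleton only:", top1_accuracy(skel, labels))
print("Fused:        ", top1_accuracy(fused, labels))
```

In this toy case each modality misclassifies a different sample, so the averaged scores recover both errors. That is the intuition behind the claim that fused data carries richer information, although real gains depend on the models and the data.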
Pages: 3243-3257