Multi-modal fusion in ergonomic health: bridging visual and pressure for sitting posture detection

Cited: 1
Authors
Quan, Qinxiao [1 ]
Gao, Yang [2 ]
Bai, Yang [1 ]
Jin, Zhanpeng [1 ]
Affiliations
[1] South China Univ Technol, Sch Future Technol, Guangzhou, Peoples R China
[2] East China Normal Univ, Sch Comp Sci, Shanghai, Peoples R China
Keywords
Pressure sensing; Computer vision; Sitting posture recognition; Feature fusion; Multi-label classification; Recognition
DOI
10.1007/s42486-024-00164-x
CLC number
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
As the tension between the pursuit of health and ever-longer sedentary office work grows, maintaining correct sitting posture while working has attracted increasing attention in recent years. Scientific studies have shown that correcting sitting posture helps alleviate physical pain. With the rapid development of artificial intelligence, much research has shifted toward sitting posture detection and recognition systems built on machine learning. In this paper, we introduce a sitting posture recognition system that integrates visual and pressure modalities. The system employs a differentiated pre-training strategy for the two modality-specific models and a feature fusion module built on feed-forward networks. In office scenarios, it collects visual data with the built-in cameras commonly found in laptops and pressure data with thin-film pressure sensor mats. The system achieved an F1-Macro score of 95.43% on a dataset with complex composite actions, an improvement of 7.13% and 10.79% over systems relying solely on the pressure or visual modality, respectively, and of 7.07% over a system using a uniform pre-training strategy.
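The abstract describes a feature fusion module built from feed-forward networks that merges visual and pressure embeddings and produces independent per-label scores for multi-label posture classification. A minimal NumPy sketch of such a late-fusion head follows; the embedding sizes, the two-layer structure, and all names here are illustrative assumptions, not the authors' exact design:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class FeedForwardFusion:
    """Concatenate per-modality embeddings, then apply a two-layer
    feed-forward network; sigmoid outputs give an independent score
    per posture label (multi-label classification)."""

    def __init__(self, d_visual, d_pressure, d_hidden, n_labels):
        d_in = d_visual + d_pressure
        # Small random weights stand in for trained parameters.
        self.W1 = rng.normal(0.0, 0.02, (d_in, d_hidden))
        self.b1 = np.zeros(d_hidden)
        self.W2 = rng.normal(0.0, 0.02, (d_hidden, n_labels))
        self.b2 = np.zeros(n_labels)

    def __call__(self, z_visual, z_pressure):
        # Late fusion: join the two modality embeddings along features.
        z = np.concatenate([z_visual, z_pressure], axis=-1)
        h = relu(z @ self.W1 + self.b1)
        return sigmoid(h @ self.W2 + self.b2)

# Hypothetical dimensions: 128-d visual, 64-d pressure, 6 posture labels.
fusion = FeedForwardFusion(d_visual=128, d_pressure=64, d_hidden=96, n_labels=6)
scores = fusion(rng.normal(size=(4, 128)), rng.normal(size=(4, 64)))
print(scores.shape)  # (4, 6): one score per label, per sample
```

Because each label gets its own sigmoid rather than a shared softmax, the head can flag several posture attributes at once, which matches the composite actions the dataset contains.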
Pages: 380-393 (14 pages)
Related papers
50 in total
  • [1] Online video visual relation detection with hierarchical multi-modal fusion
    He, Yuxuan
    Gan, Ming-Gang
    Ma, Qianzhao
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (24) : 65707 - 65727
  • [2] Video Visual Relation Detection via Multi-modal Feature Fusion
    Sun, Xu
    Ren, Tongwei
    Zi, Yuan
    Wu, Gangshan
    PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019, : 2657 - 2661
  • [3] Multi-modal Fusion
    Liu, Huaping
    Hussain, Amir
    Wang, Shuliang
    INFORMATION SCIENCES, 2018, 432 : 462 - 462
  • [4] Multi-Modal fusion with multi-level attention for Visual Dialog
    Zhang, Jingping
    Wang, Qiang
    Han, Yahong
    INFORMATION PROCESSING & MANAGEMENT, 2020, 57 (04)
  • [5] MMFusion: A Generalized Multi-Modal Fusion Detection Framework
    Cui, Leichao
    Li, Xiuxian
    Meng, Min
    Mo, Xiaoyu
    2023 IEEE INTERNATIONAL CONFERENCE ON DEVELOPMENT AND LEARNING, ICDL, 2023, : 415 - 422
  • [6] Improving multi-modal data fusion by anomaly detection
    Jakub Simanek
    Vladimir Kubelka
    Michal Reinstein
    Autonomous Robots, 2015, 39 : 139 - 154
  • [7] Improving multi-modal data fusion by anomaly detection
    Simanek, Jakub
    Kubelka, Vladimir
    Reinstein, Michal
    AUTONOMOUS ROBOTS, 2015, 39 (02) : 139 - 154
  • [8] A multi-modal fusion YoLo network for traffic detection
    Zheng, Xinwang
    Zheng, Wenjie
    Xu, Chujie
    COMPUTATIONAL INTELLIGENCE, 2024, 40 (02)
  • [9] Visual Sorting Method Based on Multi-Modal Information Fusion
    Han, Song
    Liu, Xiaoping
    Wang, Gang
    APPLIED SCIENCES-BASEL, 2022, 12 (06)
  • [10] Stacked Multi-modal Refining and Fusion Network for Visual Entailment
    Yao, Yuan
    Hu, Min
    Wang, Xiaohua
    Liu, Chuqing
    THIRTEENTH INTERNATIONAL CONFERENCE ON GRAPHICS AND IMAGE PROCESSING (ICGIP 2021), 2022, 12083