YogNet: A two-stream network for realtime multiperson yoga action recognition and posture correction

Cited by: 15
Authors
Yadav, Santosh Kumar [1 ,2 ]
Agarwal, Aayush [3 ]
Kumar, Ashish [3 ]
Tiwari, Kamlesh [3 ]
Pandey, Hari Mohan [4 ]
Akbar, Shaik Ali [1 ,2 ]
Affiliations
[1] Acad Sci & Innovat Res AcSIR, Ghaziabad 201002, Uttar Pradesh, India
[2] Cent Elect Engn Res Inst CEERI, Cyber Phys Syst, CSIR, Pilani 333031, India
[3] Birla Inst Technol & Sci Pilani, Dept CSIS, Pilani Campus, Pilani 333031, Rajasthan, India
[4] Bournemouth Univ, Dept Comp & Informat, Poole, England
Keywords
Action recognition; Computer vision; Posture correction; Yoga and exercise;
DOI
10.1016/j.knosys.2022.109097
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Yoga is a traditional Indian form of exercise. It specifies various body postures, called asanas, whose practice is beneficial for physical, mental, and spiritual well-being. To support yoga practitioners, there is a need for an expert yoga asana recognition system that can automatically analyze a practitioner's postures and provide suitable posture-correction instructions. This paper proposes YogNet, a multi-person yoga expert system for 20 asanas built on a two-stream deep spatiotemporal neural network architecture. The first stream uses a keypoint-detection approach to estimate the practitioner's pose and form bounding boxes around the subject; time-distributed convolutional neural networks (CNNs) then extract frame-wise postural features, and regularized long short-term memory (LSTM) networks produce temporal predictions. The second stream uses 3D CNNs to extract spatiotemporal features directly from the RGB videos. Finally, the scores of the two streams are combined using multiple fusion techniques. A yoga asana recognition (YAR) database containing 1206 videos (367 min in total) was collected with a single 2D web camera from 16 participants and covers four view variations, i.e., front, back, left, and right sides. The proposed system is novel in that it is the earliest two-stream deep-learning-based system that can perform multi-person yoga asana recognition and correction in real time. Simulation results reveal that YogNet achieves accuracies of 77.29%, 89.29%, and 96.31% using the pose stream, the RGB stream, and the fusion of both streams, respectively. These results are sufficiently high to recommend the system for general adoption.
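For readers who want a concrete picture of the two-stream design described in the abstract, the following is a minimal PyTorch-style sketch of a pose stream (time-distributed 2D CNN followed by an LSTM), an RGB stream (3D CNN), and late score fusion. The class names, layer sizes, input shapes, and the weighted-average fusion are illustrative assumptions for this sketch; they are not the authors' implementation or reported configuration.

    # Minimal two-stream sketch (illustrative only, not the authors' code).
    # Assumed shapes: pose crops as (batch, time, 3, 64, 64),
    # RGB clips as (batch, 3, time, 112, 112); both streams output 20 class scores.
    import torch
    import torch.nn as nn


    class PoseStream(nn.Module):
        """Time-distributed 2D CNN over per-frame crops, followed by an LSTM."""
        def __init__(self, num_classes=20, feat_dim=128):
            super().__init__()
            self.cnn = nn.Sequential(
                nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(32, feat_dim), nn.ReLU(),
            )
            self.lstm = nn.LSTM(feat_dim, 64, batch_first=True)
            self.fc = nn.Linear(64, num_classes)

        def forward(self, x):                      # x: (B, T, 3, H, W)
            b, t = x.shape[:2]
            feats = self.cnn(x.flatten(0, 1))      # apply the CNN to every frame
            feats = feats.view(b, t, -1)           # (B, T, feat_dim)
            out, _ = self.lstm(feats)
            return self.fc(out[:, -1])             # class scores from the last time step


    class RGBStream(nn.Module):
        """3D CNN over the raw RGB clip."""
        def __init__(self, num_classes=20):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv3d(3, 16, 3, stride=(1, 2, 2), padding=1), nn.ReLU(),
                nn.Conv3d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool3d(1), nn.Flatten(),
                nn.Linear(32, num_classes),
            )

        def forward(self, x):                      # x: (B, 3, T, H, W)
            return self.net(x)


    def fuse_scores(pose_logits, rgb_logits, w=0.5):
        """Late fusion: weighted average of the per-stream softmax scores."""
        return w * pose_logits.softmax(-1) + (1 - w) * rgb_logits.softmax(-1)


    if __name__ == "__main__":
        pose_clip = torch.randn(2, 16, 3, 64, 64)    # 16 cropped pose frames
        rgb_clip = torch.randn(2, 3, 16, 112, 112)   # 16 raw RGB frames
        scores = fuse_scores(PoseStream()(pose_clip), RGBStream()(rgb_clip))
        print(scores.argmax(-1))                     # predicted asana index per sample

The weighted average is only one of the "multiple fusion techniques" mentioned in the abstract; the paper's actual fusion schemes and network depths are not reproduced here.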
Pages: 16
Related Papers
50 records in total
  • [41] Tong A., Tang C., Wang W. Human Action Recognition Fusing Two-Stream Networks and SVM. Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2021, 34(9): 863-870.
  • [42] Chen J., Xu Y., Zhang C., Xu Z., Meng X., Wang J. An Improved Two-stream 3D Convolutional Neural Network for Human Action Recognition. 2019 25th IEEE International Conference on Automation and Computing (ICAC), 2019: 135-140.
  • [43] Liu C., Ying J., Yang H., Hu X., Liu J. Improved human action recognition approach based on two-stream convolutional neural network model. The Visual Computer, 2021, 37: 1327-1341.
  • [44] Gao X., Zhang H. The Very Deep Multi-stage Two-stream Convolutional Neural Network for Action Recognition. Proceedings of the 2016 3rd International Conference on Mechatronics and Information Technology (ICMIT), 2016, 49: 265-269.
  • [45] Khong V.-M., Tran T.-H. Improving human action recognition with two-stream 3D convolutional neural network. 2018 1st International Conference on Multimedia Analysis and Pattern Recognition (MAPR), 2018.
  • [46] Liu C., Ying J., Yang H., Hu X., Liu J. Improved human action recognition approach based on two-stream convolutional neural network model. The Visual Computer, 2021, 37(6): 1327-1341.
  • [47] Lin W., Zeng H., Zhu J., Hsia C.-H., Hou J., Ma K.-K. Unsupervised video-based action recognition using two-stream generative adversarial network. Neural Computing and Applications, 2024, 36(9): 5077-5091.
  • [48] Li P., Li Y., Jiang X., Zhen X. Two-Stream Multi-Task Network for Fashion Recognition. 2019 IEEE International Conference on Image Processing (ICIP), 2019: 3038-3042.
  • [49] Pan H., Xie L., Li J., Lv Z., Wang Z. Micro-expression recognition by two-stream difference network. IET Computer Vision, 2021, 15(6): 440-448.
  • [50] Lin W., Zeng H., Zhu J., Hsia C.-H., Hou J., Ma K.-K. Unsupervised video-based action recognition using two-stream generative adversarial network. Neural Computing and Applications, 2024, 36: 5077-5091.