MMA: a multi-view and multi-modality benchmark dataset for human action recognition

Cited by: 5
Authors
Gao, Zan [1 ,2 ]
Han, Tao-tao [1 ,2 ]
Zhang, Hua [1 ,2 ]
Xue, Yan-bing [1 ,2 ]
Xu, Guang-ping [1 ,2 ]
Affiliations
[1] Tianjin Univ Technol, Key Lab Comp Vis & Syst, Minist Educ, Tianjin 300384, Peoples R China
[2] Tianjin Univ Technol, Tianjin Key Lab Intelligence Comp & Novel Softwar, Tianjin 300384, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Action recognition; Benchmark dataset; Multi-view; Multi-modality; Cross-view; Multi-task; Cross-domain; FEATURE-SELECTION;
DOI
10.1007/s11042-018-5833-8
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
Human action recognition is an active research topic in both the computer vision and machine learning communities, with broad applications including surveillance, biometrics, and human-computer interaction. Although several well-known action datasets have been released in past decades, they remain limited in the number of action categories and samples, camera views, and variety of scenarios. Moreover, most of them are designed for only a subset of the learning problems, such as single-view learning, cross-view learning, and multi-task learning. In this paper, we introduce a multi-view, multi-modality benchmark dataset for human action recognition (abbreviated to MMA). MMA consists of 7080 action samples from 25 action categories, including 15 single-subject actions and 10 double-subject interactive actions, captured in three views of two different scenarios. Further, we systematically benchmark state-of-the-art approaches on MMA with respect to all three learning problems using different temporal-spatial feature representations. Experimental results demonstrate that MMA is challenging on all three learning problems due to significant intra-class variations, occlusion, view and scene variations, and multiple similar action categories. We also provide a baseline for the evaluation of existing state-of-the-art algorithms.
Pages: 29383-29404
Number of pages: 22
Related papers
50 in total
  • [31] MLRMV: Multi-layer representation for multi-view action recognition
    Liu, Zhigang
    Yin, Ziyang
    Wu, Yin
    IMAGE AND VISION COMPUTING, 2021, 116 (116)
  • [32] Single/multi-view human action recognition via regularized multi-task learning
    Liu, An-An
    Xu, Ning
    Su, Yu-Ting
    Lin, Hong
    Hao, Tong
    Yang, Zhao-Xuan
    NEUROCOMPUTING, 2015, 151 : 544 - 553
  • [33] Locoregional recurrence prediction in head and neck cancer based on multi-modality and multi-view feature expansion
    Wang, Rongfang
    Guo, Jinkun
    Zhou, Zhiguo
    Wang, Kai
    Gou, Shuiping
    Xu, Rongbin
    Sher, David
    Wang, Jing
    PHYSICS IN MEDICINE AND BIOLOGY, 2022, 67 (12):
  • [34] Predicting Locoregional Recurrence Through Multi-Modality and Multi-View Deep Learning for in Head & Neck Cancer
    Guo, J.
    Wang, R.
    Zhou, Z.
    Wang, K.
    Xu, R.
    Wang, J.
    MEDICAL PHYSICS, 2021, 48 (06)
  • [35] Model long-range dependencies for multi-modality and multi-view retinopathy diagnosis through transformers
    Huang, Yonghao
    Chen, Leiting
    Zhou, Chuan
    Yan, Ning
    Qiao, Lifeng
    Lan, Shanlin
    Wen, Yang
    KNOWLEDGE-BASED SYSTEMS, 2023, 271
  • [36] GCN-Based Multi-Modality Fusion Network for Action Recognition
    Liu, Shaocan
    Wang, Xingtao
    Xiong, Ruiqin
    Fan, Xiaopeng
    IEEE TRANSACTIONS ON MULTIMEDIA, 2025, 27 : 1242 - 1253
  • [37] VIEW-INDEPENDENT HUMAN ACTION RECOGNITION BASED ON MULTI-VIEW ACTION IMAGES AND DISCRIMINANT LEARNING
    Iosifidis, Alexandros
    Tefas, Anastasios
    Pitas, Ioannis
    2013 IEEE 11TH IVMSP WORKSHOP: 3D IMAGE/VIDEO TECHNOLOGIES AND APPLICATIONS (IVMSP 2013), 2013,
  • [38] Dividing and Aggregating Network for Multi-view Action Recognition
    Wang, Dongang
    Ouyang, Wanli
    Li, Wen
    Xu, Dong
    COMPUTER VISION - ECCV 2018, PT IX, 2018, 11213 : 457 - 473
  • [39] Multi-view Player Action Recognition in Soccer Games
    Leo, Marco
    D'Orazio, Tiziana
    Spagnolo, Paolo
    Mazzeo, Pier Luigi
    Distante, Arcangelo
    COMPUTER VISION/COMPUTER GRAPHICS COLLABORATION TECHNIQUES, PROCEEDINGS, 2009, 5496 : 46 - +
  • [40] Automatic Multi-view Action Recognition with Robust Features
    Chou, Kuang-Pen
    Prasad, Mukesh
    Li, Dong-Lin
    Bharill, Neha
    Lin, Yu-Feng
    Hussain, Farookh
    Lin, Chin-Teng
    Lin, Wen-Chieh
    NEURAL INFORMATION PROCESSING (ICONIP 2017), PT III, 2017, 10636 : 554 - 563