Human behaviour recognition with mid-level representations for crowd understanding and analysis

Cited by: 2
Authors
Sun, Bangyong [1, 2]
Yuan, Nianzeng [1]
Li, Shuying [4]
Wu, Siyuan [2]
Wang, Nan [2, 3]
Affiliations
[1] Xian Univ Technol, Coll Printing Packaging Engn & Digital Media, Xian 710048, Shaanxi, Peoples R China
[2] Chinese Acad Sci, Xian Inst Opt & Precis Mech, Key Lab Spectral Imaging Technol CAS, Xian 710119, Shaanxi, Peoples R China
[3] Univ Chinese Acad Sci, 19A Yuquanlu, Beijing 100049, Peoples R China
[4] Xian Univ Posts & Telecommun, Sch Automat, Xian 710121, Shaanxi, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
VIDEOS
DOI
10.1049/ipr2.12147
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Crowd understanding and analysis have received increasing attention over the past couple of decades, and progress in human behaviour recognition strongly supports their application. Human behaviour recognition usually seeks to automatically analyse ongoing movements and actions in unknown video clips or image sequences, captured from different camera views, by using various machine learning methodologies. Compared with other data modalities such as documents and images, processing video data demands much higher computational and storage resources. The idea of using mid-level semantic concepts to represent human actions in videos is explored, and it is argued that these semantic attributes enable the construction of more descriptive methods for human action recognition. The mid-level attributes, initialized by a clustering step, are built upon low-level features and fully exploit the discrepancies between action classes, so that the importance of each attribute for each action class can be captured. In this way, the representation is semantically rich and highly discriminative even when paired with simple linear classifiers. The method is verified on three challenging datasets (KTH, UCF50 and HMDB51), and the experimental results demonstrate that it achieves better results than the baseline methods on human action recognition.
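The pipeline outlined in the abstract (cluster low-level features into mid-level attributes, describe each video by its attribute responses, classify with a simple linear model) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the clustering algorithm, the encoding, or the classifier, so the sketch assumes plain k-means, a histogram of nearest-attribute assignments, and a ridge-regression linear classifier, and it omits the per-class attribute weighting that the paper describes. All function names, the constant `BLOB_CENTRES`, and the synthetic 4-D "descriptors" are hypothetical.

```python
import numpy as np

# Hypothetical synthetic data: each action class owns two blobs of 4-D descriptors.
BLOB_CENTRES = np.array([[[0, 0, 0, 0], [5, 5, 5, 5]],
                         [[-5, 5, -5, 5], [5, -5, 5, -5]]], dtype=float)

def make_video(label, rng, n_per_blob=25):
    """Synthetic 'video': low-level descriptors drawn around its class blobs."""
    return np.vstack([c + 0.5 * rng.standard_normal((n_per_blob, 4))
                      for c in BLOB_CENTRES[label]])

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in for the clustering step)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance from every descriptor to every centroid: shape (n, k).
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids

def encode_video(descriptors, centroids):
    """Mid-level representation: L1-normalised histogram of nearest attributes."""
    d2 = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    hist = np.bincount(d2.argmin(axis=1), minlength=len(centroids)).astype(float)
    return hist / hist.sum()

def fit_linear(F, y, n_classes, ridge=1e-3):
    """One-vs-rest ridge regression used as a simple linear classifier."""
    Fb = np.hstack([F, np.ones((len(F), 1))])  # append a bias column
    Y = np.eye(n_classes)[y]                   # one-hot targets
    return np.linalg.solve(Fb.T @ Fb + ridge * np.eye(Fb.shape[1]), Fb.T @ Y)

def predict(F, W):
    """Pick the class with the largest linear score."""
    Fb = np.hstack([F, np.ones((len(F), 1))])
    return (Fb @ W).argmax(axis=1)

def run_demo(seed=0, k=4):
    """Train on 40 synthetic videos, test on 20; return one encoding and accuracy."""
    rng = np.random.default_rng(seed)
    y_train = np.array([0, 1] * 20)
    y_test = np.array([0, 1] * 10)
    train = [make_video(c, rng) for c in y_train]
    test = [make_video(c, rng) for c in y_test]
    centroids = kmeans(np.vstack(train), k, seed=seed)  # attributes from pooled features
    F_train = np.array([encode_video(v, centroids) for v in train])
    F_test = np.array([encode_video(v, centroids) for v in test])
    W = fit_linear(F_train, y_train, n_classes=2)
    acc = float((predict(F_test, W) == y_test).mean())
    return F_test[0], acc
```

Calling `run_demo()` returns the attribute histogram of one test video and the held-out accuracy on the synthetic data. The point of the sketch is only that a compact attribute-level encoding can be separated by a linear model; results on KTH, UCF50 or HMDB51 would depend on the real low-level features and on the attribute weighting described in the paper.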
Pages: 3414-3424
Number of pages: 11