Deep appearance and motion learning for egocentric activity recognition

Cited by: 35
Authors
Wang, Xuanhan [1]
Gao, Lianli [1]
Song, Jingkuan [2]
Zhen, Xiantong [3]
Sebe, Nicu [4]
Shen, Heng Tao [1]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Sichuan, Peoples R China
[2] Columbia Univ, Sch Engn & Appl Sci, New York, NY 10027 USA
[3] Univ Western Ontario, Digital Imaging Grp, London, ON N6A 4V2, Canada
[4] Univ Trento, Dept Informat Engn & Comp Sci, I-38100 Trento, Italy
Funding
National Natural Science Foundation of China;
Keywords
Multiple feature learning; Deep learning; Autoencoder; Egocentric video; Activity recognition;
DOI
10.1016/j.neucom.2017.08.063
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Egocentric activity recognition has recently attracted considerable attention in computer vision due to its widespread applications in egocentric video analysis. However, compared with conventional third-person activity recognition, it poses new challenges caused by significant body shaking, varied lengths, poor recording quality, etc. To handle these challenges, we propose deep appearance and motion learning (DAML) for egocentric activity recognition, which leverages the strength of deep learning networks in feature learning. In contrast to hand-crafted visual features or pre-trained convolutional neural network (CNN) features, which have limited generality to new egocentric videos, the proposed DAML is built on the deep autoencoder (DAE) and directly extracts appearance and motion features, the main cues of activities, from egocentric videos. The DAML takes advantage of the effectiveness and efficiency of the DAE in unsupervised feature learning, providing a new representation learning framework for egocentric videos. The appearance and motion features learned by the DAML are seamlessly fused into a rich, informative egocentric activity representation that can be readily fed into any supervised learning model for activity recognition. Experimental results on two challenging benchmark datasets show that the DAML achieves high performance on both short- and long-term egocentric activity recognition tasks, comparable to or even better than state-of-the-art counterparts. (C) 2017 Elsevier B.V. All rights reserved.
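The abstract names the ingredients (an autoencoder per cue, trained without labels, then fusion of the learned features) but gives no architectural detail. The sketch below is only a rough illustration of that general idea and is not the authors' DAML implementation: the `TinyAutoencoder` class, the `fuse` helper, the single hidden layer, the layer sizes, the concatenation-based fusion, and the random toy vectors standing in for appearance and motion descriptors are all assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyAutoencoder:
    """Hypothetical one-hidden-layer autoencoder (sigmoid encoder, linear decoder)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                  for _ in range(n_hidden)]
        self.b = [0.0] * n_hidden
        self.V = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                  for _ in range(n_in)]
        self.c = [0.0] * n_in

    def encode(self, x):
        return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bj)
                for row, bj in zip(self.W, self.b)]

    def decode(self, h):
        return [sum(v * hj for v, hj in zip(row, h)) + ci
                for row, ci in zip(self.V, self.c)]

    def loss(self, data):
        """Mean squared reconstruction error over the data set."""
        total = 0.0
        for x in data:
            r = self.decode(self.encode(x))
            total += sum((ri - xi) ** 2 for ri, xi in zip(r, x))
        return total / len(data)

    def fit(self, data, epochs=200, lr=0.1):
        """Unsupervised training: plain SGD on 0.5 * ||decode(encode(x)) - x||^2."""
        for _ in range(epochs):
            for x in data:
                h = self.encode(x)
                r = self.decode(h)
                err = [ri - xi for ri, xi in zip(r, x)]
                # Gradient w.r.t. hidden pre-activations, using V before its update.
                dh = [sum(err[i] * self.V[i][j] for i in range(len(err)))
                      * h[j] * (1.0 - h[j]) for j in range(len(h))]
                for i, e in enumerate(err):
                    for j, hj in enumerate(h):
                        self.V[i][j] -= lr * e * hj
                    self.c[i] -= lr * e
                for j, d in enumerate(dh):
                    for k, xk in enumerate(x):
                        self.W[j][k] -= lr * d * xk
                    self.b[j] -= lr * d
        return self


def fuse(appearance_ae, motion_ae, appearance_x, motion_x):
    """Fuse the two learned codes by concatenation (an assumed fusion rule)."""
    return appearance_ae.encode(appearance_x) + motion_ae.encode(motion_x)


# Toy stand-ins for per-clip appearance and motion descriptors (random, not real video).
rng = random.Random(1)
appearance_data = [[rng.random() for _ in range(4)] for _ in range(8)]
motion_data = [[rng.random() for _ in range(6)] for _ in range(8)]

appearance_ae = TinyAutoencoder(n_in=4, n_hidden=2, seed=0)
motion_ae = TinyAutoencoder(n_in=6, n_hidden=3, seed=1)

loss_before = appearance_ae.loss(appearance_data)
appearance_ae.fit(appearance_data)
motion_ae.fit(motion_data)
loss_after = appearance_ae.loss(appearance_data)

# One fused vector per clip; this is what a supervised classifier would consume.
fused = fuse(appearance_ae, motion_ae, appearance_data[0], motion_data[0])
print(len(fused), loss_before, loss_after)
```

In this sketch no labels are used during training; labels would only enter when a separate supervised model is trained on the fused vectors, which matches the division of labor the abstract describes.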
Pages: 438-447
Number of pages: 10