Fusion of Appearance and Motion Features for Daily Activity Recognition from Egocentric Perspective

Cited: 2
Authors
Lye, Mohd Haris [1 ]
AlDahoul, Nouar [1 ,2 ]
Abdul Karim, Hezerul [1 ]
Affiliations
[1] Multimedia Univ, Fac Engn, Cyberjaya 63100, Selangor, Malaysia
[2] NYU, Comp Sci, POB 1291888, Abu Dhabi, U Arab Emirates
Keywords
activities of daily living; convolutional neural network; egocentric vision; feature fusion; optical flow; DESCRIPTORS;
DOI
10.3390/s23156804
CLC number
O65 [Analytical Chemistry];
Discipline codes
070302; 081704;
Abstract
Videos from a first-person, or egocentric, perspective offer a promising tool for recognizing various activities of daily living. In the egocentric perspective, the video is obtained from a wearable camera, which enables the person's activities to be captured from a consistent viewpoint. Recognizing activity with a wearable sensor is nevertheless challenging for several reasons, such as motion blur and large variations. Existing methods are based on extracting handcrafted features from video frames to represent their contents. These features are domain-dependent: features suited to a specific dataset may not be suitable for others. In this paper, we propose a novel solution for recognizing daily living activities from a pre-segmented video clip. The pre-trained convolutional neural network (CNN) model VGG16 is used to extract visual features from sampled video frames, and these features are then aggregated by the proposed pooling scheme. The proposed solution combines appearance and motion features, extracted from video frames and optical flow images, respectively. Mean and max spatial pooling (MMSP) and max mean temporal pyramid (TPMM) pooling are proposed to compose the final video descriptor. The descriptor is fed to a linear support vector machine (SVM) to recognize the type of activity observed in the video clip. The proposed solution was evaluated on three public benchmark datasets, and we conducted studies to show the advantage of aggregating appearance and motion features for daily activity recognition. The results show that the proposed solution is promising for recognizing activities of daily living. Compared with several methods on the three public datasets, the proposed MMSP-TPMM method produces higher classification performance in terms of accuracy (90.38% on the LENA dataset, 75.37% on the ADL dataset, and 96.08% on the FPPA dataset) and average per-class precision (AP) (58.42% on the ADL dataset and 96.11% on the FPPA dataset).
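To make the pipeline in the abstract concrete, here is a minimal Python sketch of the described fusion scheme. It is not the authors' implementation: using VGG16 conv feature maps for MMSP, Farneback optical flow rendered as RGB images for the motion stream, a two-level pyramid for TPMM, and all function names here are illustrative assumptions.

```python
import cv2
import numpy as np
import torch
from torchvision import models, transforms as T
from sklearn.svm import LinearSVC

device = "cuda" if torch.cuda.is_available() else "cpu"
vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).to(device).eval()

preprocess = T.Compose([
    T.ToPILImage(),
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def mmsp_features(frames):
    """Mean and max spatial pooling (MMSP) over VGG16 conv feature maps, per frame."""
    with torch.no_grad():
        batch = torch.stack([preprocess(f) for f in frames]).to(device)
        maps = vgg.features(batch)                       # N x 512 x 7 x 7
        pooled = torch.cat([maps.mean(dim=(2, 3)), maps.amax(dim=(2, 3))], dim=1)
        return pooled.cpu().numpy()                      # N x 1024

def flow_images(frames):
    """Render dense Farneback optical flow between consecutive frames as RGB images."""
    grays = [cv2.cvtColor(f, cv2.COLOR_RGB2GRAY) for f in frames]
    rendered = []
    for prev, nxt in zip(grays, grays[1:]):
        flow = cv2.calcOpticalFlowFarneback(prev, nxt, None, 0.5, 3, 15, 3, 5, 1.2, 0)
        mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
        hsv = np.zeros((*prev.shape, 3), dtype=np.uint8)
        hsv[..., 0] = (ang * 90 / np.pi).astype(np.uint8)   # hue encodes flow direction
        hsv[..., 1] = 255
        hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
        rendered.append(cv2.cvtColor(hsv, cv2.COLOR_HSV2RGB))
    return rendered

def tpmm(feats, levels=2):
    """Max mean temporal pyramid (TPMM): mean within segments, max across them."""
    parts = []
    for level in range(levels):
        segments = np.array_split(feats, 2 ** level, axis=0)
        means = np.stack([seg.mean(axis=0) for seg in segments])
        parts.append(means.max(axis=0))
    return np.concatenate(parts)

def video_descriptor(frames):
    """Fuse appearance (RGB frames) and motion (optical flow) streams."""
    appearance = tpmm(mmsp_features(frames))
    motion = tpmm(mmsp_features(flow_images(frames)))
    return np.concatenate([appearance, motion])

# Training: stack descriptors of pre-segmented clips and fit a linear SVM.
# clf = LinearSVC().fit(np.stack([video_descriptor(v) for v in clips]), labels)
```

In this sketch the appearance and motion descriptors are simply concatenated before classification, reflecting the feature-level fusion described above; the paper's actual layer choice, flow algorithm, and pyramid configuration may differ.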
Pages: 20
Related papers
50 records in total
  • [41] Classification of bird species from video using appearance and motion features
    Atanbori, J.; Duan, W.; Shaw, E.; Appiah, K.; Dickinson, P. Ecological Informatics, 2018, 48: 12-23
  • [42] Recognition of Activities of Daily Living from Egocentric Videos Using Hands Detected by a Deep Convolutional Network
    Nguyen, T.-H.-C.; Nebel, J.-C.; Florez-Revuelta, F. Image Analysis and Recognition (ICIAR 2018), 2018, 10882: 390-398
  • [43] Motion- and location-based online human daily activity recognition
    Zhu, C.; Sheng, W. Pervasive and Mobile Computing, 2011, 7(2): 256-269
  • [44] Batch-based activity recognition from egocentric photo-streams revisited
    Cartas, A.; Marin, J.; Radeva, P.; Dimiccoli, M. Pattern Analysis and Applications, 2018, 21(4): 953-965
  • [46] Bidirectional aggregated features fusion from CNN for palmprint recognition
    Zhang, J.; Yang, A.; Zhang, M.; Zhang, Q. International Journal of Biometrics, 2018, 10(4): 334-351
  • [47] Beyond local appearance: Category recognition from pairwise interactions of simple features
    Leordeanu, M.; Hebert, M.; Sukthankar, R. 2007 IEEE Conference on Computer Vision and Pattern Recognition, 2007: 928+
  • [48] View-invariant human activity recognition based on shape and motion features
    Niu, F.; Abdel-Mottaleb, M. International Journal of Robotics & Automation, 2007, 22(3): 235-243
  • [49] Manual Acupuncture Manipulation Recognition Method via Interactive Fusion of Spatial Multiscale Motion Features
    He, J.; Su, C.; Chen, J.; Li, J.; Yang, J.; Liu, C. IET Signal Processing, 2024, vol. 2024
  • [50] Timely daily activity recognition from headmost sensor events
    Liu, Y.; Wang, X.; Zhai, Z.; Chen, R.; Zhang, B.; Jiang, Y. ISA Transactions, 2019, 94: 379-390