Fusion of Appearance and Motion Features for Daily Activity Recognition from Egocentric Perspective

Cited by: 2
Authors
Lye, Mohd Haris [1 ]
AlDahoul, Nouar [1 ,2 ]
Abdul Karim, Hezerul [1 ]
Affiliations
[1] Multimedia Univ, Fac Engn, Cyberjaya 63100, Selangor, Malaysia
[2] NYU, Comp Sci, POB 1291888, Abu Dhabi, U Arab Emirates
Keywords
activities of daily living; convolutional neural network; egocentric vision; feature fusion; optical flow; descriptors
DOI
10.3390/s23156804
Chinese Library Classification (CLC): O65 [Analytical Chemistry]
Subject classification codes: 070302; 081704
Abstract
Videos from a first-person or egocentric perspective offer a promising tool for recognizing various activities related to daily living. In the egocentric perspective, the video is obtained from a wearable camera, which enables the person's activities to be captured from a consistent viewpoint. Recognizing activities with a wearable sensor is challenging for various reasons, such as motion blur and large variations. Existing methods are based on extracting handcrafted features from video frames to represent their contents. These features are domain-dependent: features that are suitable for a specific dataset may not be suitable for others. In this paper, we propose a novel solution to recognize daily living activities from a pre-segmented video clip. The pre-trained convolutional neural network (CNN) model VGG16 is used to extract visual features from sampled video frames, which are then aggregated by the proposed pooling scheme. The proposed solution combines appearance and motion features extracted from video frames and optical flow images, respectively. The methods of mean and max spatial pooling (MMSP) and max mean temporal pyramid (TPMM) pooling are proposed to compose the final video descriptor. The descriptor is fed to a linear support vector machine (SVM) to recognize the type of activity observed in the video clip. The proposed solution was evaluated on three public benchmark datasets. We performed studies to show the advantage of aggregating appearance and motion features for daily activity recognition. The results show that the proposed solution is promising for recognizing activities of daily living. Compared to several methods on the three public datasets, the proposed MMSP-TPMM method produces higher classification performance in terms of accuracy (90.38% on the LENA dataset, 75.37% on the ADL dataset, 96.08% on the FPPA dataset) and average per-class precision (AP) (58.42% on the ADL dataset and 96.11% on the FPPA dataset).
Pages: 20
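
As a rough illustration of the pipeline summarized in the abstract, the Python sketch below wires a pre-trained VGG16 feature extractor to a mean/max spatial pooling step, a temporal-pyramid max/mean aggregation over frames, and a linear SVM. The specific layer used, the pooling and pyramid definitions, and the classifier settings are assumptions made for illustration; they are not taken from the paper itself.

```python
# Minimal sketch of the described appearance-motion pipeline. The pooling
# definitions, pyramid depth, and classifier settings are assumptions for
# illustration, not the authors' published configuration.
import numpy as np
import torch
from torchvision import models
from sklearn.svm import LinearSVC

# Pre-trained VGG16 convolutional backbone used as a fixed feature extractor.
vgg16 = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
backbone = vgg16.features.eval()

def conv_maps(frames: torch.Tensor) -> np.ndarray:
    """Per-frame conv feature maps; `frames` is (T, 3, 224, 224), normalized."""
    with torch.no_grad():
        return backbone(frames).numpy()            # (T, 512, 7, 7)

def mmsp(maps: np.ndarray) -> np.ndarray:
    """Assumed MMSP: spatially mean- and max-pool each frame's feature map
    over its 7x7 grid and concatenate, giving one 1024-d vector per frame."""
    return np.concatenate([maps.mean(axis=(2, 3)), maps.max(axis=(2, 3))], axis=1)

def tpmm(frame_feats: np.ndarray, levels: int = 2) -> np.ndarray:
    """Assumed TPMM: a temporal pyramid where each segment of frames is
    summarized by its max and mean over time; all summaries are concatenated."""
    parts = []
    for level in range(levels):
        for seg in np.array_split(frame_feats, 2 ** level, axis=0):
            parts.append(np.concatenate([seg.max(axis=0), seg.mean(axis=0)]))
    return np.concatenate(parts)

def video_descriptor(rgb_frames: torch.Tensor, flow_images: torch.Tensor) -> np.ndarray:
    """Fuse the appearance (RGB) and motion (optical-flow) descriptors."""
    appearance = tpmm(mmsp(conv_maps(rgb_frames)))
    motion = tpmm(mmsp(conv_maps(flow_images)))
    return np.concatenate([appearance, motion])

# Training on pre-segmented clips: one descriptor per clip, then a linear SVM.
# X = np.stack([video_descriptor(rgb, flow) for rgb, flow in clips])
# clf = LinearSVC(C=1.0).fit(X, labels)
```

The sketch mirrors the late-fusion arrangement stated in the abstract: appearance and motion descriptors are computed separately from RGB frames and optical-flow images, then concatenated before classification.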