A Hybrid Network for Large-Scale Action Recognition from RGB and Depth Modalities

Cited by: 21

Authors:

Wang, Huogen [1,2]

Song, Zhanjie [3]

Li, Wanqing [2]

Wang, Pichao [4]

Affiliations:

[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China

[2] Univ Wollongong, Adv Multimedia Res Lab, Wollongong, NSW 2522, Australia

[3] Tianjin Univ, Sch Math, Tianjin 300350, Peoples R China

[4] Alibaba Grp US Inc, Bellevue, WA 98004 USA

Source:

SENSORS | 2020 / Vol. 20 / Issue 11

Funding:

National Natural Science Foundation of China;

Keywords:

action recognition; weighted rank pooling; weighted dynamic image; 3D convolutional LSTM network; canonical correlation analysis;

DOI:

10.3390/s20113305

Chinese Library Classification number:

O65 [Analytical Chemistry];

Discipline classification codes:

070302 ; 081704 ;

Abstract:

The paper presents a novel hybrid network for large-scale action recognition from multiple modalities. The network is built upon the proposed weighted dynamic images. It effectively leverages the strengths of the emerging Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) based approaches to specifically address the challenges that occur in large-scale action recognition and are not fully dealt with by the state-of-the-art methods. Specifically, the proposed hybrid network consists of a CNN based component and an RNN based component. Features extracted by the two components are fused through canonical correlation analysis and then fed to a linear Support Vector Machine (SVM) for classification. The proposed network achieved state-of-the-art results on the ChaLearn LAP IsoGD, NTU RGB+D and Multi-modal & Multi-view & Interactive (M²I) datasets and outperformed existing methods by a large margin (over 10 percentage points in some cases).
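The fusion step named in the abstract (two feature sets combined through canonical correlation analysis before a linear SVM) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `cca_fuse`, the regularisation term `reg`, the synthetic stand-in features, and the choice to concatenate the two projected views are all assumptions made for illustration, since the record does not say how the projected features are combined.

```python
import numpy as np


def cca_fuse(X, Y, k, reg=1e-3):
    """Project two feature views onto k canonical directions and concatenate.

    X: (n, p) features from one component, Y: (n, q) features from the other.
    Returns the fused (n, 2k) matrix and the k canonical correlations.
    Concatenation as the fusion rule is an assumption of this sketch.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Regularised covariance and cross-covariance estimates.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Wx = inv_sqrt(Sxx)
    Wy = inv_sqrt(Syy)

    # Singular values of the whitened cross-covariance are the
    # (regularised) canonical correlations, in descending order.
    U, s, Vt = np.linalg.svd(Wx @ Sxy @ Wy)
    A = Wx @ U[:, :k]       # projection for view X, shape (p, k)
    B = Wy @ Vt.T[:, :k]    # projection for view Y, shape (q, k)

    fused = np.hstack([Xc @ A, Yc @ B])
    return fused, s[:k]


# Synthetic stand-ins for CNN-based and RNN-based features (hypothetical data).
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
cnn_feats = latent @ rng.normal(size=(3, 8)) + 0.1 * rng.normal(size=(200, 8))
rnn_feats = latent @ rng.normal(size=(3, 6)) + 0.1 * rng.normal(size=(200, 6))

fused, corrs = cca_fuse(cnn_feats, rnn_feats, k=3)
```

In a full pipeline the `fused` matrix would then be passed to a linear SVM for classification, as the abstract describes; that step is omitted here to keep the sketch dependency-free beyond NumPy.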


Pages: 1 / 25

Page count: 25

