Matching Video Net: Memory-based embedding for video action recognition

Cited by: 0

Authors:

Kim, Daesik [1]

Lee, Myunggi [1]

Kwak, Nojun [1]

Affiliations:

[1] Seoul Natl Univ, Grad Sch Convergence Sci & Technol, Seoul, South Korea

Source:

2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2017

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Most recent successful research on action recognition is based on deep learning structures. However, training deep neural networks notoriously requires a huge amount of data, and too little data can lead to an overfitted model. In this work, we propose a novel model, the matching video net (MVN), which can be trained with a small amount of data. To avoid overfitting, we use a non-parametric setup on top of parametric networks with external memories. An input video clip is transformed into an embedding space and matched against the memorized samples in that space. The similarities between the input and the memorized data are then measured to determine the nearest neighbors. We perform experiments in a supervised manner on action recognition datasets, achieving state-of-the-art results. Moreover, we apply our model to one-shot learning problems with a novel training strategy. Our model achieves surprisingly good results in predicting unseen action classes from only a few examples.
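The matching step described in the abstract (embed a clip, compare it with memorized samples, pick the nearest neighbors) can be illustrated with a minimal sketch. This is not the authors' implementation: the cosine similarity measure, the majority vote over `k` neighbors, the function names, and the toy 2-D embeddings are all assumptions made for illustration, and the learned video encoder that would produce the embeddings is omitted.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def classify_by_memory(query, memory, k=3):
    """Predict a label by majority vote among the k most similar memory entries.

    `memory` is a list of (embedding, label) pairs. In a real system the
    embeddings would come from a trained network; here they are given directly.
    """
    ranked = sorted(
        memory,
        key=lambda entry: cosine_similarity(query, entry[0]),
        reverse=True,
    )
    top_labels = [label for _, label in ranked[:k]]
    return Counter(top_labels).most_common(1)[0][0]


# Hypothetical toy memory: 2-D vectors standing in for clip embeddings.
memory = [
    ([1.0, 0.1], "run"),
    ([0.9, 0.2], "run"),
    ([0.1, 1.0], "jump"),
    ([0.2, 0.9], "jump"),
]
prediction = classify_by_memory([0.8, 0.3], memory, k=3)
print(prediction)
```

Because the classifier only compares against stored examples, adding a new action class in this sketch means appending labeled embeddings to `memory` rather than retraining, which is the property that makes memory-based matching attractive when few examples are available.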

Citation

Pages: 432-438

Page count: 7

