Matching Video Net: Memory-based embedding for video action recognition

Cited by: 0
Authors
Kim, Daesik [1]
Lee, Myunggi [1]
Kwak, Nojun [1]
Affiliation
[1] Seoul Natl Univ, Grad Sch Convergence Sci & Technol, Seoul, South Korea
Source
2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2017
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Most recent successful research on action recognition is based on deep learning architectures. However, training deep neural networks is notorious for requiring a huge amount of data, and too little data can lead to an overfitted model. In this work, we propose a novel model, matching video net (MVN), which can be trained with a small amount of data. To avoid overfitting, we use a non-parametric setup on top of parametric networks with external memories. An input video clip is transformed into an embedding space and matched against the memorized samples in that space. The similarities between the input and the memorized data are then measured to determine the nearest neighbors. We perform experiments in a supervised manner on action recognition datasets, achieving state-of-the-art results. Moreover, we apply our model to one-shot learning problems with a novel training strategy. Our model achieves surprisingly good results in predicting unseen action classes from only a few examples.
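The matching step described in the abstract (embed a clip, compare it with memorized samples, pick the nearest neighbors) can be illustrated with a minimal sketch. This is not the authors' implementation: the embedding network is omitted, the vectors are hand-written stand-ins for clip embeddings, and the cosine similarity, the similarity-weighted vote, and the names `cosine_similarity` and `classify_by_memory` are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify_by_memory(query, memory, k=3):
    """Label a query embedding by a similarity-weighted vote of its k nearest memory entries.

    memory is a list of (embedding, label) pairs; this layout is hypothetical.
    """
    if not memory:
        raise ValueError("memory must contain at least one sample")
    # Score every memorized sample against the query, most similar first.
    scored = sorted(
        ((cosine_similarity(query, emb), label) for emb, label in memory),
        key=lambda pair: pair[0],
        reverse=True,
    )
    # Accumulate similarity per label over the k nearest neighbors.
    votes = defaultdict(float)
    for sim, label in scored[:k]:
        votes[label] += sim
    return max(votes, key=votes.get)


# Toy 2-D "embeddings" standing in for the output of a clip encoder.
memory = [
    ([1.0, 0.0], "run"),
    ([0.9, 0.1], "run"),
    ([0.0, 1.0], "jump"),
    ([0.1, 0.9], "jump"),
]
prediction = classify_by_memory([0.8, 0.2], memory, k=3)
print(prediction)
```

Because the classifier itself has no trainable parameters, adding a new action class in this sketch only requires appending labeled embeddings to `memory`, which is the property that makes a memory-based setup attractive when few examples are available.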
Pages: 432-438
Page count: 7