Fast Retinomorphic Event-Driven Representations for Video Gameplay and Action Recognition

Cited by: 4
Authors
Chen, Huaijin [1, 2]
Liu, Wanjia [3, 4]
Goel, Rishab [5, 6]
Lua, Rhonald C. [7]
Mittal, Siddharth [8, 9]
Huang, Yuzhong [10, 11]
Veeraraghavan, Ashok [1]
Patel, Ankit B. [7]
Affiliations
[1] Rice Univ, Dept Elect & Comp Engn, Houston, TX 77005 USA
[2] SenseBrain Technol LLC, San Jose, CA 95131 USA
[3] Rice Univ, Dept Comp Sci, Houston, TX 77005 USA
[4] Google Inc, Mountain View, CA 94043 USA
[5] Indian Inst Technol Delhi, New Delhi 110016, India
[6] Borealis AI, Montreal, PQ H2S 3H1, Canada
[7] Baylor Coll Med, Dept Neurosci, Houston, TX 77030 USA
[8] Indian Inst Technol Kanpur, Kanpur 208016, Uttar Pradesh, India
[9] Quadeye, Gurgaon 122009, India
[10] Olin Coll Engn, Needham, MA 02492 USA
[11] Kensho Technol, Cambridge, MA 02138 USA
Funding
U.S. National Science Foundation
Keywords
Smart cameras; retina; real-time systems; streaming media; cells (biology); reinforcement learning; video signal processing; video; ON-CENTER; CELLS; CONTRAST;
DOI
10.1109/TCI.2019.2948755
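Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract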
Good temporal representations are crucial for video understanding, and the state-of-the-art video recognition framework is based on two-stream networks. In such a framework, besides the regular ConvNets responsible for RGB frame inputs, a second network is introduced to handle the temporal representation, usually the optical flow (OF). However, OF or other task-oriented flow is computationally costly and is thus typically pre-computed. Critically, this prevents the two-stream approach from being applied to reinforcement learning (RL) applications such as video game playing, where the next state depends on the current state and action choices. Inspired by the early vision systems of mammals and insects, we propose a fast event-driven representation (EDR) that models several major properties of early retinal circuits: (1) logarithmic input response, (2) multi-timescale temporal smoothing to filter noise, and (3) bipolar (ON/OFF) pathways for primitive event detection. Trading directional information for speed (>9000 fps), EDR enables fast real-time inference and learning in video applications that require interaction between an agent and the world, such as game-playing, virtual robotics, and domain adaptation. In this vein, we use EDR to demonstrate performance improvements over state-of-the-art reinforcement learning algorithms for Atari games, something that has not been possible with pre-computed OF. Moreover, with UCF-101 video action recognition experiments, we show that EDR performs near state-of-the-art in accuracy while achieving a 1,500x speedup in input representation processing, as compared to optical flow.
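The three retinal properties listed in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `edr_sketch`, the use of an exponential moving average as the temporal smoother, the smoothing rates in `alphas`, the event `threshold`, and the offset `eps` are all assumptions chosen only to make the idea concrete.

```python
import numpy as np

def edr_sketch(frames, alphas=(0.5, 0.1), threshold=0.1, eps=1e-3):
    """Toy event-driven representation (illustrative assumptions throughout).

    frames: array of shape (T, H, W), grayscale intensities in [0, 1].
    Returns an array of shape (T, len(alphas), 2, H, W) holding ON (index 0)
    and OFF (index 1) event maps for each timescale.
    """
    frames = np.asarray(frames, dtype=np.float64)
    # (1) Logarithmic input response; eps avoids log(0).
    log_frames = np.log(frames + eps)
    n_scales = len(alphas)
    # One running average per timescale, initialised at the first frame.
    ema = np.stack([log_frames[0]] * n_scales)
    out = np.zeros((frames.shape[0], n_scales, 2) + frames.shape[1:])
    for t in range(1, frames.shape[0]):
        for k, alpha in enumerate(alphas):
            # Change of the current frame relative to the smoothed history.
            diff = log_frames[t] - ema[k]
            # (3) Bipolar pathways: brightening -> ON, darkening -> OFF.
            out[t, k, 0] = diff > threshold
            out[t, k, 1] = diff < -threshold
            # (2) Temporal smoothing: larger alpha means a shorter timescale.
            ema[k] = (1 - alpha) * ema[k] + alpha * log_frames[t]
    return out
```

Each step is a few elementwise operations per pixel, which is consistent with the abstract's point that such a representation is far cheaper than optical flow, although this sketch makes no claim about matching the reported throughput. Like the representation described in the abstract, it records where brightness changed, not the direction of motion.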
Pages: 276 - 290
Number of pages: 15