End-to-end learning interpolation for object tracking in low frame-rate video

Cited by: 7
Authors
Liu, Liqiang [1, 2]
Cao, Jianzhong [1]
Affiliations
[1] Chinese Academy of Sciences, Xi'an Institute of Optics and Precision Mechanics, 17 Xinxi Rd, Xi'an, People's Republic of China
[2] University of Chinese Academy of Sciences, 19 Yuquan Rd, Beijing, People's Republic of China
Keywords
video signal processing; learning (artificial intelligence); object tracking; interpolation; mobile computing; low frame rates; implicit video frame interpolation sub-network; low frame-rate video; high frame-rate latent video; effective end-to-end optimisation; frame rate; tracking accuracy; semantic video analytics; end-to-end learning interpolation; subsequent semantic analytics; bandwidth constraints; analytics performance; motion estimation; Siamese networks
DOI
10.1049/iet-ipr.2019.0944
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In many scenarios where videos are transmitted through bandwidth-limited channels for subsequent semantic analytics, the choice of frame rate must balance bandwidth constraints against analytics performance. Faced with this practical challenge, this study focuses on enhancing object tracking at low frame rates and proposes a "learning interpolation for tracking" framework. The framework embeds an implicit video frame interpolation sub-network, which is concatenated and jointly trained with an object tracking sub-network. Given a low frame-rate video as input, the framework first maps it into a high frame-rate latent video, on which the tracker is then learned. Novel strategies and loss functions are derived to ensure effective end-to-end optimisation of the authors' network. On several challenging benchmarks and settings, their method achieves a highly competitive trade-off between frame rate and tracking accuracy. As far as is known, the implications of interpolation for semantic video analytics and tracking remain unexplored, and the authors expect their method to find many applications in mobile embedded vision, the Internet of Things and edge computing.
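The data flow described in the abstract (low frame-rate input, then a denser latent sequence, then tracking on that sequence) can be illustrated with a toy sketch. This is not the authors' method: the learned implicit interpolation sub-network is replaced here by plain linear blending, the tracker by an intensity-weighted centroid, and joint training and the loss functions are omitted. The names `interpolate_frames` and `track_centroid` are hypothetical.

```python
import numpy as np

def interpolate_frames(frames: np.ndarray, factor: int) -> np.ndarray:
    """Insert (factor - 1) linearly blended frames between each consecutive pair.

    Assumption: linear blending stands in for the learned interpolation
    sub-network. Input shape (T, H, W); output shape ((T - 1) * factor + 1, H, W).
    """
    out = []
    for a, b in zip(frames[:-1], frames[1:]):
        for k in range(factor):
            w = k / factor
            out.append((1.0 - w) * a + w * b)
    out.append(frames[-1])
    return np.stack(out)

def track_centroid(frames: np.ndarray) -> np.ndarray:
    """Toy tracker: intensity-weighted centroid (row, col) for every frame."""
    _, height, width = frames.shape
    rows = np.arange(height)[None, :, None]
    cols = np.arange(width)[None, None, :]
    mass = frames.sum(axis=(1, 2))
    r = (frames * rows).sum(axis=(1, 2)) / mass
    c = (frames * cols).sum(axis=(1, 2)) / mass
    return np.stack([r, c], axis=1)

# Two low frame-rate frames: a bright pixel jumps from column 2 to column 6.
low = np.zeros((2, 8, 8))
low[0, 2, 2] = 1.0
low[1, 2, 6] = 1.0

latent = interpolate_frames(low, factor=4)  # denser intermediate sequence
track = track_centroid(latent)              # one (row, col) estimate per latent frame
print(latent.shape)
print(track)
```

In the framework the abstract describes, both stages would be neural sub-networks optimised together, so the intermediate sequence is shaped by the tracking objective rather than by a fixed blending rule as in this sketch.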
Pages: 1066-1072
Page count: 7