Temporal Query Networks for Fine-grained Video Understanding

Cited by: 50
Authors
Zhang, Chuhan [1]
Gupta, Ankush [2]
Zisserman, Andrew [1]
Affiliations
[1] Univ Oxford, Oxford, England
[2] DeepMind, London, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
DOI
10.1109/CVPR46437.2021.00446
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Our objective in this work is fine-grained classification of actions in untrimmed videos, where the actions may be temporally extended or may span only a few frames of the video. We cast this into a query-response mechanism, where each query addresses a particular question and has its own response label set. We make the following four contributions: (i) We propose a new model, the Temporal Query Network (TQN), which enables the query-response functionality and a structural understanding of fine-grained actions. It attends to relevant segments for each query with a temporal attention mechanism, and can be trained using only the labels for each query. (ii) We propose a new way, stochastic feature bank update, to train a network on videos of various lengths with the dense sampling required to respond to fine-grained queries. (iii) We compare the TQN to other architectures and text supervision methods, and analyze their pros and cons. Finally, (iv) we evaluate the method extensively on the FineGym and Diving48 benchmarks for fine-grained action classification and surpass the state of the art using only RGB features. Project page: https://www.robots.ox.ac.uk/~vgg/research/tqn/.
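
The query-response idea in the abstract can be illustrated with a small sketch. This is not the authors' architecture: it assumes a single dot-product attention step over per-frame features, followed by a separate linear classifier for each query, and all parameters are random and untrained. The query names, label-set sizes, feature dimension, and helper functions (`attend`, `respond`) are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, features):
    """Weight each time step by its scaled similarity to the query vector,
    then pool the features with those weights."""
    dim = len(query)
    scores = [dot(query, f) / math.sqrt(dim) for f in features]
    weights = softmax(scores)
    pooled = [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]
    return pooled, weights


def respond(query, classifier, features):
    """Return a distribution over this query's own label set,
    plus the temporal attention weights that produced it."""
    pooled, weights = attend(query, features)
    logits = [dot(row, pooled) for row in classifier]
    return softmax(logits), weights


# Hypothetical setup: 20 time steps of 8-dimensional visual features.
random.seed(0)
DIM, NUM_STEPS = 8, 20
features = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_STEPS)]

# Each query has its own response label set (names and sizes are assumptions).
label_sets = {"num_somersaults": 4, "takeoff_type": 3}
queries = {name: [random.gauss(0, 1) for _ in range(DIM)] for name in label_sets}
classifiers = {
    name: [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(size)]
    for name, size in label_sets.items()
}

responses = {
    name: respond(queries[name], classifiers[name], features)
    for name in label_sets
}

for name, (probs, weights) in responses.items():
    most_attended = max(range(NUM_STEPS), key=lambda t: weights[t])
    print(name, "-> label distribution of size", len(probs),
          "| most attended time step:", most_attended)
```

In this sketch each query produces its own attention pattern over time and its own label distribution, so supervision can be attached per query without temporal annotations, which mirrors the claim that the model "can be trained using only the labels for each query".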
Pages: 4484-4494
Number of pages: 11