Fine-Grained Spatiotemporal Motion Alignment for Contrastive Video Representation Learning

Cited by: 2
Authors
Zhu, Minghao [1]
Lin, Xiao [1]
Dang, Ronghao [1]
Liu, Chengju [1]
Chen, Qijun [1]
Affiliations
[1] Tongji Univ, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Self-supervised Learning; Action Recognition;
DOI
10.1145/3581783.3611932
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As the most essential property in a video, motion information is critical to a robust and generalized video representation. To inject motion dynamics, recent works have adopted frame difference as the source of motion information in video contrastive learning, considering the trade-off between quality and cost. However, existing works align motion features at the instance level, which suffers from spatial and temporal weak alignment across modalities. In this paper, we present a Fine-grained Motion Alignment (FIMA) framework, capable of introducing well-aligned and significant motion information. Specifically, we first develop a dense contrastive learning framework in the spatiotemporal domain to generate pixel-level motion supervision. Then, we design a motion decoder and a foreground sampling strategy to eliminate the weak alignments in terms of time and space. Moreover, a frame-level motion contrastive loss is presented to improve the temporal diversity of the motion features. Extensive experiments demonstrate that the representations learned by FIMA possess great motion-awareness capabilities and achieve state-of-the-art or competitive results on downstream tasks across UCF101, HMDB51, and Diving48 datasets. Code is available at https://github.com/ZMHH-H/FIMA.
Pages: 4725-4736
Number of pages: 12
Related papers
50 in total
  • [21] Learning Fine-Grained Motion Embedding for Landscape Animation
    Xue, Hongwei
    Liu, Bei
    Yang, Huan
    Fu, Jianlong
    Li, Houqiang
    Luo, Jiebo
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 291 - 299
  • [22] Contrastive Learning for Fine-Grained Ship Classification in Remote Sensing Images
    Chen, Jianqi
    Chen, Keyan
    Chen, Hao
    Li, Wenyuan
    Zou, Zhengxia
    Shi, Zhenwei
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [23] Class-Balanced Contrastive Learning for Fine-Grained Airplane Detection
    Li, Yan
    Wang, Qixiong
    Luo, Xiaoyan
    Yin, Jihao
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2022, 19
  • [24] Fine-grained biomedical knowledge negation detection via contrastive learning
    Zhu, Tiantian
    Xiang, Yang
    Chen, Qingcai
    Qin, Yang
    Hu, Baotian
    Zhang, Wentai
    KNOWLEDGE-BASED SYSTEMS, 2023, 272
  • [25] Discrimination-Aware Mechanism for Fine-grained Representation Learning
    Xu, Furong
    Wang, Meng
    Zhang, Wei
    Cheng, Yuan
    Chu, Wei
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 813 - 822
  • [26] Fine-Grained Fashion Representation Learning by Online Deep Clustering
    Jiao, Yang
    Xie, Ning
    Gao, Yan
    Wang, Chien-Chih
    Sun, Yi
    COMPUTER VISION - ECCV 2022, PT XXVII, 2022, 13687 : 19 - 35
  • [27] Learning Deep Bilinear Transformation for Fine-grained Image Representation
    Zheng, Heliang
    Fu, Jianlong
    Zha, Zheng-Jun
    Luo, Jiebo
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [28] Fine-Grained Motion Representation For Template-Free Visual Tracking
    Shuang, Kai
    Huang, Yuheng
    Sun, Yue
    Cai, Zhun
    Guo, Hao
    2020 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2020, : 660 - 669
  • [29] Fine-grained Action Recognition with Robust Motion Representation Decoupling and Concentration
    Sun, Baoli
    Ye, Xinchen
    Yan, Tiantian
    Wang, Zhihui
    Li, Haojie
    Wang, Zhiyong
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 4779 - 4788
  • [30] RADIAL LOSS FOR LEARNING FINE-GRAINED VIDEO SIMILARITY METRIC
    Jain, Abhinav
    Agarwal, Prerna
    Mujumdar, Shashank
    Gupta, Nitin
    Mehta, Sameep
    Chattopadhyay, Chiranjoy
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 1652 - 1656