VIDEOWHISPER: Toward Discriminative Unsupervised Video Feature Learning With Attention-Based Recurrent Neural Networks

Cited by: 20
Authors
Zhao, Na [1]
Zhang, Hanwang [1]
Hong, Richang [2]
Wang, Meng [2]
Chua, Tat-Seng [1]
Affiliations
[1] Natl Univ Singapore, Sch Comp, Singapore 117417, Singapore
[2] Hefei Univ Technol, Sch Comp & Informat, Hefei 230009, Anhui, Peoples R China
Funding
National Research Foundation, Singapore;
Keywords
Recurrent neural networks; sequence learning; unsupervised feature learning; video features; RECOGNITION;
DOI
10.1109/TMM.2017.2722687
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
We present VIDEOWHISPER, a novel approach for unsupervised video representation learning. Based on the observation that the frame sequence encodes the temporal dynamics of a video (e.g., object movement and event evolution), we treat the frame sequential order as a self-supervision to learn video representations. Unlike other unsupervised video feature learning methods based on frame-level feature reconstruction that is sensitive to visual variance, VIDEOWHISPER is driven by a novel video "sequence-to-whisper" learning strategy. Specifically, for each video sequence, we use a prelearned visual dictionary to generate a sequence of high-level semantics, dubbed "whisper," which can be considered as the language describing the video dynamics. In this way, we model VIDEOWHISPER as an end-to-end sequence-to-sequence learning model using attention-based recurrent neural networks. This model is trained to predict the whisper sequence and hence it is able to learn the temporal structure of videos. We propose two ways to generate video representation from the model. Through extensive experiments on two real-world video datasets, we demonstrate that video representation learned by VIDEOWHISPER is effective to boost fundamental multimedia applications such as video retrieval and event classification.
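The "whisper" generation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only states that a prelearned visual dictionary produces a sequence of high-level semantics, so the nearest-neighbor assignment, the function name `frames_to_whisper`, and the toy 2-D features below are all assumptions made for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def frames_to_whisper(frame_features, dictionary):
    """Map each frame feature to the index of its nearest dictionary entry.

    Assumption: nearest-neighbor quantization stands in for whatever
    assignment rule the paper actually uses.
    """
    whisper = []
    for feature in frame_features:
        nearest = min(
            range(len(dictionary)),
            key=lambda k: dist(feature, dictionary[k]),
        )
        whisper.append(nearest)
    return whisper


# Hypothetical toy data: three dictionary entries and four frame features.
dictionary = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
frames = [(0.1, 0.0), (0.9, 0.1), (0.2, 0.8), (0.1, 0.9)]

whisper = frames_to_whisper(frames, dictionary)
print(whisper)  # [0, 1, 2, 2]
```

Under these assumptions, the resulting index sequence plays the role of the target that a sequence-to-sequence model would be trained to predict from the frame sequence.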
Pages: 2080-2092
Number of pages: 13