Video Emotion Recognition with Transferred Deep Feature Encodings

Cited by: 39
Authors
Xu, Baohan [1]
Fu, Yanwei [2, 3]
Jiang, Yu-Gang [1]
Li, Boyang [3]
Sigal, Leonid [3]
Affiliations
[1] Fudan Univ, Shanghai Key Lab Intelligent Informat Proc, Sch Comp Sci, Yangpu Qu, Shanghai Shi, Peoples R China
[2] Fudan Univ, Sch Data Sci, Yangpu Qu, Shanghai Shi, Peoples R China
[3] Disney Res, Orlando, FL 32830 USA
DOI
10.1145/2911996.2912006
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
Despite growing research interest, emotion understanding for user-generated videos remains a challenging problem. Major obstacles include the diversity and complexity of video content, as well as the sparsity of expressed emotions. For the first time, we systematically study large-scale video emotion recognition by transferring deep feature encodings. In addition to the traditional, supervised recognition, we study the problem of zero-shot emotion recognition, where emotions in the test set are unseen during training. To cope with this task, we utilize knowledge transferred from auxiliary image and text corpora. A novel auxiliary Image Transfer Encoding (ITE) process is proposed to efficiently encode and generate video representation. We also thoroughly investigate different configurations of convolutional neural networks. Comprehensive experiments on multiple datasets demonstrate the effectiveness of our framework.
Pages: 15-22
Page count: 8