Efficient Decomposition Convolution and Temporal Pyramid Network for Video Face Recognition

Authors
Zhou S.-T. [1]
Yan X. [1]
Xie Z.-S. [1]
Affiliation
[1] Glasgow College, University of Electronic Science and Technology of China, Chengdu
Keywords
Convolutional neural network; Decomposition convolution; Face recognition; Temporal pyramid network; Video analysis
DOI
10.12178/1001-0548.2020319
Abstract
With the spread of video surveillance and camera networks, face recognition over continuous video frames in unconstrained scenes is attracting growing interest. Most traditional face recognition methods for continuous video frames suffer from fluctuating recognition results and heavy computational cost. This paper designs an efficient 3D decomposition convolution that reduces the computational cost of video face recognition while improving recognition accuracy. It also proposes a temporal pyramid network that mines complementary information between frames to further improve recognition accuracy. Performance is evaluated on the YTF and PaSC datasets. Copyright ©2021 Journal of University of Electronic Science and Technology of China. All rights reserved.
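The abstract does not say how the 3D convolution is decomposed. As a rough illustration only, the sketch below assumes a common factorization (a spatial 1 x k x k convolution followed by a temporal k x 1 x 1 convolution) and compares weight counts against a full 3D kernel. The function names, channel widths, and kernel sizes are hypothetical and are not taken from the paper.

```python
def conv3d_params(c_in, c_out, kt, kh, kw):
    """Weight count of a full 3D convolution (bias ignored)."""
    return c_in * c_out * kt * kh * kw


def decomposed_params(c_in, c_mid, c_out, kt, kh, kw):
    """Weight count of a spatial (1 x kh x kw) convolution followed by
    a temporal (kt x 1 x 1) convolution (bias ignored).

    This spatial-then-temporal split is an assumed factorization,
    not necessarily the one used by the authors.
    """
    spatial = c_in * c_mid * kh * kw
    temporal = c_mid * c_out * kt
    return spatial + temporal


# Hypothetical layer: 64 input, intermediate, and output channels, 3 x 3 x 3 kernel.
full = conv3d_params(64, 64, 3, 3, 3)
split = decomposed_params(64, 64, 64, 3, 3, 3)
ratio = split / full

print(full, split, round(ratio, 3))
```

Under these assumed sizes the factorized layer needs 12/27 of the weights of the full 3D kernel, which is the kind of saving the abstract refers to as reduced computational consumption.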
Pages: 231-235
Page count: 4