15 entries in total
- [1] LI D X, CHEN Y M, GAO M K, et al., Multimodal gesture recognition using densely connected convolution and BLSTM, 2018 24th International Conference on Pattern Recognition (ICPR), pp. 3365-3370, (2018)
- [2] ABAVISANI M, JOZE H R V, PATEL V M., Improving the performance of unimodal dynamic hand-gesture recognition with multimodal training, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1165-1174, (2020)
- [3] WU C Y, ZAHEER M, HU H X, et al., Compressed video action recognition, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6026-6035, (2018)
- [4] XIE X Y, ZHAO H, JIANG L., Dynamic gesture recognition based on characteristics of encoded video data, Journal of Beijing University of Posts and Telecommunications, 43, 5, pp. 91-97, (2020)
- [5] WANG L M, XIONG Y J, WANG Z, et al., Temporal segment networks: towards good practices for deep action recognition, European Conference on Computer Vision, pp. 20-36, (2016)
- [6] LIN J, GAN C, HAN S., TSM: temporal shift module for efficient video understanding, 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 7082-7092, (2020)
- [7] WANG Q L, WU B G, ZHU P F, et al., ECA-net: efficient channel attention for deep convolutional neural networks, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 11531-11539, (2020)
- [8] CHO K, VAN MERRIENBOER B, GULCEHRE C, et al., Learning phrase representations using RNN encoder-decoder for statistical machine translation, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1724-1734, (2014)
- [9] OHN-BAR E, TRIVEDI M M., Hand gesture recognition in real time for automotive interfaces: a multimodal vision-based approach and evaluations, IEEE Transactions on Intelligent Transportation Systems, 15, 6, pp. 2368-2377, (2014)
- [10] TRAN D, BOURDEV L, FERGUS R, et al., Learning spatiotemporal features with 3D convolutional networks, 2015 IEEE International Conference on Computer Vision (ICCV), pp. 4489-4497, (2016)