- [21] Chang X B, Hospedales T M, Xiang T., Multi-level factorisation net for person re-identification, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2109-2118, (2018)
- [22] Howard A G, Zhu M, Chen B, et al., MobileNets: efficient convolutional neural networks for mobile vision applications
- [23] Buades A, Coll B, Morel J M., A non-local algorithm for image denoising, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 60-65, (2005)
- [24] Vaswani A, Shazeer N, Parmar N, et al., Attention is all you need, Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 6000-6010, (2017)
- [25] Soomro K, Zamir A R, Shah M., UCF101: a dataset of 101 human actions classes from videos in the wild
- [26] Kuehne H, Jhuang H, Garrote E, et al., HMDB: a large video database for human motion recognition, Proceedings of the IEEE International Conference on Computer Vision, pp. 2556-2563, (2011)
- [27] Sandler M, Howard A, Zhu M, et al., MobileNetV2: inverted residuals and linear bottlenecks, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4510-4520, (2018)
- [28] Varol G, Laptev I, Schmid C., Long-term temporal convolutions for action recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, 40, 6, pp. 1510-1517, (2018)
- [29] Duta I C, Ionescu B, Aizawa K, et al., Spatio-temporal vector of locally max pooled features for action recognition in videos, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3205-3214, (2017)
- [30] Butt A M, Yousaf M H, Murtaza F, et al., Agglomerative clustering and residual-VLAD encoding for human action recognition, Applied Sciences, 10, 12, (2020)