50 entries in total
- [41] STVGBert: A Visual-linguistic Transformer based Framework for Spatio-temporal Video Grounding. 2021 IEEE/CVF International Conference on Computer Vision (ICCV 2021), 2021: 1513-1522
- [44] PQK: Model Compression via Pruning, Quantization, and Knowledge Distillation. INTERSPEECH 2021, 2021: 4568-4572
- [46] VL-LTR: Learning Class-wise Visual-Linguistic Representation for Long-Tailed Visual Recognition. Computer Vision, ECCV 2022, Pt. XXV, 2022, 13685: 73-91