50 entries in total
- [42] TS2-Net: Token Shift and Selection Transformer for Text-Video Retrieval. Computer Vision - ECCV 2022, Pt XIV, 2022, 13674: 319-335
- [43] Text-guided visual representation learning for medical image retrieval systems. 2022 26th International Conference on Pattern Recognition (ICPR), 2022: 593-598
- [44] Video and Text Matching with Conditioned Embeddings. 2022 IEEE Winter Conference on Applications of Computer Vision (WACV 2022), 2022: 478-487
- [46] CelebV-Text: A Large-Scale Facial Text-Video Dataset. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023: 14805-14814
- [48] Text-Guided Object Detector for Multi-modal Video Question Answering. 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023: 1032-1042
- [49] Dual-Modal Attention-Enhanced Text-Video Retrieval with Triplet Partial Margin Contrastive Learning. Proceedings of the 31st ACM International Conference on Multimedia, MM 2023, 2023: 4626-4636
- [50] Fine-grained Cross-modal Alignment Network for Text-Video Retrieval. Proceedings of the 29th ACM International Conference on Multimedia, MM 2021, 2021: 3826-3834