共 50 条
- [1] Utilizing Text-Video Relationships: A Text-Driven Multi-modal Fusion Framework for Moment Retrieval and Highlight Detection PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT X, 2025, 15040 : 254 - 268
- [2] CRET: Cross-Modal Retrieval Transformer for Efficient Text-Video Retrieval PROCEEDINGS OF THE 45TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '22), 2022, : 949 - 959
- [4] Multi-Modal Knowledge Hypergraph for Diverse Image Retrieval THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 3, 2023, : 3376 - 3383
- [6] Efficient text-to-video retrieval via multi-modal multi-tagger derived pre-screening Visual Intelligence, 2025, 3 (1):
- [9] Tagging before Alignment: Integrating Multi-Modal Tags for Video-Text Retrieval THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 1, 2023, : 396 - 404
- [10] Fine-grained Cross-modal Alignment Network for Text-Video Retrieval PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 3826 - 3834