50 entries in total
- [41] ViLTA: Enhancing Vision-Language Pre-training through Textual Augmentation. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023: 3135-3146
- [42] EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
- [43] Fine-Grained Semantically Aligned Vision-Language Pre-Training. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
- [45] BUS: Efficient and Effective Vision-language Pre-training with Bottom-Up Patch Summarization. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023: 2888-2898
- [47] IDEA: Increasing Text Diversity via Online Multi-Label Recognition for Vision-Language Pre-training. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022: 4573-4583
- [48] MAKE: Vision-Language Pre-training based Product Retrieval in Taobao Search. COMPANION OF THE WORLD WIDE WEB CONFERENCE, WWW 2023, 2023: 356-360
- [49] VLMO: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
- [50] Enhancing Vision-Language Pre-Training with Jointly Learned Questioner and Dense Captioner. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023: 5120-5131