50 entries in total
- [21] Image Captioning in Turkish Language. 2019 INNOVATIONS IN INTELLIGENT SYSTEMS AND APPLICATIONS CONFERENCE (ASYU), 2019: 413-417
- [22] Vision-and-Language or Vision-for-Language? On Cross-Modal Influence in Multimodal Transformers. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021: 9847-9857
- [23] NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models. THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 7, 2024: 7641-7649
- [25] ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
- [27] Vital information matching in vision-and-language navigation. FRONTIERS IN NEUROROBOTICS, 2022, 16
- [28] VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
- [29] Local Slot Attention for Vision-and-Language Navigation. PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2022, 2022: 545-553