50 items in total
- [23] NLX-GPT: A Model for Natural Language Explanations in Vision and Vision-Language Tasks. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022: 8312-8322.
- [24] Unified Visual Relationship Detection with Vision and Language Models. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), 2023: 6939-6950.
- [26] TALON: Improving Large Language Model Cognition with Tactility-Vision Fusion. 2024 IEEE 19th Conference on Industrial Electronics and Applications (ICIEA 2024), 2024.
- [27] Distilling Large Vision-Language Model with Out-of-Distribution Generalizability. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), 2023: 2492-2503.
- [28] Hierarchical Vision and Language Transformer for Efficient Visual Dialog. Artificial Neural Networks and Machine Learning, ICANN 2023, Pt VI, 2023, 14259: 421-432.
- [29] IVTP: Instruction-Guided Visual Token Pruning for Large Vision-Language Models. Computer Vision - ECCV 2024, Pt XVII, 2025, 15075: 214-230.
- [30] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024: 13872-13882.