7 records in total
- [2] Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023 : 5038 - 5047
- [4] VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
- [5] ViLLA: Fine-Grained Vision-Language Representation Learning from Real-World Data 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023 : 22168 - 22178
- [6] Depth-Aware Blind Image Decomposition for Real-World Adverse Weather Recovery COMPUTER VISION-ECCV 2024, PT LXXXII, 2025, 15140 : 379 - 397
- [7] Active vision and image/video understanding systems built upon network-symbolic models for perception-based navigation of mobile robots in real-world environments MOBILE ROBOTS XVII, 2004, 5609 : 35 - 49