- [22] Improving Factual Completeness and Consistency of Image-to-Text Radiology Report Generation. 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2021), 2021, pp. 5288-5304
- [23] Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 23034-23044
- [24] Text-to-Image Retrieval Based on Incremental Association via Multimodal Hypernetworks. Proceedings 2012 IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2012, pp. 3245-3250
- [26] Semantic Enhanced Sketch Based Image Retrieval with Incomplete Multimodal Query. 2020 IEEE Sixth International Conference on Multimedia Big Data (BigMM 2020), 2020, pp. 86-93
- [28] Image2Text2Image: A Novel Framework for Label-Free Evaluation of Image-to-Text Generation with Text-to-Image Diffusion Models. Multimedia Modeling, MMM 2025, Pt IV, 2025, vol. 15523, pp. 413-427
- [29] Understanding image-text relations and news values for multimodal news analysis. Frontiers in Artificial Intelligence, 2023, vol. 6
- [30] Sequential Structured Fusion of Image and Text for Enhanced Multimodal Abstractive Summarization. Natural Language Processing and Chinese Computing, Pt IV, NLPCC 2024, 2025, vol. 15362, pp. 290-302