50 items in total
- [1] Compact Trilinear Interaction for Visual Question Answering. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019: 392-401
- [2] Trilinear Distillation Learning and Question Feature Capturing for Medical Visual Question Answering. NEURAL COMPUTING FOR ADVANCED APPLICATIONS, NCAA 2024, PT III, 2025, 2183: 162-177
- [3] Multimodal Learning and Reasoning for Visual Question Answering. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
- [4] Learning Representations from Explainable and Connectionist Approaches for Visual Question Answering. 2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024: 6420-6424
- [6] Multimodal Attention for Visual Question Answering. INTELLIGENT COMPUTING, VOL 1, 2019, 858: 783-792
- [7] Fusing Visual and Textual Representations via Multi-layer Fusing Transformers for Vietnamese Visual Question Answering. ADVANCES IN COMPUTATIONAL COLLECTIVE INTELLIGENCE, ICCCI 2024, PT II, 2024, 2166: 185-196
- [8] Faithful Multimodal Explanation for Visual Question Answering. BLACKBOXNLP WORKSHOP ON ANALYZING AND INTERPRETING NEURAL NETWORKS FOR NLP AT ACL 2019, 2019: 103-112
- [9] Visual Question Answering with Textual Representations for Images. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021: 3147-3150
- [10] Adaptive Transformers for Learning Multimodal Representations. 58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020): STUDENT RESEARCH WORKSHOP, 2020: 1-7