50 items in total
- [5] Multi-Modal Fusion Transformer for Visual Question Answering in Remote Sensing. Image and Signal Processing for Remote Sensing XXVIII, 2022, 12267
- [9] Gated Multi-modal Fusion with Cross-modal Contrastive Learning for Video Question Answering. Artificial Neural Networks and Machine Learning, ICANN 2023, Pt VII, 2023, 14260: 427-438