5 entries in total
- [1] Visually Guided Binaural Audio Generation Method Based on Hierarchical Feature Encoding and Decoding. Ruan Jian Xue Bao/Journal of Software, 2024, 35(05): 2165-2175
- [2] QUALIFIER: Question-Guided Self-Attentive Multimodal Fusion Network for Audio Visual Scene-Aware Dialog. 2022 IEEE Winter Conference on Applications of Computer Vision (WACV 2022), 2022: 2503-2511
- [3] TMT: A Transformer-based Modal Translator for Improving Multimodal Sequence Representations in Audio Visual Scene-aware Dialog. INTERSPEECH 2020, 2020: 3501-3505
- [4] Hierarchical multimodal attention for end-to-end audio-visual scene-aware dialogue response generation. Computer Speech and Language, 2020, 63
- [5] End-to-End Audio Visual Scene-Aware Dialog Using Multimodal Attention-Based Video Features. 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019: 2352-2356