50 entries in total
- [1] Clotho-AQA: A Crowdsourced Dataset for Audio Question Answering. 2022 30th European Signal Processing Conference (EUSIPCO 2022), 2022: 1140-1144
- [2] Audio Difference Learning for Audio Captioning. 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2024), 2024: 1456-1460
- [4] Training Audio Captioning Models Without Audio. 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2024), 2024: 371-375
- [5] Audio Captioning Based on Combined Audio and Semantic Embeddings. 2020 IEEE International Symposium on Multimedia (ISM 2020), 2020: 41-48
- [8] MemeCap: A Dataset for Captioning and Interpreting Memes. 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), 2023: 1433-1445
- [10] Joint Speech Recognition and Audio Captioning. 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022: 7892-7896