50 entries in total
- [1] Bidirectional Attentive Fusion with Context Gating for Dense Video Captioning. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 7190-7198
- [2] Multimodal Pretraining for Dense Video Captioning. 1ST CONFERENCE OF THE ASIA-PACIFIC CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 10TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (AACL-IJCNLP 2020), 2020: 470-490
- [3] Multimodal Interaction Fusion Network Based on Transformer for Video Captioning. ARTIFICIAL INTELLIGENCE AND ROBOTICS, ISAIR 2022, PT I, 2022, 1700: 21-36
- [7] Position embedding fusion on transformer for dense video captioning. DEVELOPMENTS OF ARTIFICIAL INTELLIGENCE TECHNOLOGIES IN COMPUTATION AND ROBOTICS, 2020, 12: 792-799
- [8] Cross-Domain Modality Fusion for Dense Video Captioning. IEEE Transactions on Artificial Intelligence, 2022, 3(05): 763-777