50 entries in total
- [31] Multimodal-enhanced hierarchical attention network for video captioning Multimedia Systems, 2023, 29 : 2469 - 2482
- [34] Multimodal Interaction Fusion Network Based on Transformer for Video Captioning Artificial Intelligence and Robotics, ISAIR 2022, Pt I, 2022, 1700 : 21 - 36
- [37] Image/video captioning Ushiku, Yoshitaka, 2018, Inst. of Image Information and Television Engineers (72) : 650 - 654
- [39] Semantic Enhanced Encoder-Decoder Network (SEN) for Video Captioning Proceedings of the 2nd Workshop on Multimedia for Accessible Human Computer Interfaces (MAHCI '19), 2019 : 25 - 32