50 entries in total
- [41] Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
- [42] Adoption of social robots as pedagogical aids for efficient learning of second language vocabulary to children. JOURNAL OF E-LEARNING AND KNOWLEDGE SOCIETY, 2021, 17 (03): 119-126
- [44] A developmental model of audio-visual attention (MAVA) for bimodal language learning in infants and robots. SCIENTIFIC REPORTS, 2024, 14 (01)
- [45] BRIDGING HIGH-QUALITY AUDIO AND VIDEO VIA LANGUAGE FOR SOUND EFFECTS RETRIEVAL FROM VISUAL QUERIES. 2023 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS, WASPAA, 2023
- [46] Learning Relationships between Text, Audio, and Video via Deep Canonical Correlation for Multimodal Language Analysis. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34: 8992-8999
- [47] An Actor-Based Video Segmentation System Using Visual and Audio Information in E-Learning. ISDA 2008: EIGHTH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS, VOL 3, PROCEEDINGS, 2008: 575-580
- [48] CoLeaF: A Contrastive-Collaborative Learning Framework for Weakly Supervised Audio-Visual Video Parsing. COMPUTER VISION - ECCV 2024, PT XI, 2025, 15069: 1-17