50 items in total
- [31] LayoutMask: Enhance Text-Layout Interaction in Multi-modal Pre-training for Document Understanding PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023: 15200 - 15212
- [32] Pre-Training Multi-Modal Dense Retrievers for Outside-Knowledge Visual Question Answering PROCEEDINGS OF THE 2023 ACM SIGIR INTERNATIONAL CONFERENCE ON THE THEORY OF INFORMATION RETRIEVAL, ICTIR 2023, 2023: 169 - 176
- [33] LayoutLMv2: Multi-modal Pre-training for Visually-rich Document Understanding 59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (ACL-IJCNLP 2021), VOL 1, 2021: 2579 - 2591
- [34] RAMM: Retrieval-augmented Biomedical Visual Question Answering with Multi-modal Pre-training PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023: 547 - 556
- [36] WenLan: Efficient Large-Scale Multi-Modal Pre-Training on Real World Data MMPT '21: PROCEEDINGS OF THE 2021 WORKSHOP ON MULTI-MODAL PRE-TRAINING FOR MULTIMEDIA UNDERSTANDING, 2021: 3 - 3
- [37] Multi-modal U-Nets with Boundary Loss and Pre-training for Brain Tumor Segmentation BRAINLESION: GLIOMA, MULTIPLE SCLEROSIS, STROKE AND TRAUMATIC BRAIN INJURIES (BRAINLES 2019), PT II, 2020, 11993 : 135 - 147
- [39] Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023: 15888 - 15899