5 items in total
- [1] DocLLM: A Layout-Aware Generative Language Model for Multimodal Document Understanding. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, Vol. 1: Long Papers, 2024: 8529-8548
- [2] Layout-Aware Information Extraction for Document-Grounded Dialogue: Dataset, Method and Demonstration. Proceedings of the 30th ACM International Conference on Multimedia, MM 2022, 2022: 7252-7260
- [3] XYLayoutLM: Towards Layout-Aware Multimodal Networks For Visually-Rich Document Understanding. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022), 2022: 4573-4582
- [4] TIB: A Dataset for Abstractive Summarization of Long Multimodal Videoconference Records. 20th International Conference on Content-Based Multimedia Indexing, CBMI 2023, 2023: 61-70
- [5] Presenting an Order-Aware Multimodal Fusion Framework for Financial Advisory Summarization With an Exclusive Video Dataset. IEEE Access, 2025, 13: 48367-48379