LoRaLay: A Multilingual and Multimodal Dataset for Long Range and Layout-Aware Summarization

Cited by: 0

Authors:

Nguyen, Laura [1,3]

Scialom, Thomas [1,2]

Piwowarski, Benjamin [3]

Staiano, Jacopo [1,4]

Affiliations:

[1] reciTAL, Paris, France

[2] Meta AI, Paris, France

[3] Sorbonne Univ, CNRS, ISIR, F-75005 Paris, France

[4] Univ Trento, Trento, TN, Italy

Source:

17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023 | 2023

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Text Summarization is a popular task and an active area of research for the Natural Language Processing community. It requires accounting for long input texts, a characteristic which poses computational challenges for neural models. Moreover, real-world documents come in a variety of complex, visually-rich, layouts. This information is of great relevance, whether to highlight salient content or to encode long-range interactions between textual passages. Yet, all publicly available summarization datasets only provide plain text content. To facilitate research on how to exploit visual/layout information to better capture long-range dependencies in summarization models, we present LoRaLay, a collection of datasets for long-range summarization with accompanying visual/layout information. We extend existing and popular English datasets (arXiv and PubMed) with visual/layout information and propose four novel datasets - consistently built from scholar resources - covering French, Spanish, Portuguese, and Korean languages. Further, we propose new baselines merging layout-aware and long-range models - two orthogonal approaches - and obtain state-of-the-art results, showing the importance of combining both lines of research.

Citations

Pages: 636-651

Page count: 16

5 items in total

[1] DocLLM: A Layout-Aware Generative Language Model for Multimodal Document Understanding
Wang, Dongsheng
Raman, Natraj
Sibue, Mathieu
Ma, Zhiqiang
Babkin, Petr
Kaur, Simerjot
Pei, Yulong
Nourbakhsh, Armineh
Liu, Xiaomo
PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 8529 - 8548
[2] Layout-Aware Information Extraction for Document-Grounded Dialogue: Dataset, Method and Demonstration
Zhang, Zhenyu
Yu, Bowen
Yu, Haiyang
Liu, Tingwen
Fu, Cheng
Li, Jingyang
Tang, Chengguang
Sun, Jian
Li, Yongbin
PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 7252 - 7260
[3] XYLayoutLM: Towards Layout-Aware Multimodal Networks For Visually-Rich Document Understanding
Gu, Zhangxuan
Meng, Changhua
Wang, Ke
Lan, Jun
Wang, Weiqiang
Gu, Ming
Zhang, Liqing
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 4573 - 4582
[4] TIB: A Dataset for Abstractive Summarization of Long Multimodal Videoconference Records
Gigant, Theo
Dufaux, Frederic
Guinaudeau, Camille
Decombas, Marc
20TH INTERNATIONAL CONFERENCE ON CONTENT-BASED MULTIMEDIA INDEXING, CBMI 2023, 2023, : 61 - 70
[5] Presenting an Order-Aware Multimodal Fusion Framework for Financial Advisory Summarization With an Exclusive Video Dataset
Das, Sarmistha
Ghosh, Samrat
Tiwari, Abhisek
Lynghoi, R. E. Zera Marveen
Saha, Sriparna
Murad, Zak
Maurya, Alka
IEEE ACCESS, 2025, 13 : 48367 - 48379
