KAT5: Knowledge-Aware Transfer Learning with a Text-to-Text Transfer Transformer

Cited by: 0
Authors
Sohrab, Mohammad Golam [1 ]
Miwa, Makoto [1 ,2 ]
Affiliations
[1] Natl Inst Adv Ind Sci & Technol, Artificial Intelligence Res Ctr, Tokyo, Japan
[2] Toyota Technol Inst, Nagoya, Aichi, Japan
Keywords
Natural language processing; Transfer learning; Language model; Sequence-to-Sequence; Language understanding and generation; Information extraction; Machine translation
DOI
10.1007/978-3-031-70378-2_10
Chinese Library Classification
TP18 [Theory of artificial intelligence]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
We introduce knowledge-aware transfer learning with a text-to-text transfer transformer (KAT5), which leverages the text-to-text transfer transformer (T5) in the Wikipedia domain. In standard transfer learning such as T5, a model is first pre-trained on unsupervised data with a language-model objective and then fine-tuned on a downstream task. T5 explores several learning objectives, including masked language modeling (MLM), random span corruption, and deshuffling, but these objectives give the model limited opportunity to integrate external knowledge during pre-training. Here, we push the limits of this model by grafting in knowledge such as entity and co-reference information, obtained by mapping Wikipedia to Wikidata during pre-training. We construct large-scale alignments between Wikipedia abstracts and Wikidata triples to facilitate pre-training of our KAT5 model. Our approach can match or outperform task-specific models while using the same architecture and hyper-parameters, in particular on entity and relation extraction (the CoNLL04, ADE, and NYT datasets) and on language generation tasks, including abstractive summarization (XSum, CNNDM) and machine translation. Our code is publicly released on GitHub (https://github.com/aistairc/kat5) under the Apache 2.0 License.
Pages: 157-173
Number of pages: 17
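
The abstract above combines two mechanisms: grafting entity and co-reference knowledge into the pre-training text via a Wikipedia-Wikidata mapping, and T5-style unsupervised objectives such as span corruption. As a rough illustration of how such a pre-training example could be built, here is a minimal sketch; the [E]/[/E] marker scheme, the function names mark_entities and span_corrupt, and the hyper-parameters are illustrative assumptions, not taken from the KAT5 paper or its repository.

```python
# Hypothetical sketch: mark Wikidata-aligned entity spans in a Wikipedia
# sentence, then apply T5-style span corruption with sentinel tokens.
# All names and the [E]/[/E] scheme are assumptions for illustration.
import random


def mark_entities(tokens, entity_spans):
    """Wrap aligned entity mentions in [E] ... [/E] markers (assumed scheme)."""
    out, i = [], 0
    for start, end in sorted(entity_spans):  # spans are (start, end), end exclusive
        out.extend(tokens[i:start])
        out.append("[E]")
        out.extend(tokens[start:end])
        out.append("[/E]")
        i = end
    out.extend(tokens[i:])
    return out


def span_corrupt(tokens, mask_ratio=0.15, mean_span=3, seed=0):
    """T5-style span corruption: replace spans with <extra_id_N> sentinels."""
    rng = random.Random(seed)
    inputs, targets, i, sentinel = [], [], 0, 0
    while i < len(tokens):
        # Each position starts a masked span with probability mask_ratio / mean_span,
        # so roughly mask_ratio of all tokens end up masked.
        if rng.random() < mask_ratio / mean_span:
            span = min(mean_span, len(tokens) - i)
            inputs.append(f"<extra_id_{sentinel}>")
            targets.append(f"<extra_id_{sentinel}>")
            targets.extend(tokens[i:i + span])
            sentinel += 1
            i += span
        else:
            inputs.append(tokens[i])
            i += 1
    targets.append(f"<extra_id_{sentinel}>")  # closing sentinel, as in T5
    return " ".join(inputs), " ".join(targets)


sentence = "Barack Obama was born in Honolulu , Hawaii .".split()
# Suppose the Wikidata alignment identifies the first two tokens
# ("Barack Obama") as an entity mention.
marked = mark_entities(sentence, [(0, 2)])
# A higher mask ratio than T5's default 15% so this tiny example visibly masks a span.
source, target = span_corrupt(marked, mask_ratio=0.3, seed=1)
print(source)  # encoder input: marked text with spans replaced by sentinels
print(target)  # decoder target: the removed spans, each preceded by its sentinel
```

In the actual model, the entity and co-reference spans would come from the large-scale Wikipedia-Wikidata alignments described in the abstract; the single hard-coded span here only keeps the sketch self-contained.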