KAT5: Knowledge-Aware Transfer Learning with a Text-to-Text Transfer Transformer

Cited by: 0
Authors
Sohrab, Mohammad Golam [1]
Miwa, Makoto [1,2]
Affiliations
[1] Natl Inst Adv Ind Sci & Technol, Artificial Intelligence Res Ctr, Tokyo, Japan
[2] Toyota Technol Inst, Nagoya, Aichi, Japan
Keywords
Natural language processing; Transfer learning; Language model; Sequence-to-Sequence; Language understanding and generation; Information extraction; Machine translation
DOI
10.1007/978-3-031-70378-2_10
CLC number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We introduce knowledge-aware transfer learning with a text-to-text transfer transformer (KAT5), which leverages a text-to-text transfer transformer (T5) in the Wikipedia domain. In standard transfer learning such as T5, a model is first pre-trained on an unsupervised task with a language-model objective and then fine-tuned on a downstream task. T5 explores several learning objectives, including masked language modeling (MLM), random span corruption, and deshuffling, but the model has limited opportunity to integrate knowledge during pre-training. Here, we push the limits of this model by grafting in knowledge such as entity and co-reference information, obtained by mapping Wikipedia to Wikidata, during pre-training. We construct large-scale alignments between Wikipedia abstracts and Wikidata triples to facilitate pre-training of our KAT5 model. Our approach can match or outperform task-specific models while using the same architecture and hyper-parameters, in particular on entity and relation extraction (CoNLL04, ADE, and NYT datasets) and on language generation tasks, including abstractive summarization (XSum, CNNDM) and machine translation. Our code is publicly released on GitHub (https://github.com/aistairc/kat5) under the Apache 2.0 License.
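The abstract describes pairing Wikipedia abstracts with aligned Wikidata triples so that knowledge is visible to the model during span-corruption pre-training. The following is a minimal illustrative sketch of that idea, not the released KAT5 code: it masks aligned entity mentions with T5-style sentinel tokens and appends linearized triples to the encoder input. The helper name `build_kat5_example`, the `[KNOWLEDGE]` separator, and the toy data are all hypothetical.

```python
# Hypothetical sketch of knowledge-aware span corruption: entity
# mentions that appear as subjects of aligned Wikidata-style triples
# are replaced by T5 sentinel tokens, and the triples themselves are
# linearized and appended so the encoder can attend to them.

def build_kat5_example(abstract, triples):
    """Return a (source, target) text-to-text pair.

    abstract: a Wikipedia-style abstract string.
    triples:  list of (subject, relation, object) strings assumed to
              be aligned to entity mentions in the abstract.
    """
    source, target = abstract, ""
    for i, (subj, _rel, _obj) in enumerate(triples):
        if subj in source:
            sentinel = f"<extra_id_{i}>"          # T5-style sentinel
            source = source.replace(subj, sentinel, 1)
            target += f"{sentinel} {subj} "       # model must recover span
    # Linearize the knowledge triples and append them to the input.
    knowledge = " ".join(f"[{s} | {r} | {o}]" for s, r, o in triples)
    return source + " [KNOWLEDGE] " + knowledge, target.strip()

abstract = "Barack Obama was born in Honolulu."
triples = [("Barack Obama", "place of birth", "Honolulu")]
src, tgt = build_kat5_example(abstract, triples)
# src masks the entity mention and carries the triple in-context;
# tgt asks the decoder to reconstruct the masked entity span.
```

The actual alignment procedure, objectives, and data scale are those of the paper and its GitHub release; this fragment only illustrates the general shape of a knowledge-augmented pre-training instance.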
Pages: 157-173
Page count: 17