KAT5: Knowledge-Aware Transfer Learning with a Text-to-Text Transfer Transformer

Cited by: 0
Authors
Sohrab, Mohammad Golam [1 ]
Miwa, Makoto [1 ,2 ]
Affiliations
[1] Natl Inst Adv Ind Sci & Technol, Artificial Intelligence Res Ctr, Tokyo, Japan
[2] Toyota Technol Inst, Nagoya, Aichi, Japan
Keywords
Natural language processing; Transfer learning; Language model; Sequence-to-Sequence; Language understanding and generation; Information extraction; Machine translation
DOI
10.1007/978-3-031-70378-2_10
Chinese Library Classification
TP18 [Theory of artificial intelligence]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
We introduce knowledge-aware transfer learning with a text-to-text transfer transformer (KAT5), which leverages the text-to-text transfer transformer (T5) in the Wikipedia domain. In standard transfer learning such as T5, a model is first pre-trained on unsupervised data with a language-model objective and then fine-tuned on a downstream task. T5 explores several learning objectives, including masked language modeling (MLM), random span corruption, and deshuffling, but these objectives give the model limited opportunity to integrate external knowledge during pre-training. Here, we push the limits of this model by grafting in knowledge such as entity and co-reference information, obtained by mapping Wikipedia to Wikidata during pre-training. We construct large-scale alignments between Wikipedia abstracts and Wikidata triples to facilitate pre-training of our KAT5 model. Our approach can match or outperform task-specific models while using the same architecture and hyper-parameters, in particular on entity and relation extraction (the CoNLL04, ADE, and NYT datasets) and on language generation tasks, including abstractive summarization (XSum, CNNDM) and machine translation. Our code is publicly released on GitHub (https://github.com/aistairc/kat5) under the Apache 2.0 License.
Pages: 157-173
Number of pages: 17
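
The abstract above combines two mechanisms: grafting entity and co-reference knowledge into the pre-training text via a Wikipedia-Wikidata mapping, and T5-style unsupervised objectives such as span corruption. As a rough illustration of how such a pre-training example could be built, here is a minimal sketch; the [E]/[/E] marker scheme, the function names mark_entities and span_corrupt, and the hyper-parameters are illustrative assumptions, not taken from the KAT5 paper or its repository.

```python
# Hypothetical sketch: mark Wikidata-aligned entity spans in a Wikipedia
# sentence, then apply T5-style span corruption with sentinel tokens.
# All names and the [E]/[/E] scheme are assumptions for illustration.
import random


def mark_entities(tokens, entity_spans):
    """Wrap aligned entity mentions in [E] ... [/E] markers (assumed scheme)."""
    out, i = [], 0
    for start, end in sorted(entity_spans):  # spans are (start, end), end exclusive
        out.extend(tokens[i:start])
        out.append("[E]")
        out.extend(tokens[start:end])
        out.append("[/E]")
        i = end
    out.extend(tokens[i:])
    return out


def span_corrupt(tokens, mask_ratio=0.15, mean_span=3, seed=0):
    """T5-style span corruption: replace spans with <extra_id_N> sentinels."""
    rng = random.Random(seed)
    inputs, targets, i, sentinel = [], [], 0, 0
    while i < len(tokens):
        # Each position starts a masked span with probability mask_ratio / mean_span,
        # so roughly mask_ratio of all tokens end up masked.
        if rng.random() < mask_ratio / mean_span:
            span = min(mean_span, len(tokens) - i)
            inputs.append(f"<extra_id_{sentinel}>")
            targets.append(f"<extra_id_{sentinel}>")
            targets.extend(tokens[i:i + span])
            sentinel += 1
            i += span
        else:
            inputs.append(tokens[i])
            i += 1
    targets.append(f"<extra_id_{sentinel}>")  # closing sentinel, as in T5
    return " ".join(inputs), " ".join(targets)


sentence = "Barack Obama was born in Honolulu , Hawaii .".split()
# Suppose the Wikidata alignment identifies the first two tokens
# ("Barack Obama") as an entity mention.
marked = mark_entities(sentence, [(0, 2)])
# A higher mask ratio than T5's default 15% so this tiny example visibly masks a span.
source, target = span_corrupt(marked, mask_ratio=0.3, seed=1)
print(source)  # encoder input: marked text with spans replaced by sentinels
print(target)  # decoder target: the removed spans, each preceded by its sentinel
```

In the actual model, the entity and co-reference spans would come from the large-scale Wikipedia-Wikidata alignments described in the abstract; the single hard-coded span here only keeps the sketch self-contained.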