KAT5: Knowledge-Aware Transfer Learning with a Text-to-Text Transfer Transformer

Cited by: 0
Authors
Sohrab, Mohammad Golam [1]
Miwa, Makoto [1,2]
Affiliations
[1] Natl Inst Adv Ind Sci & Technol, Artificial Intelligence Res Ctr, Tokyo, Japan
[2] Toyota Technol Inst, Nagoya, Aichi, Japan
Keywords
Natural language processing; Transfer learning; Language model; Sequence-to-sequence; Language understanding and generation; Information extraction; Machine translation
DOI
10.1007/978-3-031-70378-2_10
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
We introduce knowledge-aware transfer learning with a text-to-text transfer transformer (KAT5), which leverages a text-to-text transfer transformer (T5) in the Wikipedia domain. In standard transfer learning such as T5, a model is first pre-trained on an unsupervised task with a language-model objective and then fine-tuned on a downstream task. T5 explores several learning objectives, including masked language modeling (MLM), random span corruption, and deshuffling, but these objectives give the model limited means to integrate external knowledge during pre-training. Here, we push the limits of this model by grafting knowledge such as entity and co-reference information onto the input, obtained by mapping Wikipedia to Wikidata during pre-training. We construct large-scale alignments between Wikipedia abstracts and Wikidata triples to pre-train our KAT5 model. Our approach can match or outperform task-specific models while using the same architecture and hyper-parameters, in particular on entity and relation extraction (the CoNLL04, ADE, and NYT datasets) and on language generation tasks, including abstractive summarization (XSum, CNNDM) and machine translation. Our code is publicly released on GitHub (https://github.com/aistairc/kat5) under the Apache 2.0 License.
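As a rough illustration of the pre-training recipe the abstract describes, the sketch below builds a single T5-style span-corruption example and appends linearized Wikidata triples to the input. This is a minimal, dependency-free Python sketch under stated assumptions: the `<triple>` markup, the `linearize` and `span_corrupt` helpers, and the Marie Curie example text are hypothetical illustrations, not taken from the released KAT5 code; the actual model also injects co-reference information and operates on subword tokens rather than whitespace-split words.

```python
import random

# Illustrative only: the exact alignment format used by KAT5 is not given in
# the abstract, so the triple markup, helper names, and example text below
# are assumptions, not the authors' released pipeline.
abstract = ("Marie Curie was a physicist and chemist "
            "who conducted pioneering research on radioactivity.")
triples = [
    ("Marie Curie", "occupation", "physicist"),
    ("Marie Curie", "field of work", "radioactivity"),
]

def linearize(triples):
    """Flatten (subject, predicate, object) triples into plain text, one
    common way to graft structured knowledge into a seq2seq input."""
    return " ".join(f"<triple> {s} | {p} | {o}" for s, p, o in triples)

def span_corrupt(text, noise_density=0.15, mean_span_len=3, seed=0):
    """T5-style random span corruption: drop spans of tokens from the input,
    replacing each with a sentinel; the target lists the dropped spans."""
    rng = random.Random(seed)
    tokens = text.split()
    n_noise = max(1, round(len(tokens) * noise_density))
    n_spans = max(1, round(n_noise / mean_span_len))
    starts = sorted(rng.sample(range(len(tokens) - mean_span_len), n_spans))
    source, target, cursor, sid = [], [], 0, 0
    for start in starts:
        if start < cursor:          # skip overlapping spans
            continue
        end = min(start + mean_span_len, len(tokens))
        source += tokens[cursor:start] + [f"<extra_id_{sid}>"]
        target += [f"<extra_id_{sid}>"] + tokens[start:end]
        cursor, sid = end, sid + 1
    source += tokens[cursor:]
    target.append(f"<extra_id_{sid}>")  # final sentinel closes the target
    return " ".join(source), " ".join(target)

# Build one knowledge-aware pre-training pair: corrupt the abstract text and
# append the linearized triples, so masked entities remain recoverable from
# the attached knowledge.
source, target = span_corrupt(abstract)
source = f"{source} {linearize(triples)}"
print("input :", source)
print("target:", target)
```

The design intuition, as the abstract frames it, is that keeping aligned triples visible while entity mentions are masked encourages the model to consult structured knowledge rather than rely solely on textual co-occurrence.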
Pages: 157-173
Page count: 17
Related papers (50 in total)
  • [1] Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
    Raffel, Colin; Shazeer, Noam; Roberts, Adam; Lee, Katherine; Narang, Sharan; Matena, Michael; Zhou, Yanqi; Li, Wei; Liu, Peter J.
    Journal of Machine Learning Research, 2020, 21
  • [2] Homograph Disambiguation with Text-to-Text Transfer Transformer
    Rezackova, Marketa; Tihelka, Daniel; Matousek, Jindrich
    Interspeech 2024, 2024: 2785-2789
  • [3] Leveraging sensory knowledge into Text-to-Text Transfer Transformer for enhanced emotion analysis
    Zhao, Qingqing; Xia, Yuhan; Long, Yunfei; Xu, Ge; Wang, Jia
    Information Processing & Management, 2025, 62 (01)
  • [4] HaT5: Hate Language Identification using Text-to-Text Transfer Transformer
    Sabry, Sana Sabah; Adewumi, Tosin; Abid, Nosheen; Kovacs, Gyorgy; Liwicki, Foteini; Liwicki, Marcus
    2022 International Joint Conference on Neural Networks (IJCNN), 2022
  • [5] Evaluation of Transfer Learning for Polish with a Text-to-Text Model
    Chrabrowa, Aleksandra; Dragan, Lukasz; Grzegorczyk, Karol; Kajtoch, Dariusz; Koszowski, Mikolaj; Mroczkowski, Robert; Rybak, Piotr
    LREC 2022: Thirteenth International Conference on Language Resources and Evaluation, 2022: 4374-4394
  • [6] Text-to-Text Transfer Transformer Phrasing Model Using Enriched Text Input
    Rezackova, Marketa; Matousek, Jindrich
    Text, Speech, and Dialogue (TSD 2022), 2022, 13502: 389-400
  • [7] Keyword Extraction from Short Texts with a Text-to-Text Transfer Transformer
    Pezik, Piotr; Mikolajczyk, Agnieszka; Wawrzynski, Adam; Niton, Bartlomiej; Ogrodniczuk, Maciej
    Recent Challenges in Intelligent Information and Database Systems, ACIIDS 2022, 2022, 1716: 530-542
  • [8] Investigating Numeracy Learning Ability of a Text-to-Text Transfer Model
    Pal, Kuntal Kumar; Baral, Chitta
    Findings of the Association for Computational Linguistics: EMNLP 2021, 2021: 3095-3101
  • [9] Enhance Text-to-Text Transfer Transformer with Generated Questions for Thai Question Answering
    Phakmongkol, Puri; Vateekul, Peerapon
    Applied Sciences-Basel, 2021, 11 (21)