KAT5: Knowledge-Aware Transfer Learning with a Text-to-Text Transfer Transformer

Cited by: 0
Authors
Sohrab, Mohammad Golam [1]
Miwa, Makoto [1,2]
Affiliations
[1] Natl Inst Adv Ind Sci & Technol, Artificial Intelligence Res Ctr, Tokyo, Japan
[2] Toyota Technol Inst, Nagoya, Aichi, Japan
Keywords
Natural language processing; Transfer learning; Language model; Sequence-to-Sequence; Language understanding and generation; Information extraction; Machine translation
DOI
10.1007/978-3-031-70378-2_10
CLC number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We introduce knowledge-aware transfer learning with a text-to-text transfer transformer (KAT5), which leverages a text-to-text transfer transformer (T5) in the Wikipedia domain. In standard transfer learning such as T5, a model is first pre-trained on an unsupervised task with a language-model objective and then fine-tuned on a downstream task. T5 explores several learning objectives, including masked language modeling (MLM), random span corruption, and deshuffling, but the model has limited opportunity to integrate knowledge during pre-training. Here, we push the limits of this model by grafting in knowledge such as entity and co-reference information, obtained by mapping Wikipedia to Wikidata, during pre-training. We construct large-scale alignments between Wikipedia abstracts and Wikidata triples to facilitate pre-training of our KAT5 model. Our approach can match or outperform task-specific models while using the same architecture and hyper-parameters, in particular on entity and relation extraction (CoNLL04, ADE, and NYT datasets) and on language generation tasks, including abstractive summarization (XSum, CNNDM) and machine translation. Our code is publicly released on GitHub (https://github.com/aistairc/kat5) under the Apache 2.0 License.
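The abstract describes pairing Wikipedia abstracts with aligned Wikidata triples so that knowledge is visible to the model during span-corruption pre-training. The following is a minimal illustrative sketch of that idea, not the released KAT5 code: it masks aligned entity mentions with T5-style sentinel tokens and appends linearized triples to the encoder input. The helper name `build_kat5_example`, the `[KNOWLEDGE]` separator, and the toy data are all hypothetical.

```python
# Hypothetical sketch of knowledge-aware span corruption: entity
# mentions that appear as subjects of aligned Wikidata-style triples
# are replaced by T5 sentinel tokens, and the triples themselves are
# linearized and appended so the encoder can attend to them.

def build_kat5_example(abstract, triples):
    """Return a (source, target) text-to-text pair.

    abstract: a Wikipedia-style abstract string.
    triples:  list of (subject, relation, object) strings assumed to
              be aligned to entity mentions in the abstract.
    """
    source, target = abstract, ""
    for i, (subj, _rel, _obj) in enumerate(triples):
        if subj in source:
            sentinel = f"<extra_id_{i}>"          # T5-style sentinel
            source = source.replace(subj, sentinel, 1)
            target += f"{sentinel} {subj} "       # model must recover span
    # Linearize the knowledge triples and append them to the input.
    knowledge = " ".join(f"[{s} | {r} | {o}]" for s, r, o in triples)
    return source + " [KNOWLEDGE] " + knowledge, target.strip()

abstract = "Barack Obama was born in Honolulu."
triples = [("Barack Obama", "place of birth", "Honolulu")]
src, tgt = build_kat5_example(abstract, triples)
# src masks the entity mention and carries the triple in-context;
# tgt asks the decoder to reconstruct the masked entity span.
```

The actual alignment procedure, objectives, and data scale are those of the paper and its GitHub release; this fragment only illustrates the general shape of a knowledge-augmented pre-training instance.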
Pages: 157-173
Page count: 17