KAT5: Knowledge-Aware Transfer Learning with a Text-to-Text Transfer Transformer

Cited by: 0
Authors
Sohrab, Mohammad Golam [1 ]
Miwa, Makoto [1 ,2 ]
Affiliations
[1] Natl Inst Adv Ind Sci & Technol, Artificial Intelligence Res Ctr, Tokyo, Japan
[2] Toyota Technol Inst, Nagoya, Aichi, Japan
Keywords
Natural language processing; Transfer learning; Language model; Sequence-to-sequence; Language understanding and generation; Information extraction; Machine translation
DOI
10.1007/978-3-031-70378-2_10
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We introduce knowledge-aware transfer learning with a text-to-text transfer transformer (KAT5), which leverages the text-to-text transfer transformer (T5) in the Wikipedia domain. In standard transfer learning such as T5, a model is first pre-trained on an unsupervised task with a language-model objective and then fine-tuned on a downstream task. T5 explores several pre-training objectives, including masked language modeling (MLM), random span corruption, and deshuffling, but these give the model little opportunity to integrate external knowledge during pre-training. Here, we push the limits of this model by grafting in knowledge, such as entity and co-reference information, obtained by mapping Wikipedia to Wikidata during pre-training. We construct large-scale alignments between Wikipedia abstracts and Wikidata triples to support pre-training of our KAT5 model. Our approach can match or outperform task-specific models while using the same architecture and hyper-parameters, in particular on entity and relation extraction (the CoNLL04, ADE, and NYT datasets) and on language generation tasks, including abstractive summarization (XSum, CNN/DailyMail) and machine translation. Our code is publicly released on GitHub (https://github.com/aistairc/kat5) under the Apache 2.0 License.
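The released repository contains the authors' implementation; the sketch below is only a minimal illustration, in Python, of the two ideas the abstract combines: T5-style span corruption over a token stream, and linearized Wikidata triples appended to a Wikipedia abstract. All names here (span_corrupt, build_example, the <s>/<p>/<o> markers) are assumptions for illustration, not the KAT5 API.

# Illustrative sketch, NOT the authors' released code: pair a Wikipedia
# abstract with Wikidata triples, then apply T5-style span corruption.
import random
from typing import List, Sequence, Tuple

def span_corrupt(tokens: List[str], corrupt_rate: float = 0.15,
                 span_len: int = 3, seed: int = 0) -> Tuple[str, str]:
    """Mask random non-overlapping spans with T5 sentinels (<extra_id_N>).

    Returns (source, target) in the standard T5 span-corruption format.
    """
    rng = random.Random(seed)
    n_spans = max(1, round(len(tokens) * corrupt_rate / span_len))
    starts: List[int] = []
    tries = 0
    while len(starts) < n_spans and tries < 100:
        s = rng.randrange(0, max(1, len(tokens) - span_len))
        if all(abs(s - t) >= span_len for t in starts):
            starts.append(s)  # keep only spans that do not overlap
        tries += 1
    starts.sort()
    source, target, i = [], [], 0
    for sid, s in enumerate(starts):
        source.extend(tokens[i:s])
        source.append(f"<extra_id_{sid}>")   # sentinel replaces the span
        target.append(f"<extra_id_{sid}>")   # target lists the masked span
        target.extend(tokens[s:s + span_len])
        i = s + span_len
    source.extend(tokens[i:])
    target.append(f"<extra_id_{len(starts)}>")  # closing sentinel
    return " ".join(source), " ".join(target)

def build_example(abstract: str,
                  triples: Sequence[Tuple[str, str, str]]) -> Tuple[str, str]:
    """Linearize Wikidata triples and append them to the abstract, so that
    span corruption can also mask entity/relation knowledge."""
    kb = " ".join(f"<s> {s} <p> {p} <o> {o}" for s, p, o in triples)
    return span_corrupt((abstract + " " + kb).split())

src, tgt = build_example(
    "Nagoya is the largest city in the Chubu region of Japan .",
    [("Nagoya", "country", "Japan"),
     ("Nagoya", "located in", "Aichi Prefecture")],
)
print(src)  # abstract + triples with spans replaced by sentinels
print(tgt)  # sentinel-delimited masked spans

Because the triples share one token stream with the abstract, a masked span can fall inside the knowledge portion, so reconstruction forces the model to recover entity and relation facts rather than surface text alone.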
Pages: 157-173 (17 pages)