Neural Pipeline for Zero-Shot Data-to-Text Generation

Citations: 0
Authors
Kasner, Zdenek [1]
Dusek, Ondrej [1]
Affiliations
[1] Charles Univ Prague, Fac Math & Phys, Inst Formal & Appl Linguist, Prague, Czech Republic
Funding
European Research Council;
Keywords
NATURAL-LANGUAGE GENERATION; OF-THE-ART;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In data-to-text (D2T) generation, training on in-domain data leads to overfitting to the data representation and to reproducing noise from the training data. We examine how to avoid finetuning pretrained language models (PLMs) on D2T generation datasets while still taking advantage of the surface realization capabilities of PLMs. Inspired by pipeline approaches, we propose to generate text by transforming single-item descriptions with a sequence of modules trained on general-domain text-based operations: ordering, aggregation, and paragraph compression. We train PLMs to perform these operations on a synthetic corpus, WIKIFLUENT, which we build from English Wikipedia. Our experiments on two major triple-to-text datasets, WebNLG and E2E, show that our approach enables D2T generation from RDF triples in zero-shot settings.
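The control flow described in the abstract (single-item descriptions passed through ordering, aggregation, and paragraph compression) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the templates, the function names, and the trivial rule-based bodies are hypothetical stand-ins for the trained PLM modules and are not the authors' implementation.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical templates that turn one triple into a single-item description.
TEMPLATES: Dict[str, str] = {
    "birthPlace": "{s} was born in {o}.",
    "occupation": "{s} works as {o}.",
}


def verbalize(triples: List[Triple]) -> List[str]:
    """Produce one simple sentence per triple from the templates above."""
    return [TEMPLATES[p].format(s=s, o=o) for s, p, o in triples]


def order(facts: List[str]) -> List[str]:
    """Placeholder for the ordering module: plain alphabetical sort."""
    return sorted(facts)


def aggregate(facts: List[str]) -> List[List[str]]:
    """Placeholder for the aggregation module: put every fact in one group."""
    return [facts]


def compress(groups: List[List[str]]) -> str:
    """Placeholder for paragraph compression: join sentences unchanged."""
    return " ".join(" ".join(group) for group in groups)


def run_pipeline(triples: List[Triple]) -> str:
    """Chain the stages in the order named in the abstract."""
    return compress(aggregate(order(verbalize(triples))))


if __name__ == "__main__":
    example = [
        ("Ada Lovelace", "occupation", "a mathematician"),
        ("Ada Lovelace", "birthPlace", "London"),
    ]
    print(run_pipeline(example))
```

In a real system each placeholder would be replaced by a model trained on general-domain text; the sketch only shows how the stages compose, so none of them needs in-domain D2T training data.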
Pages: 3914 / 3932
Number of pages: 19
Related papers
50 in total
  • [21] Reinforced Zero-Shot Cross-Lingual Neural Headline Generation
    Ayana
    Chen, Yun
    Yang, Cheng
    Liu, Zhiyuan
    Sun, Maosong
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 (28) : 2572 - 2584
  • [22] Data-to-text Generation with Entity Modeling
    Puduppully, Ratish
    Dong, Li
    Lapata, Mirella
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 2023 - 2035
  • [23] Data-to-Text Generation with Style Imitation
    Lin, Shuai
    Wang, Wentao
    Yang, Zichao
    Liang, Xiaodan
    Xu, Frank F.
    Xing, Eric P.
    Hu, Zhiting
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 1589 - 1598
  • [24] Compositional Generalization for Data-to-Text Generation
    Xu, Xinnuo
    Titov, Ivan
    Lapata, Mirella
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 9299 - 9317
  • [25] Zero-Shot-BERT-Adapters: a Zero-Shot Pipeline for Unknown Intent Detection
    Comi, Daniele
    Christofidellis, Dimitrios
    Piazza, Pier Francesco
    Manica, Matteo
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023, : 650 - 663
  • [26] Data-to-text Generation with Macro Planning
    Puduppully, Ratish
    Lapata, Mirella
    TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2021, 9 : 510 - 527
  • [27] Enhancing Neural Data-To-Text Generation Models with External Background Knowledge
    Chen, Shuang
    Wang, Jinpeng
    Feng, Xiaocheng
    Jiang, Feng
    Qin, Bing
    Lin, Chin-Yew
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 3022 - 3032
  • [28] ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic
    Tewel, Yoad
    Shalev, Yoav
    Schwartz, Idan
    Wolf, Lior
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 17897 - 17907
  • [29] Label Augmentation for Zero-Shot Hierarchical Text Classification
    Paletto, Lorenzo
    Basile, Valerio
    Esposito, Roberto
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 7697 - 7706
  • [30] Unified benchmark for zero-shot Turkish text classification
    Çelik, Emrecan
    Dalyan, Tugba
    INFORMATION PROCESSING & MANAGEMENT, 2023, 60 (03)