Latent Compositional Representations Improve Systematic Generalization in Grounded Question Answering

Cited by: 11
Authors
Bogin, Ben [1]
Subramanian, Sanjay [2]
Gardner, Matt [2]
Berant, Jonathan [1, 2]
Affiliations
[1] Tel Aviv Univ, Tel Aviv, Israel
[2] Allen Inst AI, Seattle, WA USA
Funding
European Research Council
Keywords
46;
DOI
10.1162/tacl_a_00361
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Answering questions that involve multi-step reasoning requires decomposing them and using the answers of intermediate steps to reach the final answer. However, state-of-the-art models in grounded question answering often do not explicitly perform decomposition, leading to difficulties in generalization to out-of-distribution examples. In this work, we propose a model that computes a representation and denotation for all question spans in a bottom-up, compositional manner using a CKY-style parser. Our model induces latent trees, driven by end-to-end (the answer) supervision only. We show that this inductive bias towards tree structures dramatically improves systematic generalization to out-of-distribution examples, compared to strong baselines on an arithmetic expressions benchmark as well as on CLOSURE, a dataset that focuses on systematic generalization for grounded question answering. On this challenging dataset, our model reaches an accuracy of 96.1%, significantly higher than prior models that almost perfectly solve the task on a random, in-distribution split.
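The bottom-up, CKY-style computation over all question spans mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' model: the function names (`cky_soft_chart`, `compose`, `score`), the use of plain floats as span values, and the softmax mixture over split points are assumptions made purely for illustration. The sketch only shows the chart-filling pattern, in which every span receives a value built from the values of its sub-spans.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cky_soft_chart(leaves, compose, score):
    """Fill a chart with one value per span [i, j), bottom-up.

    leaves  : list of floats, one value per token (hypothetical toy input)
    compose : function (left, right) -> value of the combined span
    score   : function (left, right) -> unnormalized score of that split
    """
    n = len(leaves)
    chart = {}
    # Spans of length 1 are the tokens themselves.
    for i, leaf in enumerate(leaves):
        chart[(i, i + 1)] = leaf
    # Longer spans are built from shorter ones, as in CKY.
    for length in range(2, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            splits = range(i + 1, j)
            candidates = [compose(chart[(i, k)], chart[(k, j)]) for k in splits]
            weights = softmax([score(chart[(i, k)], chart[(k, j)]) for k in splits])
            # Soft mixture over split points instead of one hard tree.
            chart[(i, j)] = sum(w * c for w, c in zip(weights, candidates))
    return chart


# Toy usage: tokens are numbers, composition is addition, all splits score equally.
chart = cky_soft_chart(
    [1.0, 2.0, 3.0, 4.0],
    compose=lambda left, right: left + right,
    score=lambda left, right: 0.0,
)
print(len(chart))      # number of spans filled in the chart
print(chart[(0, 4)])   # value assigned to the full sequence
```

With addition as the composition function every split yields the same value, so the toy chart simply stores partial sums. In a learned setting the composition and scoring functions would be trained, and the split weights would act as a latent, soft tree structure.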
Pages: 195-210
Number of pages: 16
Related papers
50 in total
  • [1] Grounded Graph Decoding Improves Compositional Generalization in Question Answering
    Gai, Yu
    Jain, Paras
    Zhang, Wendi
    Gonzalez, Joseph
    Song, Dawn
    Stoica, Ion
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 1829 - 1838
  • [2] Multimodal Graph Networks for Compositional Generalization in Visual Question Answering
    Saqur, Raeid
    Narasimhan, Karthik
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [3] Compositional Networks Enable Systematic Generalization for Grounded Language Understanding
    Kuo, Yen-Ling
    Katz, Boris
    Barbu, Andrei
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 216 - 226
  • [4] Transformer Module Networks for Systematic Generalization in Visual Question Answering
    Yamada, Moyuru
    D'Amario, Vanessa
    Takemoto, Kentaro
    Boix, Xavier
    Sasaki, Tomotake
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (12) : 10096 - 10105
  • [5] Compositional Generalization with Grounded Language Models
    Wold, Sondre
    Simon, Etienne
    Georges, Lucas
    Charpentier, Gabriel
    Kostylev, Egor V.
    Velldal, Erik
    Ovrelid, Lilja
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 3447 - 3460
  • [6] BERT Representations for Video Question Answering
    Yang, Zekun
    Garcia, Noa
    Chu, Chenhui
    Otani, Mayu
    Nakashima, Yuta
    Takemura, Haruo
    2020 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2020, : 1545 - 1554
  • [7] TVQA: Localized, Compositional Video Question Answering
    Lei, Jie
    Yu, Licheng
    Bansal, Mohit
    Berg, Tamara L.
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 1369 - 1379
  • [8] Compositional question answering: A divide and conquer approach
    Oh, Hyo-Jung
    Sung, Ki-Youn
    Tang, Myung-Gil
    Myaeng, Sung Hyon
    INFORMATION PROCESSING & MANAGEMENT, 2011, 47 (06) : 808 - 824
  • [9] Neural Compositional Denotational Semantics for Question Answering
    Gupta, Nitish
    Lewis, Mike
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 2152 - 2161
  • [10] Measuring Compositional Consistency for Video Question Answering
    Gandhi, Mona
    Gul, Mustafa Omer
    Prakash, Eva
    Grunde-McLaughlin, Madeleine
    Krishna, Ranjay
    Agrawala, Maneesh
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 5036 - 5045