Phrase-aware Unsupervised Constituency Parsing

Cited by: 0
Authors
Gu, Xiaotao [1]
Shen, Yikang [3, 4]
Shen, Jiaming [5]
Shang, Jingbo [2]
Han, Jiawei [1]
Affiliations
[1] Univ Illinois, Champaign, IL 61820 USA
[2] Univ Calif San Diego, San Diego, CA 92103 USA
[3] Univ Montreal, Mila, Montreal, PQ, Canada
[4] Tencent Inc, WeChat AI, Pattern Recognit Ctr, Beijing, Peoples R China
[5] Google Res, Mountain View, CA USA
Source
PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS) | 2022
Funding
U.S. National Science Foundation;
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Recent studies have achieved inspiring success in unsupervised grammar induction using masked language modeling (MLM) as the proxy task. Despite their high accuracy in identifying low-level structures, prior arts tend to struggle in capturing high-level structures like clauses, since the MLM task usually only requires information from local context. In this work, we revisit LM-based constituency parsing from a phrase-centered perspective. Inspired by the natural reading process of human readers, we propose to regularize the parser with phrases extracted by an unsupervised phrase tagger to help the LM model quickly manage low-level structures. For a better understanding of high-level structures, we propose a phrase-guided masking strategy for LM to emphasize more on reconstructing non-phrase words. We show that the initial phrase regularization serves as an effective bootstrap, and phrase-guided masking improves the identification of high-level structures. Experiments on the public benchmark with two different backbone models demonstrate the effectiveness and generality of our method.
Pages: 6406 - 6415
Page count: 10
Related papers
50 in total
  • [1] Unsupervised Parsing via Constituency Tests
    Cao, Steven
    Kitaev, Nikita
    Klein, Dan
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 4798 - 4808
  • [2] Rule Augmented Unsupervised Constituency Parsing
    Sahay, Atul
    Nasery, Anshul
    Maheshwari, Ayush
    Ramakrishnan, Ganesh
    Iyer, Rishabh
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 4923 - 4932
  • [3] Word Segmentation as Unsupervised Constituency Parsing
    Alhama, Raquel G.
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 4103 - 4112
  • [4] On the Role of Supervision in Unsupervised Constituency Parsing
    Shi, Haoyue
    Livescu, Karen
    Gimpel, Kevin
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 7611 - 7621
  • [5] An Empirical Comparison of Unsupervised Constituency Parsing Methods
    Li, Jun
    Cao, Yifan
    Cai, Jiong
    Jiang, Yong
    Tu, Kewei
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 3278 - 3283
  • [6] Phrase-Aware Financial Sentiment Analysis Based on Constituent Syntax
    Xiang, Chunli
    Zhang, Junchi
    Zhou, Jun
    Li, Fei
    Teng, Chong
    Ji, Donghong
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 1994 - 2005
  • [7] Unsupervised Discourse Constituency Parsing Using Viterbi EM
    Nishida, Noriki
    Nakayama, Hideki
    TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2020, 8 (08) : 215 - 230
  • [8] Unsupervised Discontinuous Constituency Parsing with Mildly Context-Sensitive Grammars
    Yang, Songlin
    Levy, Roger P.
    Kim, Yoon
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 5747 - 5766
  • [9] Heads-up! Unsupervised Constituency Parsing via Self-Attention Heads
    Li, Bowen
    Kim, Taeuk
    Amplayo, Reinald Kim
    Keller, Frank
    1ST CONFERENCE OF THE ASIA-PACIFIC CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 10TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (AACL-IJCNLP 2020), 2020, : 409 - 424
  • [10] Tree-Averaging Algorithms for Ensemble-Based Unsupervised Discontinuous Constituency Parsing
    Shayegh, Behzad
    Wen, Yuqiao
    Mou, Lili
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 15135 - 15156