Efficient parsing for Information Extraction

Cited by: 0
Authors
Basili, R [1]
Pazienza, MT [1]
Zanzotto, FM [1]
Affiliations
[1] Univ Roma Tor Vergata, Dipartimento Informat Sist & Prod, I-00173 Rome, Italy
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Several successful Information Extraction (IE) systems have recently replaced their core parsing components with shallow but more efficient recognizers. In this paper we argue that, given the complex nature of several non-English languages, the absence of an underlying grammatical recognizer is a strong limitation for the text processing functionalities that an IE system needs. We propose a robust and efficient syntactic recognizer aimed mainly at capturing the grammatical information that is crucial for several linguistic and non-linguistic inferences. The proposed system is based on a novel architecture that exploits two major principles: lexicalization and stratification of the parsing process. As several linguistic theories (e.g. HPSG) and parsing frameworks (e.g. LTAG, SLTAG, lexicalized probabilistic parsing) suggest, lexicon-driven systems ensure suitable forms of grammatical control for many complex phenomena. In our system, an analysis guided by information on typical verb projections (e.g. verb subcategorization structures) is coupled with extended locality constraints (i.e. recognition of clause boundaries). Stratification is realized as a cascade of processing steps that starts from chunk recognition and proceeds through clause analysis to dependency detection. Recognizing chunks first minimizes the ambiguity of the input passed to the remaining phases. The resulting system is thus robust against ungrammatical phenomena (e.g. complex clause embedding, misspellings, unknown words). Efficiency is also retained, even though ambiguous phenomena (e.g. multiple PP attachments) are recognized.
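The cascade named in the abstract (chunk recognition, then clause analysis, then dependency detection) can be illustrated with a minimal Python sketch. This is not the authors' system: the toy tagset (DET, ADJ, NOUN, PREP, VERB, SUB), the SUBCAT lexicon, the relation labels, and every function name below are assumptions made purely for illustration. The sketch only shows how a verb subcategorization lexicon and clause boundaries might constrain attachment, and how an ambiguous PP attachment could be recorded rather than resolved.

```python
from typing import NamedTuple

class Chunk(NamedTuple):
    kind: str     # "NP", "PP", "VP", or a pass-through tag such as "SUB"
    words: tuple
    head: str     # last word of the chunk, used as its lexical head

NOMINAL = {"DET", "ADJ", "NOUN"}

# Hypothetical lexicon: verb -> chunk kinds it subcategorizes for.
SUBCAT = {"bought": ("NP",), "grew": ()}

def chunk(tagged):
    """Stratum 1: group (word, tag) pairs into flat, non-recursive chunks."""
    chunks, i = [], 0
    while i < len(tagged):
        tag = tagged[i][1]
        j = i + 1
        if tag in NOMINAL or tag == "PREP":
            # Absorb the nominal tokens that follow into the same chunk.
            while j < len(tagged) and tagged[j][1] in NOMINAL:
                j += 1
            kind = "PP" if tag == "PREP" else "NP"
        else:
            kind = "VP" if tag == "VERB" else tag
        words = tuple(w for w, _ in tagged[i:j])
        chunks.append(Chunk(kind, words, words[-1]))
        i = j
    return chunks

def clauses(chunks):
    """Stratum 2: split the chunk sequence at subordinators (tag SUB)."""
    result, current = [], []
    for c in chunks:
        if c.kind == "SUB":
            if current:
                result.append(current)
            current = []
        else:
            current.append(c)
    if current:
        result.append(current)
    return result

def dependencies(clause, subcat):
    """Stratum 3: verb-driven dependency detection inside one clause."""
    deps = []
    for k, c in enumerate(clause):
        if c.kind != "VP":
            continue
        left_nps = [x for x in clause[:k] if x.kind == "NP"]
        if left_nps:
            deps.append((c.head, "subj", left_nps[-1].head))
        frame = list(subcat.get(c.head, ()))
        last_np = None
        for x in clause[k + 1:]:
            if frame and x.kind == frame[0]:
                deps.append((c.head, "arg", x.head))
                frame.pop(0)
                last_np = x if x.kind == "NP" else last_np
            elif x.kind == "PP":
                # Ambiguous attachment: keep every candidate governor.
                deps.append((c.head, "pp?", x.head))
                if last_np is not None:
                    deps.append((last_np.head, "pp?", x.head))
    return deps

def parse(tagged, subcat=SUBCAT):
    """Run the three strata in sequence; one dependency list per clause."""
    return [dependencies(cl, subcat) for cl in clauses(chunk(tagged))]

EXAMPLE = [("the", "DET"), ("company", "NOUN"), ("bought", "VERB"),
           ("a", "DET"), ("factory", "NOUN"), ("in", "PREP"),
           ("Rome", "NOUN"), ("because", "SUB"), ("the", "DET"),
           ("market", "NOUN"), ("grew", "VERB")]

result = parse(EXAMPLE)
```

In this sketch, the clause split keeps the second verb from competing for the first clause's arguments, and the PP "in Rome" is reported as attachable to both "bought" and "factory", mirroring the abstract's remark that ambiguous attachments are recognized rather than discarded.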
Pages: 135-139
Page count: 5