Efficient parsing for Information Extraction

Citations: 0
Authors
Basili, R [1]
Pazienza, MT [1]
Zanzotto, FM [1]
Affiliations
[1] Univ Roma Tor Vergata, Dipartimento Informat Sist & Prod, I-00173 Rome, Italy
Source
ECAI 1998: 13TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, PROCEEDINGS | 1998
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Several successful Information Extraction (IE) systems have recently replaced their core parsing components with shallow but more efficient recognizers. In this paper we argue that the absence of an underlying grammatical recognizer, given the complex nature of several (non-English) languages, is a strong limitation for text processing functionalities such as those an IE system needs. We propose a robust and efficient syntactic recognizer aimed mainly at capturing the grammatical information crucial for several linguistic and non-linguistic inferences. The proposed system is based on a novel architecture exploiting two major principles: lexicalization and stratification of the parsing process. As several linguistic theories (e.g. HPSG) and parsing frameworks (e.g. LTAG, SLTAG, lexicalized probabilistic parsing) suggest, lexicon-driven systems provide suitable forms of grammatical control for many complex phenomena. In our system, an analysis guided by information on typical verb projections (e.g. verb subcategorization structures) is coupled with extended locality constraints (i.e. recognition of clause boundaries). Furthermore, stratification is also employed: a cascade of processing steps starts from chunk recognition and proceeds through clause analysis to dependency detection. Recognizing chunks first minimizes the ambiguity passed on to the remaining phases. The resulting system is thus robust against ungrammatical phenomena (e.g. complex clause embedding, misspellings, unknown words). Efficiency is also retained, although ambiguous phenomena (multiple PP attachments) are recognized.
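The cascade described in the abstract (chunk recognition, then clause analysis, then dependency detection guided by verb subcategorization) can be illustrated with a toy sketch. This is not the authors' implementation: the tag set, the SUBCAT lexicon, the relation labels and every function name below are hypothetical, chosen only to show how a subcategorized preposition yields a single attachment while a non-subcategorized PP is kept as an ambiguous attachment.

```python
from typing import List, Tuple

Token = Tuple[str, str]        # (word, coarse tag): DET, ADJ, N, V, P, C
Chunk = Tuple[str, List[str]]  # (chunk kind, words in the chunk)
Dep = Tuple[str, str, str]     # (head word, relation label, dependent word)

# Hypothetical toy lexicon: prepositions each verb subcategorizes for.
SUBCAT = {"sent": {"to"}, "saw": set()}


def chunk(tokens: List[Token]) -> List[Chunk]:
    """Stage 1: group tokens into flat NP/PP/VP chunks; mark clause markers."""
    chunks: List[Chunk] = []
    i = 0
    while i < len(tokens):
        word, tag = tokens[i]
        if tag in ("P", "DET", "ADJ", "N"):
            kind = "PP" if tag == "P" else "NP"
            words: List[str] = []
            if tag == "P":  # a PP starts with its preposition
                words.append(word)
                i += 1
            # Greedily absorb the nominal material that follows.
            while i < len(tokens) and tokens[i][1] in ("DET", "ADJ", "N"):
                words.append(tokens[i][0])
                i += 1
            chunks.append((kind, words))
        else:
            kind = {"V": "VP", "C": "CL"}.get(tag, "O")
            chunks.append((kind, [word]))
            i += 1
    return chunks


def split_clauses(chunks: List[Chunk]) -> List[List[Chunk]]:
    """Stage 2: cut the chunk sequence at clause markers (kind 'CL')."""
    clauses: List[List[Chunk]] = []
    current: List[Chunk] = []
    for c in chunks:
        if c[0] == "CL":
            if current:
                clauses.append(current)
            current = []
        else:
            current.append(c)
    if current:
        clauses.append(current)
    return clauses


def dependencies(clause: List[Chunk]) -> List[Dep]:
    """Stage 3: verb-driven dependency detection inside one clause."""
    verb_idx = next((i for i, (k, _) in enumerate(clause) if k == "VP"), None)
    if verb_idx is None:
        return []  # no verb found: return nothing rather than fail
    verb = clause[verb_idx][1][-1]
    deps: List[Dep] = []
    last_np = None      # head of the nearest NP after the verb
    seen_obj = False
    for i, (kind, words) in enumerate(clause):
        head = words[-1]  # toy assumption: the last word heads the chunk
        if kind == "NP" and i < verb_idx:
            deps.append((verb, "subj", head))
        elif kind == "NP" and i > verb_idx:
            if not seen_obj:
                deps.append((verb, "obj", head))
                seen_obj = True
            last_np = head
        elif kind == "PP" and i > verb_idx:
            if words[0] in SUBCAT.get(verb, set()) or last_np is None:
                deps.append((verb, "pp", head))      # licensed by the lexicon
            else:
                deps.append((verb, "pp?", head))     # ambiguous: keep both
                deps.append((last_np, "pp?", head))
    return deps


def parse(tokens: List[Token]) -> List[List[Dep]]:
    """Run the three strata in order and return dependencies per clause."""
    return [dependencies(c) for c in split_clauses(chunk(tokens))]
```

Calling parse on a tagged sentence such as "the manager sent a report to the board because analysts saw losses in the data" returns one dependency list per clause. Because later stages only see chunk heads, the attachment decision is made over a handful of chunks rather than over every token, which is the ambiguity reduction the abstract attributes to chunking.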
Pages: 135-139
Page count: 5
Related papers
50 in total
  • [41] SheetReader: Efficient Specialized Spreadsheet Parsing
    Gavriilidis, Haralampos
    Henze, Felix
    Zacharatou, Eleni Tzirita
    Markl, Volker
    INFORMATION SYSTEMS, 2023, 115
  • [42] AMR Parsing with Latent Structural Information
    Zhou, Qiji
    Zhang, Yue
    Ji, Donghong
    Tang, Hao
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 4306 - 4319
  • [43] Efficient ambiguous parsing of mathematical formulae
    Coen, CS
    Zacchiroli, S
    MATHEMATICAL KNOWLEDGE MANAGEMENT, PROCEEDINGS, 2004, 3119 : 347 - 362
  • [44] The Ministry of Information: Parsing the Facts of Fiction
    Hastie, Amelie
    FILM QUARTERLY, 2017, 71 (01) : 65 - 72
  • [45] An Efficient SENT Signal Parsing Method
    Wen, Qian
    He, Yongyi
    2020 INFORMATION COMMUNICATION TECHNOLOGIES CONFERENCE (ICTC), 2020, : 250 - 254
  • [46] Dynamic programming as frame for efficient parsing
    Ferro, MV
    Souto, DC
    Pardo, MAA
    SCCC'98 - XVIII INTERNATIONAL CONFERENCE OF THE CHILEAN SOCIETY OF COMPUTER SCIENCE, PROCEEDINGS, 1998, : 68 - 75
  • [47] Specializing Word Embeddings (for Parsing) by Information Bottleneck
    Li, Xiang Lisa
    Eisner, Jason
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 2744 - 2754
  • [48] Terms for Efficient Proof Checking and Parsing
    Faerber, Michael
    PROCEEDINGS OF THE 12TH ACM SIGPLAN INTERNATIONAL CONFERENCE ON CERTIFIED PROGRAMS AND PROOFS, CPP 2023, 2023, : 135 - 147
  • [49] FEATURE EXTRACTION BY INCREMENTAL PARSING FOR MUSIC INDEXING
    Almoosa, Nawaf I.
    Bae, Soo Hyun
    Juang, Biing-Hwang
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 2410 - 2413
  • [50] Super Parsing: Sentiment classification with review extraction
    Liu, J
    Yao, JX
    Wu, GF
    FIFTH INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY - PROCEEDINGS, 2005, : 216 - 222