Efficient parsing for Information Extraction

Cited by: 0
Authors
Basili, R [1]
Pazienza, MT [1]
Zanzotto, FM [1]
Affiliation
[1] Univ Roma Tor Vergata, Dipartimento Informat Sist & Prod, I-00173 Rome, Italy
Source
ECAI 1998: 13TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, PROCEEDINGS | 1998
Keywords
DOI
Not available
CLC classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Several successful Information Extraction systems have recently replaced their core parsing components with shallow but more efficient recognizers. In this paper we argue that the absence of an underlying grammatical recognizer, given the complex nature of several (non-English) languages, is a strong limitation for text processing functionalities such as those an IE system needs. We propose a robust and efficient syntactic recognizer aimed mainly at capturing grammatical information crucial for several linguistic and non-linguistic inferences. The proposed system is based on a novel architecture exploiting two major principles: lexicalization and stratification of the parsing process. As several linguistic theories (e.g. HPSG) and parsing frameworks (e.g. LTAG, SLTAG, lexicalized probabilistic parsing) suggest, lexicon-driven systems ensure suitable forms of grammatical control for many complex phenomena. In our system an analysis guided by information on typical verb projections (e.g. verb subcategorization structures) is coupled with extended locality constraints (i.e. recognition of clause boundaries). Furthermore, stratification is also employed. A cascade of processing steps starts from chunk recognition and proceeds through clause analysis to dependency detection. Recognition of chunks minimizes the input ambiguity passed on to the remaining phases. The resulting system is thus robust against ungrammatical phenomena (e.g. complex clause embedding, misspellings, unknown words). Efficiency is also retained, although ambiguous phenomena (multiple PP attachments) are recognized.
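The cascade described in the abstract (chunk recognition, then clause analysis, then dependency detection guided by verb subcategorization) can be illustrated with a toy sketch. This is not the authors' system: the tag set, the `SUBCAT` lexicon, the `Chunk` type and all function names below are hypothetical and chosen only for illustration, and the grammatical-role labels are deliberately naive.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical toy lexicon: verb -> prepositions it subcategorizes for.
SUBCAT = {"sent": {"to"}}

@dataclass
class Chunk:
    kind: str                   # "NP", "PP", "VP" or "SEP" (clause separator)
    head: str                   # head word of the chunk
    prep: Optional[str] = None  # preposition, for PP chunks only

def chunk(tagged):
    """Stage 1: group (word, tag) pairs into flat, non-recursive chunks."""
    chunks, i = [], 0
    while i < len(tagged):
        word, tag = tagged[i]
        if tag == "V":
            chunks.append(Chunk("VP", word)); i += 1
        elif tag == "CONJ":
            chunks.append(Chunk("SEP", word)); i += 1
        elif tag in ("P", "DET", "ADJ", "N"):
            prep = word if tag == "P" else None
            j = i + 1 if prep else i
            while j < len(tagged) and tagged[j][1] in ("DET", "ADJ"):
                j += 1
            if j < len(tagged) and tagged[j][1] == "N":
                chunks.append(Chunk("PP" if prep else "NP", tagged[j][0], prep))
                i = j + 1
            else:
                i += 1  # robustness: skip material that forms no chunk
        else:
            i += 1      # unknown tag: skip the word instead of failing
    return chunks

def clauses(chunks):
    """Stage 2: split the chunk sequence at clause separators."""
    out, cur = [], []
    for c in chunks:
        if c.kind == "SEP":
            if cur:
                out.append(cur)
            cur = []
        else:
            cur.append(c)
    if cur:
        out.append(cur)
    return out

def dependencies(clause):
    """Stage 3: link chunks to the clause verb, guided by SUBCAT."""
    vi = next((k for k, c in enumerate(clause) if c.kind == "VP"), None)
    if vi is None:
        return []
    verb = clause[vi].head
    frame = SUBCAT.get(verb, set())
    deps, last_nominal = [], None
    for k, c in enumerate(clause):
        if c.kind == "NP":
            deps.append((verb, "subj" if k < vi else "obj", c.head))
            if k > vi:
                last_nominal = c
        elif c.kind == "PP" and k > vi:
            if c.prep in frame:
                # The lexicon licenses this PP as a verb argument.
                deps.append((verb, c.prep, c.head))
            else:
                # Ambiguous attachment: keep every candidate, marked "?".
                deps.append((verb, c.prep + "?", c.head))
                if last_nominal is not None:
                    deps.append((last_nominal.head, c.prep + "?", c.head))
            last_nominal = c
    return deps

def parse(tagged):
    """Run the three strata in sequence and collect all dependencies."""
    return [d for cl in clauses(chunk(tagged)) for d in dependencies(cl)]

SENTENCE = [("the", "DET"), ("company", "N"), ("sent", "V"), ("a", "DET"),
            ("letter", "N"), ("to", "P"), ("the", "DET"), ("manager", "N"),
            ("with", "P"), ("a", "DET"), ("signature", "N"), ("and", "CONJ"),
            ("the", "DET"), ("manager", "N"), ("arrived", "V")]

if __name__ == "__main__":
    for dep in parse(SENTENCE):
        print(dep)
```

In this sketch the "to" phrase is attached directly to the verb because the toy lexicon licenses it, while the "with" phrase yields two candidate attachments (to the verb and to the preceding noun), which mirrors the idea of recognizing multiple PP attachments without resolving them early.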
Pages: 135-139
Page count: 5
Related papers
50 in total
  • [31] Information locking and its resource-efficient extraction
    Goswami, Suchetana
    Halder, Saronath
    PHYSICAL REVIEW A, 2023, 108 (01)
  • [32] RePersian: An Efficient Open Information Extraction Tool in Persian
    Saheb-Nassagh, Raana
    Asgari, Majid
    Minaei-Bidgoli, Behrouz
    2020 6TH INTERNATIONAL CONFERENCE ON WEB RESEARCH (ICWR), 2020, : 93 - 99
  • [33] Historical Map Toponym Extraction for Efficient Information Retrieval
    Lenc, Ladislav
    Martinek, Jiri
    Baloun, Josef
    Prantl, Martin
    Kral, Pavel
    DOCUMENT ANALYSIS SYSTEMS, DAS 2022, 2022, 13237 : 171 - 183
  • [34] ON EFFICIENT GLOBAL INFORMATION EXTRACTION METHODS FOR PARALLEL PROCESSORS
    REEVES, AP
    COMPUTER GRAPHICS AND IMAGE PROCESSING, 1980, 14 (02): 159 - 169
  • [35] Efficient information extraction over evolving text data
    Chen, Fei
    Doan, AnHai
    Yang, Jun
    Ramakrishnan, Raghu
    2008 IEEE 24TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, VOLS 1-3, 2008, : 943 - +
  • [36] Efficient Temporal Information Extraction from Korean Documents
    Lim, Chae-Gyun
    Choi, Ho-Jin
    2017 18TH IEEE INTERNATIONAL CONFERENCE ON MOBILE DATA MANAGEMENT (IEEE MDM 2017), 2017, : 366 - 370
  • [38] SYNTACTIC PARSING FOR INFORMATION-RETRIEVAL
    METZLER, DP
    NOREAULT, T
    HEIDORN, B
    PROCEEDINGS OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE, 1983, 20 : 269 - 273
  • [39] Efficient Gerber File Parsing and Drawing
    Qi, Min
    Wang, Zitong
    Wei, Xiaoyu
    Wang, Aili
    PROCEEDINGS OF THE 2018 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND MACHINE INTELLIGENCE (MLMI 2018), 2018, : 13 - 17
  • [40] Efficient techniques for parsing with tree automata
    Groschwitz, Jonas
    Koller, Alexander
    Johnson, Mark
    PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1, 2016, : 2042 - 2051