Efficient parsing for Information Extraction

Cited by: 0

Authors:

Basili, R [1]

Pazienza, MT [1]

Zanzotto, FM [1]

Affiliations:

[1] Univ Roma Tor Vergata, Dipartimento Informat Sist & Prod, I-00173 Rome, Italy

Source:

ECAI 1998: 13TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, PROCEEDINGS | 1998

Keywords:

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Several successful Information Extraction (IE) systems have recently replaced their core parsing components with shallow but more efficient recognizers. In this paper we argue that, given the complex nature of several non-English languages, the absence of an underlying grammatical recognizer is a strong limitation for text processing functionalities such as those an IE system needs. We propose a robust and efficient syntactic recognizer aimed mainly at capturing grammatical information that is crucial for several linguistic and non-linguistic inferences. The proposed system is based on a novel architecture that exploits two major principles: lexicalization and stratification of the parsing process. As several linguistic theories (e.g. HPSG) and parsing frameworks (e.g. LTAG, SLTAG, lexicalized probabilistic parsing) suggest, lexicon-driven systems ensure suitable forms of grammatical control for many complex phenomena. In our system, an analysis guided by information on typical verb projections (e.g. verb subcategorization structures) is coupled with extended locality constraints (i.e. recognition of clause boundaries). Furthermore, stratification is employed: a cascade of processing steps starts from chunk recognition and proceeds through clause analysis to dependency detection. Recognizing chunks first minimizes the ambiguity of the input passed to the remaining phases. The resulting system is thus robust against ungrammatical phenomena (e.g. complex clause embedding, misspellings, unknown words). Efficiency is also retained, even though ambiguous phenomena (multiple PP attachments) are recognized.
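The cascade described in the abstract (chunk recognition, then clause analysis, then dependency detection guided by verb subcategorization) can be illustrated with a toy sketch. This is not the authors' implementation: the tag set, the SUBCAT lexicon, the Chunk type and all function names are hypothetical, chosen only to show how each stratum consumes the output of the previous one and how an ambiguous PP attachment can be recorded rather than resolved.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon: verb -> prepositions it subcategorizes for.
SUBCAT = {"put": {"on", "in"}, "saw": set()}


@dataclass
class Chunk:
    kind: str    # "NP", "PP", "VP" or "SEP" (clause separator)
    words: list
    head: str    # last noun for NP, preposition for PP, the word itself otherwise


def chunk(tagged):
    """Stage 1: group (word, tag) pairs into flat, non-recursive chunks."""
    chunks, i = [], 0
    while i < len(tagged):
        word, tag = tagged[i]
        if tag == "V":
            chunks.append(Chunk("VP", [word], word))
            i += 1
        elif tag in ("P", "DET", "ADJ", "N"):
            kind = "PP" if tag == "P" else "NP"
            j = i + 1 if tag == "P" else i
            # Consume the nominal material that belongs to this chunk.
            while j < len(tagged) and tagged[j][1] in ("DET", "ADJ", "N"):
                j += 1
            words = [w for w, _ in tagged[i:j]]
            chunks.append(Chunk(kind, words, word if kind == "PP" else words[-1]))
            i = j
        else:
            # Any other tag (e.g. a subordinator) is treated as a clause boundary.
            chunks.append(Chunk("SEP", [word], word))
            i += 1
    return chunks


def clauses(chunks):
    """Stage 2: split the chunk sequence at clause boundaries."""
    out, cur = [], []
    for c in chunks:
        if c.kind == "SEP":
            if cur:
                out.append(cur)
                cur = []
        else:
            cur.append(c)
    if cur:
        out.append(cur)
    return out


def dependencies(clause):
    """Stage 3: link chunks to the clause verb, using SUBCAT for PPs."""
    verb = next((c for c in clause if c.kind == "VP"), None)
    if verb is None:
        return []
    links, last_np, seen_verb = [], None, False
    for c in clause:
        if c is verb:
            seen_verb = True
        elif c.kind == "NP":
            links.append((verb.head, "obj" if seen_verb else "subj", c.head))
            last_np = c
        elif c.kind == "PP":
            if c.head in SUBCAT.get(verb.head, set()) or last_np is None:
                # The verb's lexical entry licenses this PP: attach it to the verb.
                links.append((verb.head, "pp", c.head))
            else:
                # Ambiguous attachment: keep both candidates, marked "pp?".
                links.append((verb.head, "pp?", c.head))
                links.append((last_np.head, "pp?", c.head))
    return links


def parse(tagged):
    """Run the three strata in sequence; returns one link list per clause."""
    return [dependencies(cl) for cl in clauses(chunk(tagged))]
```

In this sketch, a PP whose preposition appears in the verb's SUBCAT entry is attached directly to the verb, while any other PP yields two candidate links, so the ambiguity is kept visible for later processing instead of being decided early.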


Pages: 135-139

Page count: 5
