Multilingual open information extraction: Challenges and opportunities

Cited by: 0

Authors:

Claro D.B. [1]

Souza M. [1]

Xavier C.C. [2]

Oliveira L. [1]

Affiliations:

[1] FORMAS Research Group, Computer Science Department, Federal University of Bahia, Salvador - BA

[2] FORMAS Research Group, Federal Institute of Rio Grande do Sul, Porto Alegre - RS

Source:

Information (Switzerland) | 2019 / Volume 10 / Issue 7

Keywords:

Multilingual; Open information extraction; Parallel corpus

DOI:

10.3390/INFO10070228

Abstract:

The number of documents published on the Web in languages other than English grows every year. As a consequence, the need to extract useful information from different languages increases, highlighting the importance of research into Open Information Extraction (OIE) techniques. Different OIE methods have dealt with features from a single language; however, few approaches tackle multilingual aspects. In those approaches, multilingualism is restricted to processing text in different languages, rather than exploring cross-linguistic resources, which results in low precision due to the use of general rules. Multilingual methods have been applied to numerous problems in Natural Language Processing, achieving satisfactory results and demonstrating that knowledge acquisition for a language can be transferred to other languages to improve the quality of the facts extracted. We argue that a multilingual approach can enhance OIE methods as it is ideal to evaluate and compare OIE systems, and therefore can be applied to the collected facts. In this work, we discuss how the transfer of knowledge between languages can increase acquisition from multilingual approaches. We provide a roadmap of the Multilingual Open IE area concerning state-of-the-art studies. Additionally, we evaluate the transfer of knowledge to improve the quality of the facts extracted in each language. Moreover, we discuss the importance of a parallel corpus to evaluate and compare multilingual systems. © 2019 by the authors.

