Exploiting Comparable Corpora for Cross-Language Information Retrieval

Cited by: 0

Authors:

Sadat, Fatiha [1]

Affiliations:

[1] Univ Quebec, Dept Comp Sci, Montreal, PQ H3C 3P8, Canada

Source:

PRICAI 2010: TRENDS IN ARTIFICIAL INTELLIGENCE | 2010 / Vol. 6230

Keywords:

Cross-language information retrieval; comparable corpora; similarity; co-occurrence tendency

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

With the explosive growth of the World Wide Web, large-scale comparable corpora have become more abundant and accessible than parallel corpora. Strategies for bilingual terminology extraction from comparable texts therefore deserve more attention, both to enrich existing bilingual lexicons and thesauri and to enhance Cross-Language Information Retrieval. In the present paper, we focus on enhancing Cross-Language Information Retrieval using a two-stage corpus-based translation model that includes bi-directional extraction of bilingual terminology from comparable corpora and selection of the best translation alternatives on the basis of their morphological knowledge. The impact of comparable corpora on the performance of the Cross-Language Information Retrieval process is evaluated in this study, and the results indicate that the effect is clearly positive, especially when using the linear combination with bilingual dictionaries and the Japanese-English language pair.


Pages: 662-667

Number of pages: 6

50 records in total

[1] Effects of Comparable Corpora on Cross-language Information Retrieval
Sadat, Fatiha
NLPCS 2010: NATURAL LANGUAGE PROCESSING AND COGNITIVE SCIENCE, 2010 : 53 - 59
[2] Creating and exploiting a comparable corpus in cross-language information retrieval
Talvensaari, Tuomas
Laurikkala, Jorma
Jarvelin, Kalervo
Juhola, Martti
Keskustalo, Heikki
ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2007, 25 (01)
[3] Knowledge acquisition from comparable corpora for cross-language information retrieval
Sadat, Fatiha
INFORMATION MANAGEMENT IN THE MODERN ORGANIZATIONS: TRENDS & SOLUTIONS, VOLS 1 AND 2, 2008 : 745 - 747
[4] Using Comparable Corpora to Improve the Effectiveness of Cross-Language Information Retrieval
Sadat, Fatiha
ADVANCES IN NATURAL LANGUAGE PROCESSING, 2010, 6233 : 320 - 331
[5] Exploiting Comparable Corpora and Bilingual Dictionaries for Cross-Language Text Categorization
Gliozzo, Alfio
Strapparava, Carlo
COLING/ACL 2006, VOLS 1 AND 2, PROCEEDINGS OF THE CONFERENCE, 2006 : 553 - 560
[6] Extracting translations from comparable corpora for Cross-Language Information Retrieval using the language modeling framework
Rahimi, Razieh
Shakery, Azadeh
King, Irwin
INFORMATION PROCESSING & MANAGEMENT, 2016, 52 (02) : 299 - 318
[7] Cross-language information retrieval: experiments based on CLEF 2000 corpora
Savoy, J
INFORMATION PROCESSING & MANAGEMENT, 2003, 39 (01) : 75 - 115
[8] Cross-language information retrieval models based on latent topic models trained with document-aligned comparable corpora
Vulic, Ivan
De Smet, Wim
Moens, Marie-Francine
INFORMATION RETRIEVAL, 2013, 16 (03) : 331 - 368
[9] Cross-language information retrieval
Nie J.-Y.
Synthesis Lectures on Human Language Technologies, 2010, 3 (01) : 1 - 142
[10] Cross-language information retrieval models based on latent topic models trained with document-aligned comparable corpora
Ivan Vulić
Wim De Smet
Marie-Francine Moens
Information Retrieval, 2013, 16 : 331 - 368
