Using EuroWordNet in a concept-based approach to cross-language text retrieval

Cited by: 0
Authors
Gonzalo, Julio [1, 2]
Verdejo, Felisa [1]
Chugur, Irina [1 ]
Affiliations
[1] UNED, Ciudad Universitaria, Madrid, Spain
[2] Depto. Ing. Electrica, E., UNED, Ciudad Universitaria, s.n., 28040 Madrid, Spain
Keywords
Computational linguistics - Database systems - Errors - Indexing (of information) - Mathematical models - Natural language processing systems - Sensitivity analysis - Text processing
DOI
Not available
Abstract
We present an approach to cross-language text retrieval based on the EuroWordNet (EWN) multilingual semantic database. EuroWordNet is a multilingual, WordNet-like database with basic semantic relations between words for several European languages (English, Dutch, Spanish, Italian, German, French, Czech, and Estonian). In addition to the relations in WordNet 1.5, EWN includes domain labels, cross-language, and cross-part-of-speech relations, which are directly useful for multilingual information retrieval. In our approach, documents in any language covered by EuroWordNet are indexed in a space of language-independent concepts (the EuroWordNet Inter Lingual Index), thus turning term weighting and query/document matching into language-independent tasks. We report on the results of a number of experiments that measure the potential benefits of the approach and its tolerance to word sense disambiguation errors. In our monolingual experiments, the classical, vector space model for text retrieval is shown to give better results (up to 29% better in our experiments) if WordNet synsets are chosen as the indexing space, instead of word forms. This result is obtained for a manually disambiguated test collection derived from the Semcor annotated corpus. The sensitivity of retrieval performance to (automatic) disambiguation errors is also measured. Our preliminary bilingual experiments, also reported here, show that our approach can sensibly outperform a naive, dictionary-based, translation of the query terms into the target language.
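The indexing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and uses no real EuroWordNet data: the lexicon `TOY_LEXICON`, the concept identifiers such as `C_CAR`, and the helper names `to_concepts`, `build_index`, and `search` are all hypothetical. It also assumes word senses are already disambiguated and uses a common smoothed TF-IDF weighting, which the abstract does not specify.

```python
import math
from collections import Counter

# Hypothetical toy lexicon standing in for the Inter Lingual Index:
# (language, word) -> language-independent concept identifier.
# Real EuroWordNet records look different; senses are assumed disambiguated.
TOY_LEXICON = {
    ("en", "car"): "C_CAR",
    ("en", "automobile"): "C_CAR",
    ("es", "coche"): "C_CAR",
    ("en", "bank"): "C_BANK",
    ("es", "banco"): "C_BANK",
    ("en", "money"): "C_MONEY",
    ("es", "dinero"): "C_MONEY",
    ("en", "house"): "C_HOUSE",
    ("es", "casa"): "C_HOUSE",
}


def to_concepts(tokens, lang):
    """Map word forms to concept identifiers, dropping unknown words."""
    return [TOY_LEXICON[(lang, t)] for t in tokens if (lang, t) in TOY_LEXICON]


def build_index(docs):
    """docs: doc_id -> (lang, tokens). Returns (tf-idf vectors, idf table)."""
    counts = {d: Counter(to_concepts(toks, lang)) for d, (lang, toks) in docs.items()}
    n_docs = len(counts)
    doc_freq = Counter(c for doc_counts in counts.values() for c in doc_counts)
    # Smoothed idf (an assumption; any standard weighting would do here).
    idf = {c: math.log((1 + n_docs) / (1 + df)) + 1.0 for c, df in doc_freq.items()}
    vectors = {
        d: {c: tf * idf[c] for c, tf in doc_counts.items()}
        for d, doc_counts in counts.items()
    }
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def search(query_tokens, query_lang, vectors, idf):
    """Rank documents against a query in any language the lexicon covers."""
    q_counts = Counter(to_concepts(query_tokens, query_lang))
    q_vec = {c: tf * idf.get(c, 0.0) for c, tf in q_counts.items()}
    scores = {d: cosine(q_vec, vec) for d, vec in vectors.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    docs = {
        "d_en": ("en", ["car", "money", "bank"]),
        "d_es": ("es", ["casa", "coche"]),
        "d_es2": ("es", ["banco", "dinero"]),
    }
    vectors, idf = build_index(docs)
    for doc_id, score in search(["automobile"], "en", vectors, idf):
        print(doc_id, round(score, 3))
```

Because both queries and documents are reduced to concept identifiers before weighting, the English query "automobile" and the Spanish query "coche" produce the same ranking in this sketch, which is the sense in which term weighting and matching become language-independent.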
Pages: 647-678
Related papers
50 in total
  • [41] Study on cross-language information retrieval
    Si, Shen
    PROCEEDINGS OF 2008 INTERNATIONAL PRE-OLYMPIC CONGRESS ON COMPUTER SCIENCE, VOL I: COMPUTER SCIENCE AND ENGINEERING, 2008, : 6 - 10
  • [42] Cross-language multimedia information retrieval
    Flank, S
    6TH APPLIED NATURAL LANGUAGE PROCESSING CONFERENCE/1ST MEETING OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE CONFERENCE AND PROCEEDINGS OF THE ANLP-NAACL 2000 STUDENT RESEARCH WORKSHOP, 2000, : 13 - 20
  • [43] Using an Image-Text Parallel Corpus and the Web for Query Expansion in Cross-Language Image Retrieval
    Chang, Yih-Chen
    Chen, Hsin-Hsi
    ADVANCES IN MULTILINGUAL AND MULTIMODAL INFORMATION RETRIEVAL, 2008, 5152 : 504 - 511
  • [44] Using concept-based indexing to improve language modeling approach to genomic IR
    Zhou, Xiaohua
    Zhang, Xiaodan
    Hu, Xiaohua
    ADVANCES IN INFORMATION RETRIEVAL, 2006, 3936 : 444 - 455
  • [45] A Concept-Based Interactive Biomedical Image Retrieval Approach using Visualness and Spatial Information
    Rahman, Md Mahmudur
    Antani, Sameer K.
    Demner-Fushman, Dina
    Thoma, George R.
    MEDICAL IMAGING 2015: PACS AND IMAGING INFORMATICS: NEXT GENERATION AND INNOVATIONS, 2015, 9418
  • [46] Cross-Language Image Retrieval Using Hitting Time Model
    Zhang, Lu
    Su, Qi
    Sun, Bin
    Wang, Chanjuan
    11TH CHINESE LEXICAL SEMANTICS WORKSHOP (CKSW2010), 2010, : 348 - 355
  • [47] Using Mutual Information Technique in Cross-Language Information Retrieval
    Sari, Syandra
    Adriani, Mirna
    DIGITAL LIBRARIES: UNIVERSAL AND UBIQUITOUS ACCESS TO INFORMATION, PROCEEDINGS, 2008, 5362 : 276 - +
  • [48] Cross-Language Information Retrieval Using PARAFAC2
    Chew, Peter A.
    Bader, Brett W.
    Kolda, Tamara G.
    Abdelali, Ahmed
    KDD-2007 PROCEEDINGS OF THE THIRTEENTH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2007, : 143 - +
  • [49] Cross-Language Information Retrieval using Japanese and English WordNets
    Ueno, Ryo
    Klyuev, Vitaly
    2012 INTERNATIONAL CONFERENCE ON APPLIED INFORMATICS AND COMMUNICATION (ICAIC 2012), 2013, : 198 - 203
  • [50] CROSS-LANGUAGE DOCUMENT RETRIEVAL BY USING NONLINEAR SEMANTIC MAPPING
    Banchs, Rafael E.
    Costa-Jussa, Marta R.
    APPLIED ARTIFICIAL INTELLIGENCE, 2013, 27 (09) : 781 - 802