Using EuroWordNet in a concept-based approach to cross-language text retrieval

Cited by: 0
Authors
Gonzalo, Julio [1, 2]
Verdejo, Felisa [1]
Chugur, Irina [1]
Affiliations
[1] UNED, Ciudad Universitaria, Madrid, Spain
[2] Depto. Ing. Electrica, E., UNED, Ciudad Universitaria, s.n., 28040 Madrid, Spain
Source
Keywords
Computational linguistics - Database systems - Errors - Indexing (of information) - Mathematical models - Natural language processing systems - Sensitivity analysis - Text processing
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
We present an approach to cross-language text retrieval based on the EuroWordNet (EWN) multilingual semantic database. EuroWordNet is a multilingual, WordNet-like database with basic semantic relations between words for several European languages (English, Dutch, Spanish, Italian, German, French, Czech, and Estonian). In addition to the relations in WordNet 1.5, EWN includes domain labels, cross-language relations, and cross-part-of-speech relations, which are directly useful for multilingual information retrieval. In our approach, documents in any language covered by EuroWordNet are indexed in a space of language-independent concepts (the EuroWordNet Inter Lingual Index), thus turning term weighting and query/document matching into language-independent tasks. We report on the results of a number of experiments that measure the potential benefits of the approach and its tolerance to word sense disambiguation errors. In our monolingual experiments, the classical vector space model for text retrieval is shown to give better results (up to 29% better) when WordNet synsets, rather than word forms, are used as the indexing space. This result is obtained for a manually disambiguated test collection derived from the Semcor annotated corpus. The sensitivity of retrieval performance to (automatic) disambiguation errors is also measured. Our preliminary bilingual experiments, also reported here, show that our approach can noticeably outperform a naive dictionary-based translation of the query terms into the target language.
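The idea of indexing in a language-independent concept space can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicon, the `ili:` identifiers, and the function names are all hypothetical, the TF-IDF weighting is a common textbook variant chosen for illustration, and word sense disambiguation is skipped by giving every word exactly one concept.

```python
import math
from collections import Counter

# Hypothetical toy lexicon: (language, word) -> concept identifier.
# The identifiers are invented for illustration; they are NOT real
# EuroWordNet Inter Lingual Index records.
TOY_LEXICON = {
    ("en", "car"): "ili:car",
    ("en", "automobile"): "ili:car",
    ("es", "coche"): "ili:car",
    ("en", "road"): "ili:road",
    ("es", "carretera"): "ili:road",
    ("en", "bank"): "ili:bank_institution",
    ("es", "banco"): "ili:bank_institution",
    ("en", "loan"): "ili:loan",
    ("es", "préstamo"): "ili:loan",
}


def to_concepts(lang, text):
    """Map each known word to its concept id; unknown words are dropped."""
    words = text.lower().split()
    return [TOY_LEXICON[(lang, w)] for w in words if (lang, w) in TOY_LEXICON]


def tfidf_vectors(concept_docs):
    """Build sparse TF-IDF vectors over concepts (smoothed IDF, an assumption)."""
    n = len(concept_docs)
    df = Counter(c for doc in concept_docs for c in set(doc))
    idf = {c: math.log((1 + n) / (1 + d)) + 1.0 for c, d in df.items()}
    vectors = [{c: tf * idf[c] for c, tf in Counter(doc).items()}
               for doc in concept_docs]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(query_lang, query, docs):
    """Rank (lang, text) documents against a query in any covered language."""
    vectors, idf = tfidf_vectors([to_concepts(lang, text) for lang, text in docs])
    q_counts = Counter(to_concepts(query_lang, query))
    q_vec = {c: tf * idf.get(c, 0.0) for c, tf in q_counts.items()}
    scores = [cosine(q_vec, v) for v in vectors]
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return order, scores


DOCS = [
    ("en", "the car on the road"),
    ("es", "el banco concede un préstamo"),
    ("en", "automobile loan"),
]
ORDER, SCORES = rank("es", "coche carretera", DOCS)
```

Because the query and the documents are both reduced to concept identifiers before weighting, the Spanish query "coche carretera" is matched against the English documents with no separate translation step, and "car" and "automobile" fall on the same index term.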
Pages: 647 - 678
Related papers
50 in total
  • [1] Using EuroWordNet in a concept-based approach to cross-language text retrieval
    Gonzalo, J
    Verdejo, F
    Chugur, I
    APPLIED ARTIFICIAL INTELLIGENCE, 1999, 13 (07) : 647 - 678
  • [2] Applying EuroWordNet to cross-language text retrieval
    Gonzalo, J
    Verdejo, F
    Peters, C
    Calzolari, N
    COMPUTERS AND THE HUMANITIES, 1998, 32 (2-3) : 185 - 207
  • [3] Applying EuroWordNet to Cross-Language Text Retrieval
    Julio Gonzalo
    Felisa Verdejo
    Carol Peters
    Nicoletta Calzolari
    Computers and the Humanities, 1998, 32 : 185 - 207
  • [4] Semantic annotation for concept-based cross-language medical information retrieval
    Volk, M
    Ripplinger, B
    Vintar, S
    Buitelaar, P
    Raileanu, D
    Sacaleanu, B
    INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS, 2002, 67 (1-3) : 97 - 112
  • [5] Cross-language information retrieval using EuroWordNet and word sense disambiguation
    Clough, P
    Stevenson, M
    ADVANCES IN INFORMATION RETRIEVAL, PROCEEDINGS, 2004, 2997 : 327 - 337
  • [6] The effects of conjunction, facet structure, and dictionary combinations in concept-based cross-language retrieval
    Pirkola A.
    Keskustalo H.
    Järvelin K.
    Information Retrieval, 1999, 1 (3) : 217 - 250
  • [7] Concept-Based Cross Language Retrieval for Thai Medicine Recipes
    Polpinij, Jantima
    EMERGENCE OF DIGITAL LIBRARIES - RESEARCH AND PRACTICES, 2014, 8839 : 320 - 327
  • [8] Evaluating the Contribution of EuroWordNet and Word Sense Disambiguation to Cross-language Information Retrieval
    Clough, Paul
    Stevenson, Mark
    GWC 2004: SECOND INTERNATIONAL WORDNET CONFERENCE, PROCEEDINGS, 2003 : 97 - 105
  • [9] Adaptive support for cross-language text retrieval
    De Luca, Ernesto William
    Nuernberger, Andreas
    ADAPTIVE HYPERMEDIA AND ADAPTIVE WEB-BASED SYSTEMS, PROCEEDINGS, 2006, 4018 : 425 - 429
  • [10] Cross-language text retrieval by query translation using term reweighting
    Kang, I
    Kwon, OW
    Lee, JH
    Lee, G
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2000, 14 (05) : 617 - 629