Evaluation of Dictionary Creating Methods for Finno-Ugric Minority Languages

Authors
Ferenczi, Zsanett [1]
Mittelholcz, Ivan [1]
Simon, Eszter [1]
Varadi, Tamas [1]
Affiliations
[1] Hungarian Acad Sci, Res Inst Linguist, Benczur U 33, H-1068 Budapest, Hungary
Funding
Hungarian Scientific Research Fund;
Keywords
bilingual dictionaries; evaluation; under-resourced languages; dictionary building methods;
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
In this paper, we present the evaluation of several bilingual dictionary building methods applied to {Komi-Permyak, Komi-Zyrian, Hill Mari, Meadow Mari, Northern Saami, Udmurt}-{English, Finnish, Hungarian, Russian} language pairs. Since these Finno-Ugric minority languages are under-resourced and standard dictionary building methods require a large amount of pre-processed data, we had to find alternative methods. In a thorough evaluation, we compare the results for each method; the comparison confirmed our expectation that the precision of standard lexicon building methods is quite low for under-resourced languages. However, methods that use Wikipedia title pairs extracted via inter-language links, as well as Wiktionary-based methods, provided useful results. The newly created word pairs, enriched with several kinds of linguistic information, are to be deployed on the web in the framework of Wiktionary. With our dictionaries, the number of Wiktionary entries in the above-mentioned Finno-Ugric minority languages can be multiplied.
Pages: 1989-1994
Page count: 6