Towards Producing Bilingual Lexica from Monolingual Corpora

Cited by: 0
Authors
Han, Jingyi [1]
Bel, Nuria [1]
Affiliation
[1] Univ Pompeu Fabra, Roc Boronat 138, Barcelona 08018, Spain
Keywords
automatic bilingual lexicon production; lexical resources; bilingual dictionaries
DOI
Not available
Chinese Library Classification number
H [Language and Writing]
Discipline classification code
05
Abstract
Bilingual lexica are the basis for many cross-lingual natural language processing tasks. Recent work has shown success in learning bilingual dictionaries by taking advantage of comparable corpora and a diverse set of signals derived from monolingual corpora. In the present work, we describe an approach that automatically learns bilingual lexica by training a supervised classifier on word-embedding-based vectors of only a few hundred translation-equivalent word pairs. The word embedding representations of the translation pairs were obtained from source and target monolingual corpora, which are not necessarily related. Our classifier predicts whether or not a new word pair is in a translation relation. We tested it on two quite distinct language pairs: Chinese-Spanish and English-Spanish. The classifiers achieved precision and recall above 0.90 for both language pairs in different evaluation scenarios. These results show a high potential for this method to be used in bilingual lexicon production for language pairs with limited amounts of parallel or comparable corpora, in particular for phrase table expansion in Statistical Machine Translation systems.
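As a rough illustration of how training instances for such a classifier might be assembled, the sketch below builds labeled pair vectors from a small seed lexicon and two independently trained embedding tables. The abstract does not say how the two embeddings are combined or how negative examples are chosen, so the concatenation in `pair_vector`, the random mismatching in `build_instances`, and all toy vectors and names here are assumptions made for illustration only, not the authors' implementation.

```python
import random

# Toy monolingual embeddings (hypothetical values). In practice each table
# would be trained separately on an unrelated source or target corpus, so the
# two spaces need not share dimensions.
SRC_VECS = {"dog": [0.9, 0.1, 0.0], "cat": [0.8, 0.3, 0.1], "house": [0.1, 0.9, 0.4]}
TGT_VECS = {"perro": [0.2, 0.7], "gato": [0.3, 0.6], "casa": [0.9, 0.1]}

# Small seed lexicon of known translation equivalents (English-Spanish).
SEED = [("dog", "perro"), ("cat", "gato"), ("house", "casa")]


def pair_vector(src_word, tgt_word):
    """Represent a candidate pair by concatenating its two embeddings (assumed)."""
    return SRC_VECS[src_word] + TGT_VECS[tgt_word]


def build_instances(seed, negatives_per_pair=1, rng=None):
    """Return (features, labels): 1 for seed pairs, 0 for random mismatches."""
    rng = rng or random.Random(0)
    known = set(seed)
    targets = sorted(TGT_VECS)
    features, labels = [], []
    for src, tgt in seed:
        # Positive example: a pair taken from the seed lexicon.
        features.append(pair_vector(src, tgt))
        labels.append(1)
        # Negative examples: the same source word with a non-translation.
        wrong = [t for t in targets if (src, t) not in known]
        for neg in rng.sample(wrong, min(negatives_per_pair, len(wrong))):
            features.append(pair_vector(src, neg))
            labels.append(0)
    return features, labels


X, y = build_instances(SEED)
print(len(X), len(X[0]), y)
```

The resulting `X` and `y` could then be passed to any off-the-shelf binary classifier, and its predictions on unseen word pairs scored with precision and recall as in the abstract.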
Pages: 2222-2227
Page count: 6