CogNet: a Large-Scale Cognate Database

Authors
Batsuren, Khuyagbaatar [1]
Bella, Gabor [1]
Giunchiglia, Fausto [1, 2]
Affiliations
[1] Univ Trento, DISI, Trento, Italy
[2] Jilin Univ, Changchun, Jilin, Peoples R China
Funding
European Union Horizon 2020
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper introduces CogNet, a new, large-scale lexical database that provides cognates (words of common origin and meaning) across languages. The database currently contains 3.1 million cognate pairs across 338 languages using 35 writing systems. The paper also describes the automated method by which cognates were computed from publicly available wordnets, with accuracy evaluated at 94%. Finally, statistics and early insights about the cognate data are presented, hinting at possible future exploitation of the resource by various fields of linguistics.
Pages: 3136-3145
Page count: 10