BLARK for multi-dialect languages: towards the Kurdish BLARK

Times cited: 3
Author
Hassani, Hossein [1, 2]
Affiliations
[1] Univ Kurdistan Hewler, Dept Comp Sci & Engn, Erbil, Kurdistan Regio, Iraq
[2] Univ Sch Sci & Technol, Dept Comp Sci, Sarajevo, Bosnia & Herceg
Keywords
Kurdish BLARK; Language tools; Computational linguistics; Natural language processing;
DOI
10.1007/s10579-017-9400-0
Chinese Library Classification number
TP39 [Computer applications];
Subject classification code
081203 ; 0835 ;
Abstract
In this paper we introduce the Kurdish BLARK (Basic Language Resource Kit). The original BLARK did not consider multi-dialect characteristics and generally targeted reasonably well-resourced languages. To address these two features, we extended BLARK and applied the proposed extension to Kurdish. The Kurdish language not only faces a paucity of resources, but also embraces several dialects within a complex linguistic context. This paper presents the Kurdish BLARK and shows that, from natural language processing and computational linguistics perspectives, the revised BLARK provides a more applicable view of languages with characteristics similar to those of Kurdish.
Pages: 625-644
Number of pages: 20
Related papers
49 in total
  • [1] BLARK for multi-dialect languages: towards the Kurdish BLARK
    Hossein Hassani
    Language Resources and Evaluation, 2018, 52 : 625 - 644
  • [2] Diabase: Towards a diachronic BLARK in support of historical studies
    Borin, Lars
    Forsberg, Markus
    Kokkinakis, Dimitrios
    LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2010, : 35 - 42
  • [3] Multi-dialect Workflows
    Kalinichenko, Leonid
    Stupnikov, Sergey
    Vovchenko, Alexey
    Kovalev, Dmitry
    ADVANCES IN DATABASES AND INFORMATION SYSTEMS (ADBIS 2014), 2014, 8716 : 352 - 365
  • [4] Tibetan Multi-Dialect Speech and Dialect Identity Recognition
    Zhao, Yue
    Yue, Jianjian
    Song, Wei
    Xu, Xiaona
    Li, Xiali
    Wu, Licheng
    Ji, Qiang
    CMC-COMPUTERS MATERIALS & CONTINUA, 2019, 60 (03): : 1223 - 1235
  • [5] Multi-Dialect Arabic Speech Recognition
    Ali, Abbas Raza
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
  • [6] Continual Learning for Multi-Dialect Acoustic Models
    Houston, Brady
    Kirchhoff, Katrin
    INTERSPEECH 2020, 2020, : 576 - 580
  • [7] Multi-Dialect Arabic POS Tagging: A CRF Approach
    Darwish, Kareem
    Mubarak, Hamdy
    Eldesouki, Mohamed
    Abdelali, Ahmed
    Samih, Younes
    Alharbi, Randah
    Attia, Mohammed
    Magdy, Walid
    Kallmeyer, Laura
    PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018, : 93 - 98
  • [8] A Multi-Dialect, Multi-Genre Corpus of Informal Written Arabic
    Cotterell, Ryan
    Callison-Burch, Chris
    LREC 2014 - NINTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2014,
  • [9] MD3: The Multi-Dialect Dataset of Dialogues
    Eisenstein, Jacob
    Prabhakaran, Vinodkumar
    Rivera, Clara
    Demszky, Dorottya
    Sharma, Devyani
    INTERSPEECH 2023, 2023, : 4059 - 4063
  • [10] An open speech resource for Tibetan multi-dialect and multitask recognition
    Zhao, Yue
    Xu, Xiaona
    Yue, Jianjian
    Song, Wei
    Li, Xiali
    Wu, Licheng
    Ji, Qiang
    INTERNATIONAL JOURNAL OF COMPUTATIONAL SCIENCE AND ENGINEERING, 2020, 22 (2-3) : 297 - 304