BLARK for multi-dialect languages: towards the Kurdish BLARK

Cited by: 3
Author
Hassani, Hossein [1, 2]
Affiliations
[1] Univ Kurdistan Hewler, Dept Comp Sci & Engn, Erbil, Kurdistan Regio, Iraq
[2] Univ Sch Sci & Technol, Dept Comp Sci, Sarajevo, Bosnia & Herceg
Keywords
Kurdish BLARK; Language tools; Computational linguistics; Natural language processing;
DOI
10.1007/s10579-017-9400-0
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
In this paper we introduce the Kurdish BLARK (Basic Language Resource Kit). The original BLARK did not consider multi-dialect characteristics and generally targeted reasonably well-resourced languages. To address these two features, we extended BLARK and applied the proposed extension to Kurdish. The Kurdish language not only faces a paucity of resources, but also embraces several dialects within a complex linguistic context. This paper presents the Kurdish BLARK and shows that, from natural language processing and computational linguistics perspectives, the revised BLARK provides a more applicable view of languages with characteristics similar to Kurdish.
Pages: 625-644
Page count: 20
Related papers
49 in total
  • [21] Adversarial Multitask Learning for Joint Multi-Feature and Multi-Dialect Morphological Modeling
    Zalmout, Nasser
    Habash, Nizar
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 1775 - 1786
  • [22] Estimating the Level of Dialectness Predicts Interannotator Agreement in Multi-dialect Arabic Datasets
    Keleg, Amr
    Magdy, Walid
    Goldwater, Sharon
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 2: SHORT PAPERS, 2024, : 778 - 789
  • [23] Arabic Sentiment Analysis for Multi-dialect Text using Machine Learning Techniques
    Hussein, Aya H.
    Moawad, Ibrahim F.
    Badry, Rasha M.
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2021, 12 (12) : 693 - 700
  • [24] Estimating the Level of Dialectness Predicts Interannotator Agreement in Multi-dialect Arabic Datasets
    Keleg, Amr
    Magdy, Walid
    Goldwater, Sharon
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 2: SHORT PAPERS, 2024, : 766 - 777
  • [25] Dialect-aware Semi-supervised Learning for End-to-End Multi-dialect Speech Recognition
    Shiota, Sayaka
    Imaizumi, Ryo
    Masumura, Ryo
    Kiya, Hitoshi
    PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 240 - 244
  • [26] THE MGB-2 CHALLENGE: ARABIC MULTI-DIALECT BROADCAST MEDIA RECOGNITION
    Ali, Ahmed
    Bell, Peter
    Glass, James
    Messaoui, Yacine
    Mubarak, Hamdy
    Renals, Steve
    Zhang, Yifan
    2016 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2016), 2016, : 279 - 284
  • [27] SANA: A Large Scale Multi-Genre, Multi-Dialect Lexicon for Arabic Subjectivity and Sentiment Analysis
    Abdul-Mageed, Muhammad
    Diab, Mona
    LREC 2014 - NINTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2014, : 1162 - 1169
  • [28] Multi-dialect acoustic modeling using phone mapping and online i-vectors
    Arsikere, Harish
    Sapru, Ashtosh
    Garimella, Sri
    INTERSPEECH 2019, 2019, : 2125 - 2129
  • [29] Multi-Task Transformer with Adaptive Cross-Entropy Loss for Multi-Dialect Speech Recognition
    Dan, Zhengjia
    Zhao, Yue
    Bi, Xiaojun
    Wu, Licheng
    Ji, Qiang
    ENTROPY, 2022, 24 (10)
  • [30] Exploring task-diverse meta-learning on Tibetan multi-dialect speech recognition
    Liu, Yigang
    Zhao, Yue
    Xu, Xiaona
    Xu, Liang
    Zhang, Xubei
    Ji, Qiang
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2024, 2024 (01)