BLARK for multi-dialect languages: towards the Kurdish BLARK

Cited by: 3
Author
Hassani, Hossein [1, 2]
Affiliations
[1] Univ Kurdistan Hewler, Dept Comp Sci & Engn, Erbil, Kurdistan Regio, Iraq
[2] Univ Sch Sci & Technol, Dept Comp Sci, Sarajevo, Bosnia & Herceg
Keywords
Kurdish BLARK; Language tools; Computational linguistics; Natural language processing;
DOI
10.1007/s10579-017-9400-0
Chinese Library Classification number
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract
In this paper we introduce the Kurdish BLARK (Basic Language Resource Kit). The original BLARK did not consider multi-dialect characteristics and generally targeted reasonably well-resourced languages. To account for these two features, we extended BLARK and applied the proposed extension to Kurdish. The Kurdish language not only faces a paucity of resources but also embraces several dialects within a complex linguistic context. This paper presents the Kurdish BLARK and shows that, from natural language processing and computational linguistics perspectives, the revised BLARK provides a more applicable view of languages with characteristics similar to Kurdish.
Pages: 625-644
Page count: 20