Minimal feature set in language identification and finding suitable classification method with it

Cited by: 4
Authors
Takci, Hidayet [1]
Ekinci, Ekin [1]
Affiliation
[1] Gebze Inst Technol, Fac Engn, Dept Comp Engn, TR-41400 Gebze, Turkey
Keywords
language identification; feature based methods; letter features; weighting factor; classification algorithms
DOI
10.1016/j.protcy.2012.02.099
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Language identification (LI) is a stage of natural language processing. Although LI has been studied before, there is still much work to do to improve performance. The purpose of this study is to present a low-dimensional feature set built from letters and diacritics, together with a suitable classification algorithm (C-SVC, MLP or LDA) for it, to achieve high performance. In addition, a weight factor has been integrated into the language identification system to increase performance. Experiments were carried out on the ECI corpus. The weight factor increased the classification accuracies. For our feature set, C-SVC is both the most accurate and the fastest method. (C) 2011 Published by Elsevier Ltd.
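The idea of a low-dimensional letter feature set can be illustrated with a minimal sketch. It is not the authors' system: the alphabet below is a hypothetical choice, a simple nearest-centroid rule stands in for the C-SVC, MLP and LDA classifiers named in the abstract, and the weight factor is left out because the abstract does not define it.

```python
from collections import Counter
import math

# Hypothetical alphabet (an assumption, not the paper's feature set):
# ASCII letters plus a few letters with diacritics.
ALPHABET = "abcdefghijklmnopqrstuvwxyzçğıöşüäßéèàñ"


def letter_features(text, alphabet=ALPHABET):
    """Return the relative frequency of each alphabet letter in `text`."""
    counts = Counter(ch for ch in text.lower() if ch in alphabet)
    total = sum(counts.values()) or 1  # avoid division by zero on empty input
    return [counts[ch] / total for ch in alphabet]


def train_centroids(samples):
    """Average the feature vectors of (text, label) pairs per label."""
    sums = {}
    n = Counter()
    for text, label in samples:
        vec = letter_features(text)
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        n[label] += 1
    return {label: [v / n[label] for v in acc] for label, acc in sums.items()}


def identify(text, centroids):
    """Return the label whose centroid is closest in Euclidean distance."""
    vec = letter_features(text)
    return min(centroids, key=lambda label: math.dist(vec, centroids[label]))


if __name__ == "__main__":
    # Toy training data, invented for illustration only.
    training = [
        ("bugün hava çok güzel ve güneşli", "tr"),
        ("the weather is very nice and sunny today", "en"),
    ]
    centroids = train_centroids(training)
    print(identify("çocuklar okula gidiyor", centroids))
```

Each text is reduced to one number per letter, so the feature vector stays short (38 dimensions here) regardless of text length, which is what makes a low-dimensional letter-based approach cheap to train and evaluate.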
Pages: 444-448
Page count: 5