Minimal feature set in language identification and finding suitable classification method with it

Cited by: 4
Authors
Takci, Hidayet [1]
Ekinci, Ekin [1]
Affiliations
[1] Gebze Inst Technol, Fac Engn, Dept Comp Engn, TR-41400 Gebze, Turkey
Keywords
language identification; feature based methods; letter features; weighting factor; classification algorithms
DOI
10.1016/j.protcy.2012.02.099
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Language identification (LI) is a phase of natural language processing. Although LI has been studied before, there is still much work to do to improve performance. The purpose of this study is to present a low-dimensional feature set built from letters and diacritics, together with a suitable classification algorithm (C-SVC, MLP or LDA) for it, in order to achieve high performance. In addition, a weight factor has been integrated into the language identification system to increase performance. Experiments were carried out on the ECI corpus. The weight factor increased the classification accuracies. For our feature set, C-SVC is both the most accurate and the fastest method. (C) 2011 Published by Elsevier Ltd.
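The idea of a small letter-and-diacritic feature set with a weight factor can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not give the actual feature alphabet, how the weight factor is applied, or any implementation details. The `FEATURES` list, the choice to scale diacritic letters by `weight`, and the function names are hypothetical, and a simple nearest-centroid classifier with cosine similarity stands in for C-SVC so that the example needs only the standard library.

```python
import math
import unicodedata
from collections import Counter

# Hypothetical feature alphabet (not taken from the paper):
# a few plain vowels plus some diacritic or special letters.
FEATURES = list("aeiouy") + list("çğışöüäßéèàñ")


def letter_features(text, weight=1.0):
    """Relative frequency of each feature letter in `text`.

    Assumption: the weight factor scales the non-ASCII (diacritic) letters.
    """
    norm = unicodedata.normalize("NFC", text.lower())
    counts = Counter(norm)
    total = sum(1 for ch in norm if ch.isalpha()) or 1
    vec = []
    for ch in FEATURES:
        freq = counts[ch] / total
        if not ch.isascii():
            freq *= weight
        vec.append(freq)
    return vec


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def train(samples, weight=1.0):
    """Average the feature vectors of each language's texts into one centroid."""
    centroids = {}
    for lang, texts in samples.items():
        vecs = [letter_features(t, weight) for t in texts]
        centroids[lang] = [sum(col) / len(vecs) for col in zip(*vecs)]
    return centroids


def identify(text, centroids, weight=1.0):
    """Return the language whose centroid is most similar to the text's features."""
    vec = letter_features(text, weight)
    return max(centroids, key=lambda lang: cosine(vec, centroids[lang]))
```

With centroids trained on a few texts per language, `identify(text, centroids, weight)` returns the closest language label. The same feature vectors could instead be passed to an SVM, MLP, or LDA implementation to compare classifiers as the abstract describes.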
Pages: 444-448
Number of pages: 5