Statistical-based approach to non-segmented language processing

Authors
Sornlertlamvanich, Virach [1]
Charoenporn, Thatsanee
Tongchim, Shisanu
Kruengkrai, Canasai
Isahara, Hitoshi
Affiliations
[1] TCL, NICT Asia Res Ctr, Pathum Thani, Thailand
[2] NICT, Kyoto 6190289, Japan
Source
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS | 2007 / Vol. E90D / No. 10
Keywords
non-segmented language; unified language processing; statistical approach; probability language identification; word extraction; search engine
DOI
10.1093/ietisy/e90-d.10.1565
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Several approaches have been studied to cope with the exceptional features of non-segmented languages. When there is no explicit information about word boundaries, segmenting an input text is a formidable task in language processing. Not only a contemporary word list but also the usages of the words have to be maintained to cover their use in current texts. The accuracy and efficiency of higher-level processing rely heavily on this word boundary identification task. In this paper, we introduce some statistical-based approaches to tackle the problems caused by ambiguity in word segmentation. The word boundary identification problem is then defined as one part of performing unified language processing as a whole. To demonstrate the ability to conduct unified language processing, we selectively study the tasks of language identification, word extraction, and a dictionary-less search engine.
Pages: 1565-1573
Page count: 9