Statistical-based approach to non-segmented language processing

Authors
Sornlertlamvanich, Virach [1]
Charoenporn, Thatsanee
Tongchim, Shisanu
Kruengkrai, Canasai
Isahara, Hitoshi
Affiliations
[1] TCL, NICT Asia Res Ctr, Pathum Thani, Thailand
[2] NICT, Kyoto 6190289, Japan
Source
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS | 2007 / Vol. E90D / No. 10
Keywords
non-segmented language; unified language processing; statistical approach; probability language identification; word extraction; search engine
DOI
10.1093/ietisy/e90-d.10.1565
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Several approaches have been studied to cope with the exceptional features of non-segmented languages. When there is no explicit information about word boundaries, segmenting an input text is a formidable task in language processing. Not only a contemporary word list but also the usages of the words must be maintained to cover their use in current texts. The accuracy and efficiency of higher-level processing rely heavily on this word boundary identification task. In this paper, we introduce some statistical-based approaches to tackle the problems caused by ambiguity in word segmentation. The word boundary identification problem is then defined as one part of unified language processing as a whole. To demonstrate the capability of unified language processing, we study a selection of tasks: language identification, word extraction, and a dictionary-less search engine.
Pages: 1565-1573
Number of pages: 9