A comparative study on supervised and unsupervised learning approaches for multilingual text categorization

Cited by: 0
Authors
Lee, Chung-Hong [1]
Yang, Hsin-Chang [2]
Chen, Ting-Chung [1]
Ma, Sheng-Min [1]
Affiliations
[1] Natl Kaohsiung Univ Appl Sci, Dept Elect Engn, Kaohsiung, Taiwan
[2] Chang Jung Univ, Dept Informat Management, Tainan, Taiwan
Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Users of internationally distributed information networks need tools and methods that enable them to discover, retrieve, and categorize relevant information, in whatever language and form it may have been stored. This need has drawn together diverse research communities working on multilingual text categorization. In this work we compare and evaluate the performance of leading supervised and unsupervised approaches to multilingual text categorization, using various performance measures and standard document corpora. For simplicity, we selected Support Vector Machines (SVM) and Latent Semantic Indexing (LSI) as representatives of supervised and unsupervised methods, respectively. The preliminary results show that our platform models, which include both supervised and unsupervised learning methods, have potential for multilingual text categorization.
Pages: 511 / +
Page count: 2