Accelerated k-nearest neighbors algorithm based on principal component analysis for text categorization

Cited by: 0
Authors
Min DU
Xing-shu CHEN
Affiliation
[1] School of Computer Science, Sichuan University
DOI
Not available
Chinese Library Classification (CLC) number
TP391.1 [Text information processing]
Abstract
Text categorization is a significant technique to manage the surging text data on the Internet. The k-nearest neighbors (kNN) algorithm is an effective, but not efficient, classification model for text categorization. In this paper, we propose an effective strategy to accelerate the standard kNN, based on a simple principle: usually, near points in space are also near when they are projected into a direction, which means that distant points in the projection direction are also distant in the original space. Using the proposed strategy, most of the irrelevant points can be removed when searching for the k-nearest neighbors of a query point, which greatly decreases the computation cost. Experimental results show that the proposed strategy greatly improves the time performance of the standard kNN, with little degradation in accuracy. Specifically, it is superior in applications that have large and high-dimensional datasets.
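
The pruning principle stated in the abstract can be illustrated with a small sketch. The abstract gives no implementation details, so everything below is an assumption rather than the authors' method: the names `first_principal_direction` and `ProjectionIndex` are hypothetical, the first principal component is approximated by power iteration, and the search stops only when the projection gap provably exceeds the current k-th best distance (for a unit direction, the gap along the projection never exceeds the Euclidean distance). That makes this variant exact, whereas the abstract reports a small accuracy loss, which suggests the published strategy prunes more aggressively.

```python
import bisect
import heapq
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def first_principal_direction(points, iters=100, seed=0):
    """Approximate the first principal component (unit vector) by power iteration."""
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    centered = [[p[j] - mean[j] for j in range(d)] for p in points]
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(dot(v, v))
    v = [x / norm for x in v]
    for _ in range(iters):
        # w = X^T (X v), which is proportional to (covariance matrix) * v
        scores = [dot(row, v) for row in centered]
        w = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(dot(w, w))
        if norm == 0.0:
            break  # degenerate data: keep the current unit vector
        v = [x / norm for x in w]
    return v


class ProjectionIndex:
    """Hypothetical helper: points sorted by their projection on one direction."""

    def __init__(self, points):
        self.direction = first_principal_direction(points)
        order = sorted(range(len(points)),
                       key=lambda i: dot(points[i], self.direction))
        self.ids = order                              # original indices
        self.points = [points[i] for i in order]      # points in projection order
        self.proj = [dot(p, self.direction) for p in self.points]

    def knn(self, query, k):
        """Return ([(distance, original_index), ...], number_of_points_examined)."""
        if k <= 0 or not self.points:
            return [], 0
        q = dot(query, self.direction)
        right = bisect.bisect_left(self.proj, q)
        left = right - 1
        heap = []  # max-heap of the k best so far, stored as (-distance, index)
        examined = 0
        while left >= 0 or right < len(self.proj):
            gap_l = q - self.proj[left] if left >= 0 else math.inf
            gap_r = self.proj[right] - q if right < len(self.proj) else math.inf
            # Visit the unvisited point whose projection is closest to the query's.
            if gap_l <= gap_r:
                gap, i = gap_l, left
                left -= 1
            else:
                gap, i = gap_r, right
                right += 1
            # Every remaining point is at least `gap` away in the original space.
            if len(heap) == k and gap > -heap[0][0]:
                break
            dist = math.dist(query, self.points[i])
            examined += 1
            if len(heap) < k:
                heapq.heappush(heap, (-dist, self.ids[i]))
            elif dist < -heap[0][0]:
                heapq.heapreplace(heap, (-dist, self.ids[i]))
        return sorted((-nd, i) for nd, i in heap), examined
```

Calling `ProjectionIndex(points).knn(query, k)` returns the neighbors together with the number of points whose full distance was actually computed, so the saving relative to a brute-force scan can be inspected directly. For text categorization the points would be document vectors (for example, TF-IDF), which is again an assumption not stated in the abstract.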
Pages: 407-416
Number of pages: 10
Related papers
50 in total
  • [1] Accelerated k-nearest neighbors algorithm based on principal component analysis for text categorization
    Min DU
    Xing-shu CHEN
    Frontiers of Information Technology & Electronic Engineering, 2013, (06) : 407 - 416
  • [2] Accelerated k-nearest neighbors algorithm based on principal component analysis for text categorization
    Min Du
    Xing-shu Chen
    Journal of Zhejiang University SCIENCE C, 2013, 14 : 407 - 416
  • [3] Accelerated k-nearest neighbors algorithm based on principal component analysis for text categorization
    Du, Min
    Chen, Xing-shu
    JOURNAL OF ZHEJIANG UNIVERSITY-SCIENCE C-COMPUTERS & ELECTRONICS, 2013, 14 (06): : 407 - 416
  • [4] Study on density peaks clustering based on k-nearest neighbors and principal component analysis
    Du, Mingjing
    Ding, Shifei
    Jia, Hongjie
    KNOWLEDGE-BASED SYSTEMS, 2016, 99 : 135 - 145
  • [5] K-Nearest Neighbor Algorithm Optimization in Text Categorization
    Chen, Shufeng
    2017 3RD INTERNATIONAL CONFERENCE ON ENVIRONMENTAL SCIENCE AND MATERIAL APPLICATION (ESMA2017), VOLS 1-4, 2018, 108
  • [6] Fault recognition based on principal component analysis and k-nearest neighbor algorithm
    Zou G.
    Ren K.
    Ji Y.
    Ding J.
    Zhang S.
    Meitiandizhi Yu Kantan/Coal Geology and Exploration, 2021, 49 (04): : 15 - 23
  • [7] Automatic text categorization based on K-nearest neighbor
    Sun, J.
    Wang, W.
    Zhong, Y.-X.
    Beijing Youdian Xueyuan Xuebao/Journal of Beijing University of Posts And Telecommunications, 2001, 24 (01): : 42 - 46
  • [8] K-nearest neighbors clustering algorithm
    Gauza, Dariusz
    Zukowska, Anna
    Nowak, Robert
    PHOTONICS APPLICATIONS IN ASTRONOMY, COMMUNICATIONS, INDUSTRY, AND HIGH-ENERGY PHYSICS EXPERIMENTS 2014, 2014, 9290
  • [9] Analyzing the Impact of Principal Component Analysis on k-Nearest Neighbors and Naive Bayes Classification Algorithms
    Macionczyk, Rafal
    Moryc, Michal
    Buchtyar, Patryk
    INFORMATION AND SOFTWARE TECHNOLOGIES, ICIST 2023, 2024, 1979 : 247 - 263
  • [10] Chameleon algorithm based on mutual k-nearest neighbors
    Yuru Zhang
    Shifei Ding
    Lijuan Wang
    Yanru Wang
    Ling Ding
    Applied Intelligence, 2021, 51 : 2031 - 2044