Research on small files classification based on improved KNN algorithm and pretreatment strategy

Cited by: 0
Authors
Shi, Hengliang [1, 2]
Bai, Xiaolei [1]
Zhen, Lintao [1]
Affiliations
[1] Information Engineering College, Henan University of Science and Technology, No. 263, Kaiyuan Road, Luoyang, China
[2] Noah (Suzhou) IT Solution Co., Ltd, Suzhou, China
Source
ICIC Express Letters | 2015 / Vol. 9 / No. 02
Keywords
Data handling; Learning algorithms; Information retrieval systems
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
This article innovatively combines the MapReduce model with mass data processing and proposes a classification and pretreatment strategy for small files in mass data. The described method makes it easier to exploit the parallel computing characteristics of the MapReduce architecture and saves a large amount of processing time. Experiments show the classification method to be efficient and reliable. The strategy can be widely applied in research on, and applications of, document classification and clustering. © 2015, ICIC International.
Pages: 603 - 608
Related papers
50 in total
  • [31] An improved sample mean KNN algorithm based on LDA
    Xue, Hongye
    Wang, Peiwen
    2019 11TH INTERNATIONAL CONFERENCE ON INTELLIGENT HUMAN-MACHINE SYSTEMS AND CYBERNETICS (IHMSC 2019), VOL 1, 2019, : 266 - 270
  • [32] Medical Health Big Data Classification Based on KNN Classification Algorithm
    Xing, Wenchao
    Bei, Yilin
    IEEE ACCESS, 2020, 8 : 28808 - 28819
  • [33] Rerouting Strategy Research Based on Improved Ant Colony Algorithm
    Wang, Lili
    Yang, Huidong
    PROCEEDINGS OF THE 2013 IEEE 8TH CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS (ICIEA), 2013, : 766 - 770
  • [34] Text Classification Research Based on Improved SoftMax Regression Algorithm
    She, Xiangyang
    Zhu, Yinglong
    2018 11TH INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN (ISCID), VOL 2, 2018, : 273 - 276
  • [35] Research on manufacturing text classification based on improved genetic algorithm
    Zhou Kaijun
    Tong Yifei
    BRAZILIAN ARCHIVES OF BIOLOGY AND TECHNOLOGY, 2016, 59
  • [36] An Efficient Target-to-Area Classification Strategy with a PIP-Based KNN Algorithm for Epidemic Management
    Chen, Jong-Shin
    Hung, Ruo-Wei
    Yang, Cheng-Ying
    MATHEMATICS, 2025, 13 (04)
  • [37] Application Research of KNN Algorithm Based on Clustering in Big Data Talent Demand Information Classification
    Xiao, Qingtao
    Zhong, Xin
    Zhong, Chenghua
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2020, 34 (06)
  • [38] Research on Text Classification Based on SVM-KNN
    Lin, Yun
    Wang, Jie
    2014 5TH IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE (ICSESS), 2014, : 842 - 844
  • [39] Research on the Perforating Algorithm Based on STL Files
    Han Yuchuan
    Zhu Xianfeng
    Bai Yunrui
    Wu Zhiwen
    2ND INTERNATIONAL CONFERENCE ON MACHINE VISION AND INFORMATION TECHNOLOGY (CMVIT 2018), 2018, 1004
  • [40] A Strategy of Small Files Caching Based on Access Temperature
    Tan, Songfu
    Zhu, Ligu
    Feng, Dongyu
    PROCEEDINGS OF 2018 IEEE 9TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE (ICSESS), 2018, : 482 - 486