Research on small files classification based on improved KNN algorithm and pretreatment strategy

Authors
Shi, Hengliang [1, 2]
Bai, Xiaolei [1]
Zhen, Lintao [1]
Affiliations
[1] Information Engineering College, Henan University of Science and Technology, No. 263, Kaiyuan Road, Luoyang, China
[2] Noah (Suzhou) IT Solution Co., Ltd, Suzhou, China
Source
ICIC Express Letters, 2015, Vol. 9, No. 2
Keywords
Data handling; Learning algorithms; Information retrieval systems
DOI
Not available
Abstract
This article combines the MapReduce model with mass data processing and proposes a classification and pretreatment strategy for small files in mass data. The described method makes it easier to exploit the parallel computing characteristics of the MapReduce architecture and saves a large amount of processing time. Experiments show that the classification method is efficient and reliable. The strategy can be widely applied in document classification and clustering research and applications. © 2015, ICIC International.
Pages: 603-608