Enhanced Distributed Document Clustering Algorithm Using Different Similarity Measures

Citations: 0
Authors
Narayanan, Neethi [1 ]
Judith, J. E. [1 ]
Jayakumari, J. [1 ]
Affiliation
[1] Noorul Islam Ctr Higher Educ Kumaracoil, Kumaracoil, Tamil Nadu, India
Keywords
Distributed document clustering; Similarity measures; Cosine similarity; Jaccard coefficient; Pearson coefficient
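DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification codes
0808; 0809
Abstract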
Distributed environments such as the Internet, intranets, local area networks, and wireless networks contain many distributed data sources. In order to analyze and monitor these data sources, specialized data mining technologies for distributed applications are required. A variety of distributed document clustering algorithms exist for this purpose. This paper presents an Enhanced Distributed Algorithm (EDA) for document clustering and analyzes its performance using different similarity measures: cosine similarity, the Jaccard coefficient, and the Pearson coefficient. The tests were performed on standard document corpora such as 20NG (20 Newsgroups), Reuters, Web. The performance of the proposed EDA algorithm is also evaluated using different performance factors in order to determine its accuracy and clustering quality.
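The abstract names three similarity measures but does not give their formulas. As a rough illustration only, the sketch below uses the common textbook definitions over term-weight vectors. The function names are hypothetical, and the use of the extended (Tanimoto) form of the Jaccard coefficient for weighted vectors is an assumption, not something the record confirms about the paper's method.

```python
import math

# Illustrative sketch only: textbook definitions, not the paper's implementation.
# Each document is assumed to be a list of term weights of equal length.

def cosine_similarity(a, b):
    """Cosine of the angle between two term-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for empty documents
    return dot / (norm_a * norm_b)

def jaccard_coefficient(a, b):
    """Extended (Tanimoto) Jaccard coefficient for weighted vectors (assumed form)."""
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    if denom == 0:
        return 0.0  # both vectors are all zeros
    return dot / denom

def pearson_coefficient(a, b):
    """Pearson correlation between two term-weight vectors."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # correlation is undefined for constant vectors
    return cov / math.sqrt(var_a * var_b)

if __name__ == "__main__":
    doc1 = [1.0, 1.0, 0.0]
    doc2 = [1.0, 0.0, 1.0]
    print(cosine_similarity(doc1, doc2))
    print(jaccard_coefficient(doc1, doc2))
    print(pearson_coefficient(doc1, doc2))
```

All three functions return higher values for more similar documents, so any one of them could be plugged into the same clustering routine to compare outcomes, which is the kind of comparison the abstract describes.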
Pages: 545-550
Page count: 6