Enhanced Distributed Document Clustering Algorithm Using Different Similarity Measures

Cited by: 0

Authors:

Narayanan, Neethi [1]

Judith, J. E. [1]

Jayakumari, J. [1]

Affiliations:

[1] Noorul Islam Ctr Higher Educ Kumaracoil, Kumaracoil, Tamil Nadu, India

Source:

2013 IEEE CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGIES (ICT 2013) | 2013

Keywords:

Distributed document clustering; Similarity measures; Cosine similarity; Jaccard coefficient; Pearson coefficient

DOI:

Not available

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Discipline classification code:

0808; 0809

Abstract:

Many distributed environments, such as the Internet, intranets, local area networks, and wireless networks, contain multiple distributed data sources. In order to analyze and monitor these distributed data sources, specialized data mining technologies for distributed applications are required. A variety of distributed document clustering algorithms exist for this purpose. This paper presents an Enhanced Distributed Algorithm (EDA) for document clustering and analyzes its performance using different similarity measures: cosine similarity, the Jaccard coefficient, and the Pearson coefficient. The tests were performed on standard document corpora such as 20NG (20 Newsgroups), Reuters, and Web. The proposed EDA algorithm is also evaluated using different performance factors in order to determine its accuracy and clustering quality.
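
The three similarity measures named in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: this record does not give the exact formulations the authors used, so the code assumes documents are already represented as equal-length term-weight vectors and uses the extended (Tanimoto) form of the Jaccard coefficient, which is a common choice for real-valued document vectors. All function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two term-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for empty documents
    return dot / (norm_a * norm_b)


def jaccard_coefficient(a, b):
    """Extended Jaccard (Tanimoto) coefficient for real-valued vectors.

    Assumption: the paper may define Jaccard differently (for example on
    term sets); this is the usual generalization to weighted vectors.
    """
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    return dot / denom if denom != 0 else 0.0


def pearson_coefficient(a, b):
    """Pearson correlation between two term-weight vectors."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov / math.sqrt(var_a * var_b)


# Toy term-frequency vectors over a three-term vocabulary (made-up data).
doc1 = [1.0, 2.0, 0.0]
doc2 = [2.0, 4.0, 0.0]
print(cosine_similarity(doc1, doc2))
print(jaccard_coefficient(doc1, doc2))
print(pearson_coefficient(doc1, doc2))
```

For these two toy vectors, which point in the same direction but differ in length, cosine and Pearson treat the documents as identical while the extended Jaccard coefficient penalizes the difference in magnitude. That is one reason a clustering algorithm can behave differently under each measure.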


Pages: 545-550

Number of pages: 6

