Hypergraph based clustering for document similarity using FP growth algorithm

Cited by: 0
Authors
Ramakrishnan, Nayana [1]
Nair, Meenakshi J. [1]
Jayaprakash, Deepak [1]
Ananthakrishnan, H. [1]
Rani, Siji S. [1]
Affiliation
[1] Amrita Vishwa Vidyapeetham, Dept Comp Sci & Engn, Amritapuri, India
Keywords
Hypergraph; Clustering; FP-Growth; Similarity
DOI
10.1109/iccs45141.2019.9065630
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Modelling multiple documents for different applications is a major field of research due to the tremendous growth in Web data. Finding document similarity requires clustering, which determines how unlabelled data should be grouped. Graph models can capture the structural information in texts. They organize high-dimensional data so that the user can effortlessly access the desired information. In this paper, we use a hypergraph, built with the help of association rule mining, to model a collection of text documents, and we find the similarity between them using a hypergraph partitioning algorithm. We use the FP-Growth algorithm, a recursive elimination scheme, to find the association relationships. We then apply a spectral clustering algorithm, which uses eigenvalues and eigenvectors computed from the matrices, to find similar documents. Experiments show that this algorithm gives better clusters than others, which commonly take higher eigenvectors.
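The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions: documents are treated as vertices, each frequent term set becomes a hyperedge joining the documents that contain all of its terms, hyperedges have unit weight, and the split uses the sign of the second-smallest eigenvector of a normalized hypergraph Laplacian. For brevity, frequent term sets are found by brute-force enumeration rather than FP-Growth. The function names and the toy documents are hypothetical.

```python
from itertools import combinations

import numpy as np


def frequent_term_sets(docs, min_support=2, max_size=2):
    """Brute-force frequent term sets (a simple stand-in for FP-Growth).

    Returns a dict mapping each frequent term tuple to the indices of the
    documents that contain every term in it.
    """
    vocab = sorted(set().union(*docs))
    result = {}
    for size in range(1, max_size + 1):
        for terms in combinations(vocab, size):
            members = [i for i, d in enumerate(docs) if set(terms) <= d]
            if len(members) >= min_support:
                result[terms] = members
    return result


def spectral_bipartition(docs, min_support=2, max_size=2):
    """Split documents into two groups via a hypergraph Laplacian."""
    itemsets = frequent_term_sets(docs, min_support, max_size)
    n, m = len(docs), len(itemsets)

    # Incidence matrix: H[i, j] = 1 if document i belongs to hyperedge j.
    H = np.zeros((n, m))
    for j, members in enumerate(itemsets.values()):
        H[members, j] = 1.0

    w = np.ones(m)          # assumed unit hyperedge weights
    dv = H @ w              # vertex degrees
    de = H.sum(axis=0)      # hyperedge sizes
    if np.any(dv == 0):
        raise ValueError("a document is not covered by any frequent term set")

    # Normalized Laplacian: L = I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}
    dv_inv_sqrt = 1.0 / np.sqrt(dv)
    theta = (dv_inv_sqrt[:, None] * H) @ np.diag(w / de) @ (H.T * dv_inv_sqrt[None, :])
    L = np.eye(n) - theta

    # eigh returns eigenvalues in ascending order; column 1 is the
    # eigenvector of the second-smallest eigenvalue.
    _, vecs = np.linalg.eigh(L)
    return (vecs[:, 1] >= 0).astype(int)


# Toy corpus: two topics linked only by the shared term "data".
docs = [
    {"graph", "cluster", "edge"},
    {"graph", "cluster", "node"},
    {"graph", "cluster", "data"},
    {"wine", "grape", "data"},
    {"wine", "grape", "taste"},
    {"wine", "grape", "oak"},
]
labels = spectral_bipartition(docs)
print(labels)
```

On this toy corpus the first three documents receive one label and the last three the other; which group is labelled 0 or 1 depends on the sign of the eigenvector and is arbitrary. The sketch assumes the hypergraph is connected. If it is not, the zero eigenvalue repeats and the sign split is no longer meaningful.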
Pages: 332-336
Number of pages: 5
Related papers
50 in total
  • [31] Frequent Term Based Text Document Clustering Using Similarity Measures: A Novel Approach
    Gupta, Vijay Kumar
    Dutta, Maitreyee
    Kumar, Manoj
    2017 FOURTH INTERNATIONAL CONFERENCE ON IMAGE INFORMATION PROCESSING (ICIIP), 2017, : 164 - 169
  • [32] Sentence Clustering in Text Document Using Fuzzy Clustering Algorithm
    Sruthi, S.
    Shalini, L.
    2014 INTERNATIONAL CONFERENCE ON CONTROL, INSTRUMENTATION, COMMUNICATION AND COMPUTATIONAL TECHNOLOGIES (ICCICCT), 2014, : 1473 - 1476
  • [33] Normalization of Semantic Based Web Search Engines Using Page Rank Algorithm and Hypergraph Based Clustering
    Archana, G.
    Muruganantham, B.
    Jayapradha, J.
    COMPUTER NETWORKS AND INFORMATION TECHNOLOGIES, 2011, 142 : 464 - 467
  • [34] HyperGraph Convolution Based Attributed HyperGraph Clustering
    Kamhoua, Barakeel Fanseu
    Zhang, Lin
    Ma, Kaili
    Cheng, James
    Li, Bo
    Han, Bo
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, CIKM 2021, 2021, : 453 - 463
  • [35] DOCUMENT CLUSTERING USING ANT COLONY ALGORITHM
    Nagarajan, E.
    Saritha, Keshetty
    MadhuGayathri, G.
    PROCEEDINGS OF THE 2017 INTERNATIONAL CONFERENCE ON BIG DATA ANALYTICS AND COMPUTATIONAL INTELLIGENCE (ICBDAC), 2017, : 459 - 463
  • [36] Document Clustering using Concept Space and Cosine Similarity Measurement
    Muflikhah, Lailil
    Baharudin, Baharum
    PROCEEDINGS OF THE 2009 INTERNATIONAL CONFERENCE ON COMPUTER TECHNOLOGY AND DEVELOPMENT, VOL 1, 2009, : 58 - 62
  • [37] A Document Clustering Method based on Hierarchical Algorithm with Model Clustering
    Sun, Haojun
    Liu, Zhihui
    Kong, Lingjun
    2008 22ND INTERNATIONAL WORKSHOPS ON ADVANCED INFORMATION NETWORKING AND APPLICATIONS, VOLS 1-3, 2008, : 1229 - +
  • [38] An Improved Algorithm of Similarity Based on Clustering in XML
    Wang, Puqing
    PROCEEDINGS OF THE 2016 2ND WORKSHOP ON ADVANCED RESEARCH AND TECHNOLOGY IN INDUSTRY APPLICATIONS, 2016, 81 : 837 - 841
  • [39] A flocking based algorithm for document clustering analysis
    Cui, Xiaohui
    Gao, Jinzhu
    Potok, Thomas E.
    JOURNAL OF SYSTEMS ARCHITECTURE, 2006, 52 (8-9) : 505 - 515
  • [40] A clustering with slope algorithm based on item similarity
    Wu Huiyun
    Wang Yuping
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2016, 31 (04) : 2177 - 2185