Hypergraph based clustering for document similarity using FP growth algorithm

Authors
Ramakrishnan, Nayana [1]
Nair, Meenakshi J. [1]
Jayaprakash, Deepak [1]
Ananthakrishnan, H. [1]
Rani, Siji S. [1]
Affiliations
[1] Amrita Vishwa Vidyapeetham, Dept Comp Sci & Engn, Amritapuri, India
Keywords
Hypergraph; Clustering; FP-Growth; Similarity
DOI
10.1109/iccs45141.2019.9065630
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Modelling multiple documents for different applications is a major field of research owing to the tremendous growth in Web data. Finding document similarity requires clustering to determine the grouping of unlabelled data. Graph models can capture the structural information in texts, and they organize high-dimensional data so that the user can easily access the desired information. In this paper, we use a hypergraph, built with the help of association rule mining, to model a collection of text documents, and we find the similarity between them using a hypergraph partitioning algorithm. We use the FP-Growth algorithm, a recursive elimination scheme, to find the association relationships. We then use a spectral clustering algorithm, which uses the eigenvalues and eigenvectors computed from the matrices, to find similar documents. Experiments show that this algorithm gave better clusters than others, which commonly take higher eigenvectors.
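
The pipeline summarized in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how hyperedges are built, so the sketch assumes that each frequent term set becomes one hyperedge joining the documents that contain all of its terms. A brute-force miner stands in for FP-Growth (for a given minimum support it yields the same frequent itemsets), and the spectral partitioning step is left out. All function names and the toy documents are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support.

    Brute-force stand-in for FP-Growth (an assumption of this sketch).
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            support = sum(1 for t in transactions if itemset <= t)
            if support >= min_support:
                result[itemset] = support
                found = True
        if not found:
            # No frequent itemset of this size, so no larger one exists.
            break
    return result


def build_hyperedges(transactions, itemsets):
    """Map each frequent itemset to the set of document indices containing it."""
    return [
        frozenset(i for i, t in enumerate(transactions) if itemset <= t)
        for itemset in itemsets
    ]


def shared_hyperedge_counts(hyperedges, n_docs):
    """Count, for every pair of documents, the hyperedges that contain both."""
    sim = [[0] * n_docs for _ in range(n_docs)]
    for edge in hyperedges:
        for i in edge:
            for j in edge:
                sim[i][j] += 1
    return sim


# Toy corpus: each document is reduced to its set of terms.
docs = [
    {"graph", "cluster", "spectral"},
    {"graph", "cluster", "partition"},
    {"market", "price", "stock"},
    {"market", "price", "trade"},
]

itemsets = frequent_itemsets(docs, min_support=2)
hyperedges = build_hyperedges(docs, itemsets)
sim = shared_hyperedge_counts(hyperedges, len(docs))
print(sim[0][1], sim[0][2])
```

On this toy corpus, documents 0 and 1 share three hyperedges and documents 0 and 2 share none. A spectral method of the kind the abstract mentions would work on a Laplacian derived from such a hypergraph rather than on these raw counts.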
Pages: 332-336
Number of pages: 5