Hypergraph based clustering for document similarity using FP growth algorithm

Cited by: 0
Authors
Ramakrishnan, Nayana [1]
Nair, Meenakshi J. [1]
Jayaprakash, Deepak [1]
Ananthakrishnan, H. [1]
Rani, Siji S. [1]
Affiliations
[1] Amrita Vishwa Vidyapeetham, Dept Comp Sci & Engn, Amritapuri, India
Keywords
Hypergraph; Clustering; FP-Growth; Similarity
DOI
10.1109/iccs45141.2019.9065630
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Modelling collections of documents for different applications is a major research area because of the tremendous growth in Web data. Finding document similarity requires clustering, which determines the grouping of unlabelled data. Graph models can capture the structural information in texts, and they organize high-dimensional data so that users can easily access the information they want. In this paper, we model a collection of text documents as a hypergraph built with association rule mining and find the similarity between documents using a hypergraph partitioning algorithm. We use the FP-Growth algorithm, a recursive elimination scheme, to find the association relationships. We then apply a spectral clustering algorithm, which uses the eigenvalues and eigenvectors computed from the associated matrices, to find similar documents. Experiments show that this algorithm produced better clusters than other approaches, which commonly use the higher eigenvectors.
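The pipeline outlined in the abstract (frequent term sets, then a hypergraph, then a spectral split) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the frequent term sets are found by brute-force enumeration as a stand-in for FP-Growth (which would return the same sets more efficiently), documents are taken as vertices with one unit-weight hyperedge per frequent term set, a commonly used normalized hypergraph operator is applied, and its second eigenvector is approximated by power iteration. The function names, the toy corpus, and the parameters `min_support` and `max_len` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def frequent_term_sets(docs, min_support=2, max_len=2):
    """Brute-force stand-in for FP-Growth (assumption, not the paper's code).

    Returns {term tuple: frozenset of ids of the documents containing all terms}.
    """
    vocab = sorted(set().union(*docs))
    found = {}
    for size in range(1, max_len + 1):
        for terms in combinations(vocab, size):
            cover = frozenset(i for i, d in enumerate(docs) if d.issuperset(terms))
            if len(cover) >= min_support:
                found[terms] = cover
    return found


def spectral_bipartition(n, hyperedges, iters=300):
    """Split n vertices in two using the sign of an approximate second eigenvector.

    Assumes unit hyperedge weights and that every vertex lies in some hyperedge.
    """
    # Vertex degree = number of hyperedges containing the vertex.
    deg = [0.0] * n
    for edge in hyperedges:
        for v in edge:
            deg[v] += 1.0
    # theta = Dv^(-1/2) H De^(-1) H^T Dv^(-1/2), built one hyperedge at a time.
    theta = [[0.0] * n for _ in range(n)]
    for edge in hyperedges:
        for i in edge:
            for j in edge:
                theta[i][j] += 1.0 / (len(edge) * sqrt(deg[i] * deg[j]))
    # The leading eigenvector of theta is proportional to sqrt(degree).
    top = [sqrt(d) for d in deg]
    norm = sqrt(sum(t * t for t in top))
    top = [t / norm for t in top]
    # Power iteration, projecting out the leading eigenvector at every step.
    x = [float(i + 1) for i in range(n)]
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(x, top))
        x = [a - dot * b for a, b in zip(x, top)]
        x = [sum(row[j] * x[j] for j in range(n)) for row in theta]
        norm = sqrt(sum(a * a for a in x))
        x = [a / norm for a in x]
    return [1 if a > 0 else 0 for a in x]


# Toy corpus (hypothetical): each document is a set of terms.
docs = [
    {"graph", "cluster", "spectral"},
    {"graph", "cluster", "partition"},
    {"graph", "cluster", "spectral", "partition"},
    {"stock", "price", "trend"},
    {"stock", "price", "market"},
    {"stock", "price", "trend", "graph"},
]
term_sets = frequent_term_sets(docs, min_support=2, max_len=2)
labels = spectral_bipartition(len(docs), list(term_sets.values()))
print(labels)
```

On this toy corpus the sign split should place documents 0 to 2 in one group and documents 3 to 5 in the other; which group receives label 0 or 1 is arbitrary, because the sign of an eigenvector is not determined. For a real corpus, a dedicated FP-Growth implementation and a proper eigen-solver would replace the enumeration and the power iteration used here.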
Pages: 332-336
Number of pages: 5
Related papers
50 in total
  • [21] CLUSTERING ALGORITHM BASED ON OBJECT SIMILARITY
    Nishanov, A. Kh.
    Akbarova, M. Kh.
    Tursunov, A. T.
    Ollamberganov, F. F.
    Rashidova, D. E.
    JOURNAL OF MATHEMATICS MECHANICS AND COMPUTER SCIENCE, 2024, 123 (03): : 108 - 120
  • [22] Incremental document clustering using cluster similarity histograms
    Hammouda, KM
    Kamel, MS
    IEEE/WIC INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE, PROCEEDINGS, 2003, : 597 - 601
  • [23] Membrane Clustering of Coronavirus Variants Using Document Similarity
    Lehotay-Kery, Peter
    Kiss, Attila
    GENES, 2022, 13 (11)
  • [24] Adaptive document clustering based on query-based similarity
    Na, Seung-Hoon
    Kang, In-Su
    Lee, Jong-Hyeok
    INFORMATION PROCESSING & MANAGEMENT, 2007, 43 (04) : 887 - 901
  • [25] Hierarchical Document Clustering based on Cosine Similarity measure
    Popat, Shraddha K.
    Deshmukh, Pramod B.
    Metre, Vishakha A.
    2017 1ST INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND INFORMATION MANAGEMENT (ICISIM), 2017, : 153 - 159
  • [26] WordNet and Semantic Similarity based Approach for Document Clustering
    Desai, Sneha S.
    Laxminarayana, J. A.
    2016 INTERNATIONAL CONFERENCE ON COMPUTATION SYSTEM AND INFORMATION TECHNOLOGY FOR SUSTAINABLE SOLUTIONS (CSITSS), 2016, : 312 - 317
  • [27] Efficient phrase-based document similarity for clustering
    Chim, Hung
    Deng, Xiaotie
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2008, 20 (09) : 1217 - 1229
  • [28] Hypergraph-Clustering Method Based on an Improved Apriori Algorithm
    Chen, Rumeng
    Hu, Feng
    Wang, Feng
    Bai, Libing
    APPLIED SCIENCES-BASEL, 2023, 13 (19):
  • [29] Stock Trends Prediction based on Hypergraph Modeling Clustering Algorithm
    Luo, Yongen
    Hu, Jicheng
    Wei, Xiaofeng
    Fang, Dongjian
    Shao, Heng
    PROCEEDINGS OF 2014 IEEE INTERNATIONAL CONFERENCE ON PROGRESS IN INFORMATICS AND COMPUTING (PIC), 2014, : 27 - 31
  • [30] A genetic clustering algorithm using a message-based similarity measure
    Chang, Dongxia
    Zhao, Yao
    Zheng, Changwen
    Zhang, Xianda
    EXPERT SYSTEMS WITH APPLICATIONS, 2012, 39 (02) : 2194 - 2202