PROBABILISTIC HEURISTICS FOR HIERARCHICAL WEB DATA CLUSTERING

Cited by: 2
Authors
Chehreghani, Morteza Haghir [1]
Chehreghani, Mostafa Haghir [1]
Abolhassani, Hassan [1]
Affiliation
[1] Sharif Univ Technol, Fac Comp Engn, Web Intelligence Lab, Dept Comp Engn, Tehran, Iran
Keywords
data mining; Web clustering; Bayesian networks; hierarchical clustering; representative point
DOI
10.1111/j.1467-8640.2012.00414.x
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering Web data is an important technique for extracting knowledge from the Web. In this paper, a novel method is presented to facilitate the clustering. The method determines the appropriate number of clusters and provides suitable representatives for each cluster by inference from a Bayesian network. Furthermore, by means of the Bayesian network, the contents of the Web pages are converted into vectors of lower dimensions. The method is also extended to hierarchical clustering, and a heuristic is developed to select a good hierarchy. The experimental results show that the clusters produced are of high quality.
Pages: 209-233
Page count: 25