A Novel Approach to Cluster Web Traversal Patterns Based on Edit Distance

Cited by: 0
Authors
Tan, Xiaoqiu [1]
Xu, Miaojun [1]
Affiliations
[1] Zhejiang Ocean Univ, Sch Math Phys & Informat, Zhoushan, Peoples R China
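Keywords
Edit distance; Clustering; Traversal pattern; Web topology
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract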
Edit distance is well suited as a similarity measure between user traversal patterns: it can be computed between symbol strings of different lengths, which accommodates variable-length traversal sequences, and it does so with low time and storage overhead. Web topology is used to compute the relationship between pages, which in turn serves as the cost of an edit operation. Finally, the two-threshold sequential clustering method (TTSCM) is used to cluster user traversal patterns; it avoids specifying the number of clusters in advance and reduces the dependence of the clustering results on the order in which traversal patterns are processed. Experimental results verify the effectiveness and flexibility of the proposed methods.
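The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the cost function or the clustering details, so everything below is an assumption made for illustration. Substitution cost is taken to be the shortest hop count between two pages in an undirected view of the link graph, normalized by the longest such path; insertions and deletions cost 1.0; the distance from a session to a cluster is the minimum distance to any member; and the clustering loop follows the general two-threshold sequential scheme (assign below `t1`, open a new cluster above `t2`, defer otherwise). The page names, sessions, thresholds, and all function names are hypothetical.

```python
from collections import deque

def hop_distances(links):
    """All-pairs shortest hop counts (BFS) over an undirected view of the links."""
    nodes = set(links) | {q for qs in links.values() for q in qs}
    adj = {n: set() for n in nodes}
    for p, qs in links.items():
        for q in qs:
            adj[p].add(q)
            adj[q].add(p)
    dist = {}
    for start in nodes:
        seen = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        dist[start] = seen
    return dist

def make_sub_cost(links):
    """Assumed cost model: hop distance scaled into [0, 1]; unreachable pairs cost 1."""
    dist = hop_distances(links)
    longest = max((d for row in dist.values() for d in row.values()), default=1) or 1
    def sub_cost(a, b):
        if a == b:
            return 0.0
        hops = dist.get(a, {}).get(b)
        return 1.0 if hops is None else hops / longest
    return sub_cost

def edit_distance(s, t, sub_cost, indel=1.0):
    """Weighted edit distance between two page sequences, kept to two DP rows."""
    prev = [j * indel for j in range(len(t) + 1)]
    for i, a in enumerate(s, 1):
        cur = [i * indel]
        for j, b in enumerate(t, 1):
            cur.append(min(prev[j] + indel,               # delete a
                           cur[j - 1] + indel,            # insert b
                           prev[j - 1] + sub_cost(a, b))) # substitute a -> b
        prev = cur
    return prev[-1]

def two_threshold_cluster(items, dist, t1, t2):
    """Two-threshold sequential clustering sketch; returns lists of item indices."""
    clusters = []
    pending = list(range(len(items)))
    while pending:
        deferred, changed = [], False
        for idx in pending:
            if clusters:
                d, best = min((min(dist(items[idx], items[m]) for m in c), ci)
                              for ci, c in enumerate(clusters))
            else:
                d, best = float("inf"), None
            if d <= t1:                      # close enough: join nearest cluster
                clusters[best].append(idx)
                changed = True
            elif d > t2:                     # far from everything: new cluster
                clusters.append([idx])
                changed = True
            else:                            # ambiguous: decide in a later pass
                deferred.append(idx)
        if not changed:
            # Nothing could be placed in this pass; seed a cluster to guarantee progress.
            clusters.append([deferred.pop(0)])
        pending = deferred
    return clusters

# Hypothetical site topology and user sessions.
links = {"home": ["news", "shop"], "news": ["sports"],
         "shop": ["cart"], "cart": ["checkout"]}
sessions = [["home", "news", "sports"], ["home", "news"],
            ["home", "shop", "cart", "checkout"], ["home", "shop", "cart"]]
cost = make_sub_cost(links)
clusters = two_threshold_cluster(
    sessions, lambda s, t: edit_distance(s, t, cost), t1=1.05, t2=1.5)
print(clusters)  # [[0, 1], [2, 3]]
```

Deferring sessions whose distance falls between the two thresholds is what loosens the dependence on presentation order: an ambiguous session is only placed after clearer ones have shaped the clusters. In this sketch the thresholds are on the raw (unnormalized) distance, so they would need retuning for longer sessions.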
Pages: 440-447
Page count: 8