Extended VSM for XML Document Classification Using Frequent Subtrees

Cited by: 0
Authors
Yang, Jianwu [1]
Wang, Songlin [1]
Affiliations
[1] Peking Univ, Inst Comp Sci & Tech, Beijing 100871, Peoples R China
Source
Keywords
XML Document; Classification; Vector Space Model (VSM); Structured Link Vector Model (SLVM); Frequent Subtree;
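DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract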
Structured link vector model (SLVM) is a representation for modeling XML documents that extends the conventional vector space model (VSM) by incorporating document structure. In this paper, we describe our SLVM-based classification approach for XML documents in the Document Mining Challenge of INEX 2009, where closed frequent subtrees are used as structural units for content extraction from the XML documents, and the Chi-square test is used for feature selection.
Pages: 441-448
Number of pages: 8
Related papers
50 in total
  • [1] XML Document Classification Using Extended VSM
    Yang, Jianwu
    Zhang, Fudong
    FOCUSED ACCESS TO XML DOCUMENTS, 2008, 4862 : 234 - 244
  • [2] Clustering XML Documents Using Frequent Subtrees
    Kutty, Sangeetha
    Tran, Tien
    Nayak, Richi
    Li, Yuefeng
    ADVANCES IN FOCUSED RETRIEVAL, 2009, 5631 : 436 - 445
  • [4] Discovering frequent subtrees from XML data using neural networks
    College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
    Wuhan Univ J Nat Sci, 2006, 1 (117-121):
  • [5] Clustering XML Documents Using Closed Frequent Subtrees: A Structural Similarity Approach
    Kutty, Sangeetha
    Tran, Tien
    Nayak, Richi
    Li, Yuefeng
    FOCUSED ACCESS TO XML DOCUMENTS, 2008, 4862 : 183 - 194
  • [6] VSM: Mapping XML document to relations with constraint
    Han, ZM
    Yu, SJ
    Le, JJ
    ON THE MOVE TO MEANINGFUL INTERNET SYSTEMS 2004: COOPIS, DOA, AND ODBASE, PT 2, PROCEEDINGS, 2004, 3291 : 1113 - 1122
  • [7] A novel method for mining frequent subtrees from XML data
    Zhang, WS
    Liu, DX
    Zhang, JP
    INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING IDEAL 2004, PROCEEDINGS, 2004, 3177 : 300 - 305
  • [8] Mining frequent rooted subtrees in XML data with Me-tree
    Zhang, WS
    Liu, DX
    Zhang, JP
    2004 IEEE SYSTEMS & INFORMATION ENGINEERING DESIGN SYMPOSIUM, 2004, : 209 - 214
  • [9] Probabilistic frequent subtrees for efficient graph classification and retrieval
    Welke, Pascal
    Horvath, Tamas
    Wrobel, Stefan
    MACHINE LEARNING, 2018, 107 (11) : 1847 - 1873