Discovering semantic sibling groups from web documents with XTREEM-SG

Cited by: 0
Authors
Brunzel, Marko [1]
Spiliopoulou, Myra [1]
Affiliations
[1] Otto Von Guericke Univ, Magdeburg, Germany
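Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract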
The acquisition of explicit semantics is still a research challenge. Approaches for the extraction of semantics focus mostly on learning hierarchical hypernym-hyponym relations. The extraction of co-hyponym and co-meronym sibling semantics is performed to a much lesser extent, though they are no less important in ontology engineering. In this paper we describe and evaluate the XTREEM-SG (Xhtml TREE Mining - for Sibling Groups) approach for finding sibling semantics in semi-structured Web documents. XTREEM takes advantage of the added value of mark-up, available in web content, for grouping text siblings. We show that this grouping is semantically meaningful. The XTREEM-SG approach has the advantage that it is domain and language independent; it does not rely on background knowledge, NLP software or training. We apply the XTREEM-SG approach and evaluate it against the reference semantics from two gold standard ontologies. We investigate how variations in input, parameters and reference influence the results obtained when structuring a closed vocabulary on sibling relations. Earlier methods that evaluate sibling relations against a gold standard report an F-measure of 14.18%. Our method improves this figure to 21.47%.
Pages: 141 - 157
Page count: 17
Related papers
18 in total
  • [1] Discovering semantic sibling associations from Web Documents with XTREEM-SP
    Brunzel, Marko
    Spiliopoulou, Myra
    DATA WAREHOUSING AND KNOWLEDGE DISCOVERY, PROCEEDINGS, 2006, 4081 : 469 - 480
  • [2] Acquiring semantic sibling associations from web documents
    Department of Knowledge Management Research, German Research Center for Artificial Intelligence , Germany
    Unknown
    Int. J. Data Warehouse. Min., 2007, (4) : 83 - 98
  • [3] Discovering multi terms and co-hyponymy from XHTML documents with XTREEM
    Brunzel, Marko
    Spiliopoulou, Myra
    KNOWLEDGE DISCOVERY FROM XML DOCUMENTS, PROCEEDINGS, 2006, 3915 : 22 - 32
  • [4] Discovering Semantic Relations from the Web and Organizing them with PATTY
    Nakashole, Ndapandula
    Weikum, Gerhard
    Suchanek, Fabian
    SIGMOD RECORD, 2013, 42 (02) : 29 - 34
  • [5] SNExtractor: A Prototype for Extracting Semantic Networks from Web Documents
    Zhang, Chi
    Wang, Yanhua
    Wang, Chengyu
    Cheng, Wenliang
    He, Xiaofeng
    WEB-AGE INFORMATION MANAGEMENT, PT II, 2016, 9659 : 527 - 530
  • [6] Classification of Durian Characteristics for Semantic Representation from Web Documents
    Abu Bakar, Zainab
    Ismail, Khairul Nurmazianna
    2012 IEEE SYMPOSIUM ON E-LEARNING, E-MANAGEMENT AND E-SERVICES (IS3E 2012), 2012, : 111 - 115
  • [7] Discovering Multilingual Concepts from Unaligned Web Documents by Exploring Associated Images
    Zhang, Xiaochen
    Jin, Xiaoming
    Li, Lianghao
    Shen, Dou
    PROCEEDINGS OF THE 22ND INTERNATIONAL CONFERENCE ON WORLD WIDE WEB (WWW'13 COMPANION), 2013, : 173 - 174
  • [8] Location based Semantic Information Retrieval from Web Documents using Web Crawler
    Archana, A. B.
    Kumar, Jalesh
    PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON APPLIED AND THEORETICAL COMPUTING AND COMMUNICATION TECHNOLOGY (ICATCCT), 2015, : 370 - 375
  • [9] Web Video Event Recognition by Semantic Analysis From Ubiquitous Documents
    Yu, Litao
    Yang, Yang
    Huang, Zi
    Wang, Peng
    Song, Jingkuan
    Shen, Heng Tao
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2016, 26 (12) : 5689 - 5701
  • [10] Towards a Semantic Web: Ontology Development based on the Extraction of Semantic Concepts from Digital Documents
    Abascal Mena, Rocio
    PROCEEDINGS OF THE 13TH WSEAS INTERNATIONAL CONFERENCE ON COMPUTERS, 2009, : 519 - +