Ontology Extraction Considering Content Concordance from Tagging to Web Pages in Similar SBM Users

Cited by: 0
Authors
Harada, Fumiko [1]
Shimakawa, Hiromitsu [1]
Affiliations
[1] Ritsumeikan Univ, Fac Comp Sci, Dept Informat Sci & Engn, Kusatsu, Shiga 5258577, Japan
Keywords
personal phrase meaning; tagging; social bookmark; similar user
DOI
10.1109/IIAI-AAI.2013.45
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
To realize web search engines that take into account the meaning of query phrases for each user, we have studied a method that extracts hierarchical and synonymous relationships among tagged phrases on a social bookmark (SBM) for an individual SBM user. The method detects these relationships from clusters of webpages that share the same tagged phrases, derived from the bookmarks shared among the target user and similar SBM users. However, noisy tagging that violates personal phrase meaning degrades its detection accuracy. This paper proposes a method to address this drawback. The proposed method classifies webpages based on the concordance of their content as well as on the sameness of their tagged phrases. By analyzing each webpage's membership in content-based and tag-based clusters, the method detects the relationships more accurately. We compared the detection accuracy of the proposed and traditional methods in an experiment. For hierarchical relationships, the F-measure improves by 7.41% and the precision by 20.94% while guaranteeing more than 20% recall. For synonymous relationships, the F-measure improves by 4.17% and the precision by 21.80% at more than 10% recall.
Pages: 289-295
Page count: 7
Related papers
50 in total
  • [41] Automatic Data Extraction from Lists in Web Pages Based on XML
    Xin, Zhou
    Hao, Wang
    ADVANCED TECHNOLOGY IN TEACHING - PROCEEDINGS OF THE 2009 3RD INTERNATIONAL CONFERENCE ON TEACHING AND COMPUTATIONAL SCIENCE (WTCS 2009), VOL 2: EDUCATION, PSYCHOLOGY AND COMPUTER SCIENCE, 2012, 117 : 915 - 921
  • [42] Extraction of ontologies from web pages: Conceptual modelling and tourism application
    Riadi-GDL Laboratory, ENSI Campus, Universitaire de la Manouba, Tunisia
    Unknown
    Unknown
    J. Internet Technol., 2007, 4 (411-421):
  • [43] Data extraction from semi-structured web pages by clustering
    Vuong, Le Phong Bao
    Gao, Xiaoying
    Zhang, Mengjie
    2006 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE, (WI 2006 MAIN CONFERENCE PROCEEDINGS), 2006, : 374 - +
  • [44] Pattern Matching for Extraction of Core Contents from News Web Pages
    Sirsat, Sandeep
    Chavan, Vinay
    2016 SECOND INTERNATIONAL CONFERENCE ON WEB RESEARCH (ICWR), 2016, : 13 - 18
  • [45] Leveraging spatial join for robust tuple extraction from web pages
    Han, Wook-Shin
    Kwak, Wooseong
    Yu, Hwanjo
    Lee, Jeong-Hoon
    Kim, Min-Soo
    INFORMATION SCIENCES, 2014, 261 : 132 - 148
  • [46] Ontology creation: Extraction of domain knowledge from web documents
    Storey, VC
    Chiang, R
    Chen, GL
    CONCEPTUAL MODELING - ER 2005, 2005, 3716 : 256 - 269
  • [47] Ontology-based Knowledge Extraction from Hidden Web
    宋晖
    马范援
    刘晓强
    Journal of DongHua University, 2004, (05) : 73 - 78
  • [48] Bootstrapping Information Extraction from Semi-structured Web Pages
    Carlson, Andrew
    Schafer, Charles
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PART I, PROCEEDINGS, 2008, 5211 : 195 - +
  • [49] An Approach to Image Extraction and Accurate Skin Detection from Web Pages
    Girgis, Moheb R.
    Mahmoud, Tarek M.
    Abd-El-Hafeez, Tarek
    PROCEEDINGS OF WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY, VOL 21, 2007, 21 : 367 - 375
  • [50] Ontology-based knowledge extraction from hidden web
    Song, Hui
    Ma, Fan-Yuan
    Liu, Xiao-Qiang
    Journal of Dong Hua University (English Edition), 2004, 21 (05): 73 - 78