Ontology Extraction Considering Content Concordance from Tagging to Web Pages in Similar SBM Users

Cited by: 0
Authors
Harada, Fumiko [1]
Shimakawa, Hiromitsu [1]
Affiliations
[1] Ritsumeikan Univ, Fac Comp Sci, Dept Informat Sci & Engn, Kusatsu, Shiga 5258577, Japan
Keywords
personal phrase meaning; tagging; social bookmark; similar user
DOI
10.1109/IIAI-AAI.2013.45
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
To realize web search engines that consider what a query phrase means to each user, we have studied a method that extracts hierarchical and synonymous relationships among tagged phrases on a social bookmark (SBM) for an individual SBM user. The method detects these relationships from clusters of webpages that share the same tagged phrases, derived from the bookmarks shared by the target user and similar SBM users. However, noisy tagging that violates the user's personal phrase meaning degrades its detection accuracy. This paper proposes a method that addresses this drawback. The proposed method classifies webpages based on the concordance of their content as well as on the sameness of their tagged phrases. By analyzing how webpages belong to the content-based and tag-based clusters, it detects the relationships more accurately. We compared the detection accuracy of the proposed and traditional methods in an experiment. For hierarchical relationships, the F-measure improves by 7.41% and the precision by 20.94% while guaranteeing more than 20% recall. For synonymous relationships, the F-measure improves by 4.17% and the precision by 21.80% at more than 10% recall.
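The abstract does not spell out the algorithm, so the following is only a minimal Python sketch of the general idea, not the authors' method. It assumes that content-based cluster labels are already available for each page, that a page lying outside the most common content cluster of a tag was tagged noisily, and that set overlap with a threshold of 0.8 separates synonymous from hierarchical phrase pairs. The function names, the threshold, and the toy bookmarks are all hypothetical.

```python
from collections import Counter, defaultdict


def tag_clusters(bookmarks):
    """Group page ids by tagged phrase: one tag-based cluster per phrase."""
    clusters = defaultdict(set)
    for page, tags in bookmarks.items():
        for tag in tags:
            clusters[tag].add(page)
    return dict(clusters)


def filter_by_content(clusters, content_cluster_of):
    """Keep, per tag, only the pages in that tag's most common content cluster.

    Assumption (not from the paper): pages outside it were tagged noisily.
    """
    filtered = {}
    for tag, pages in clusters.items():
        counts = Counter(content_cluster_of[p] for p in pages)
        # Highest count wins; ties are broken by the smaller cluster label.
        dominant = min(counts, key=lambda c: (-counts[c], c))
        filtered[tag] = {p for p in pages if content_cluster_of[p] == dominant}
    return filtered


def detect_relations(clusters, threshold=0.8):
    """Return (hierarchical, synonymous) phrase pairs based on cluster overlap.

    Hierarchical pairs are ordered as (broader phrase, narrower phrase).
    """
    hierarchical, synonymous = [], []
    tags = sorted(clusters)
    for i, a in enumerate(tags):
        for b in tags[i + 1:]:
            pages_a, pages_b = clusters[a], clusters[b]
            common = len(pages_a & pages_b)
            if common == 0:
                continue
            if common / len(pages_a | pages_b) >= threshold:
                synonymous.append((a, b))       # clusters nearly identical
            elif common / len(pages_a) >= threshold:
                hierarchical.append((b, a))     # a lies mostly inside b
            elif common / len(pages_b) >= threshold:
                hierarchical.append((a, b))     # b lies mostly inside a
    return hierarchical, synonymous


# Hypothetical toy data: p5 is an unrelated page mistakenly tagged "python".
BOOKMARKS = {
    "p1": {"programming", "coding", "python"},
    "p2": {"programming", "coding", "python"},
    "p3": {"programming", "coding"},
    "p4": {"programming", "coding"},
    "p5": {"python"},
}
CONTENT_CLUSTER_OF = {"p1": 0, "p2": 0, "p3": 0, "p4": 0, "p5": 1}

raw = tag_clusters(BOOKMARKS)
cleaned = filter_by_content(raw, CONTENT_CLUSTER_OF)
print("tag-based only:  ", detect_relations(raw))
print("with content use:", detect_relations(cleaned))
```

On this toy data, the tag-based clusters alone yield only the synonymous pair ("coding", "programming"), because the noisy page p5 pulls the "python" cluster below the overlap threshold. After the content-based filter drops p5, "python" is also detected as narrower than both "coding" and "programming". This illustrates, under the stated assumptions, why combining content-based and tag-based clusters can help against noisy tagging.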
Pages: 289-295
Number of pages: 7