Ontology Extraction Considering Content Concordance from Tagging to Web Pages in Similar SBM Users

Authors
Harada, Fumiko [1]
Shimakawa, Hiromitsu [1]
Affiliations
[1] Ritsumeikan Univ, Fac Comp Sci, Dept Informat Sci & Engn, Kusatsu, Shiga 5258577, Japan
Keywords
personal phrase meaning; tagging; social bookmark; similar user
DOI
10.1109/IIAI-AAI.2013.45
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
To realize web search engines that take into account what query phrases mean to each user, we have studied a method that extracts hierarchical and synonymous relationships among tagged phrases on a social bookmark (SBM) service for an individual SBM user. The method detects these relationships from clusters of webpages that share the same tagged phrase, derived from the bookmarks of the target user and of similar SBM users. However, noisy tagging that violates the personal phrase meaning degrades its detection accuracy. This paper proposes a method to mitigate this drawback. The proposed method classifies webpages by the concordance of their content as well as by the sameness of their tagged phrases. By analyzing which content-based and tag-based clusters each webpage belongs to, the relationships are detected more accurately. We compared the detection accuracy of the proposed and conventional methods in an experiment. For hierarchical relationships, the F-measure improves by 7.41% and the precision by 20.94% while guaranteeing a recall of more than 20%. For synonymous relationships, the F-measure improves by 4.17% and the precision by 21.80% at a recall of more than 10%.
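The abstract does not give the algorithm itself, so the following is only an illustrative sketch of the general idea: webpages are grouped both by tagged phrase and by content, pages whose content cluster is rare within a tag's cluster are treated as noisy tagging, and relationships are then read off the overlap between the remaining tag-based clusters. The function names, the `min_share` noise rule, the Jaccard and inclusion thresholds, and the sample data are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter


def filter_by_content(tag_pages, page_cluster, min_share=0.4):
    """Assumed noise rule: within each tag-based cluster, keep only pages
    whose content cluster holds at least `min_share` of that tag's pages."""
    filtered = {}
    for tag, pages in tag_pages.items():
        counts = Counter(page_cluster[p] for p in pages)
        keep = {c for c, n in counts.items() if n / len(pages) >= min_share}
        filtered[tag] = {p for p in pages if page_cluster[p] in keep}
    return filtered


def detect_relations(tag_pages, syn_threshold=0.8, hier_threshold=0.8):
    """Assumed detection rule: high Jaccard overlap means 'synonym';
    a smaller cluster mostly contained in a larger one means 'narrower'."""
    relations = []
    tags = sorted(tag_pages)
    for i, a in enumerate(tags):
        for b in tags[i + 1:]:
            pa, pb = tag_pages[a], tag_pages[b]
            if not pa or not pb:
                continue
            common = len(pa & pb)
            if common / len(pa | pb) >= syn_threshold:
                relations.append(("synonym", a, b))
            elif common / len(pa) >= hier_threshold and len(pa) < len(pb):
                relations.append(("narrower", a, b))  # a is narrower than b
            elif common / len(pb) >= hier_threshold and len(pb) < len(pa):
                relations.append(("narrower", b, a))  # b is narrower than a
    return relations


# Hypothetical bookmarks: page "p9" is tagged "python" but is about snakes.
tag_pages = {
    "programming": {"p1", "p2", "p3", "p4", "p5"},
    "coding": {"p1", "p2", "p3", "p4", "p5"},
    "python": {"p1", "p2", "p9"},
}
# Hypothetical content-based cluster label for every page.
page_cluster = {"p1": 0, "p2": 0, "p3": 0, "p4": 0, "p5": 0, "p9": 1}

tag_only = detect_relations(tag_pages)
with_content = detect_relations(filter_by_content(tag_pages, page_cluster))
print(tag_only)
print(with_content)
```

In this toy data, the tag-only pass finds just the synonym pair, because the noisy page "p9" lowers the share of "python" pages contained in "programming" below the threshold. After the content-based filter drops "p9", "python" is also detected as narrower than both "coding" and "programming".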
Pages: 289-295
Number of pages: 7
Related papers
50 in total
  • [1] Personal Ontology Extraction Considering Content Concordance from Tagging to Webpages in Similar SBM Users
    Harada, Fumiko
    Shimakawa, Hiromitsu
    APPLIED COMPUTING AND INFORMATION TECHNOLOGY, 2014, 553 : 137 - 154
  • [2] A Geo-Tagging Framework for Address Extraction from Web Pages
    Efremova, Julia
    Endres, Ian
    Vidas, Isaac
    Melnik, Ofer
    ADVANCES IN DATA MINING: APPLICATIONS AND THEORETICAL ASPECTS (ICDM 2018), 2018, 10933 : 288 - 295
  • [3] A Novel Approach for Content Extraction from Web Pages
    Bhardwaj, Aanshi
    Mangat, Veenu
    2014 RECENT ADVANCES IN ENGINEERING AND COMPUTATIONAL SCIENCES (RAECS), 2014,
  • [4] Extraction of core web content from web pages using noise elimination
    Saravanan A.
    Bama S.S.
    Journal of Engineering Science and Technology Review, 2020, 13 (04) : 173 - 187
  • [5] Content Extraction from Web Pages Based on Chinese Punctuation Number
    Song, Mingqiu
    Wu, Xintao
    2007 INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, NETWORKING AND MOBILE COMPUTING, VOLS 1-15, 2007, : 5573 - 5575
  • [6] Authoring of Personalized Web Page from Heterogeneous Web Pages by Content Extraction and Integration
    Li, Wei-gang
    Sun, Ke
    Wang, Shuo-chen
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON COMPUTER NETWORKS AND COMMUNICATION TECHNOLOGY (CNCT 2016), 2016, 54 : 734 - 740
  • [7] Product ontology learning from web pages
    Fu Kui
    Nie Guihua
    PROCEEDINGS OF THE 4TH INTERNATIONAL CONFERENCE ON INNOVATION & MANAGEMENT, VOLS I AND II, 2007, : 1864 - 1867
  • [8] Information Extraction from Web pages
    Novotny, Robert
    Vojtas, Peter
    Maruscak, Dusan
    2009 IEEE/WIC/ACM INTERNATIONAL JOINT CONFERENCES ON WEB INTELLIGENCE (WI) AND INTELLIGENT AGENT TECHNOLOGIES (IAT), VOL 3, 2009, : 121 - +
  • [9] Content Extraction from Web Pages Based on the Row Block Semantics and Punctuations
    Song, Anping
    Ding, Xuehai
    Li, Mingbo
    Si, Wulin
    Zhang, Wu
    PROCEEDINGS OF THE 2013 ASIA-PACIFIC COMPUTATIONAL INTELLIGENCE AND INFORMATION TECHNOLOGY CONFERENCE, 2013, : 327 - 334
  • [10] Data Engineered Content Extraction Studies for Indian Web Pages
    Kolla, Bhanu Prakash
    Raman, Arun Raja
    COMPUTATIONAL INTELLIGENCE IN DATA MINING, 2019, 711 : 505 - 512