Ontology Extraction Considering Content Concordance from Tagging to Web Pages in Similar SBM Users


Authors:

Harada, Fumiko [1]

Shimakawa, Hiromitsu [1]

Affiliations:

[1] Ritsumeikan Univ, Fac Comp Sci, Dept Informat Sci & Engn, Kusatsu, Shiga 5258577, Japan

Source:

2013 SECOND IIAI INTERNATIONAL CONFERENCE ON ADVANCED APPLIED INFORMATICS (IIAI-AAI 2013) | 2013

Keywords:

personal phrase meaning; tagging; social bookmark; similar user

DOI:

10.1109/IIAI-AAI.2013.45

Chinese Library Classification:

TP [Automation technology, computer technology]

Subject classification code:

0812

Abstract:

To realize web search engines that take into account what query phrases mean to each user, we have studied a method that extracts hierarchical and synonymous relationships among tagged phrases on a social bookmark (SBM) service for an individual SBM user. The method detects these relationships from clusters of webpages that share the same tagged phrases, derived from the bookmarks of the target user and of similar SBM users. However, noisy tagging that violates the personal phrase meaning degrades detection accuracy. This paper proposes a method to overcome this drawback. The proposed method classifies webpages based on the concordance of their content as well as on the sameness of their tagged phrases. Analyzing how webpages belong to the content-based and tag-based clusters allows the relationships to be detected more accurately. We compared the detection accuracy of the proposed and traditional methods in an experiment. For hierarchical relationships, the F-measure improves by 7.41% and the precision by 20.94% while guaranteeing more than 20% recall. For synonymous relationships, the F-measure improves by 4.17% and the precision by 21.80% at more than 10% recall.
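The abstract reports results as precision, recall, and F-measure. As a minimal sketch of how such figures are typically computed for detected phrase relationships, the Python below uses the standard definitions and assumes the balanced F-measure (F1); the abstract does not say which variant the paper uses. The function name `precision_recall_f` and the sample tag pairs are hypothetical and are not taken from the paper.

```python
def precision_recall_f(detected, correct):
    """Return (precision, recall, F-measure) for a set of detected relationships.

    Assumption: the F-measure is the balanced F1 score, i.e. the harmonic
    mean of precision and recall. The paper may use a different weighting.
    """
    detected, correct = set(detected), set(correct)
    true_positives = len(detected & correct)

    # Precision: share of detected relationships that are actually correct.
    precision = true_positives / len(detected) if detected else 0.0
    # Recall: share of correct relationships that were detected.
    recall = true_positives / len(correct) if correct else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical (child tag, parent tag) pairs, for illustration only.
detected_pairs = {
    ("python", "programming"),
    ("java", "programming"),
    ("cat", "programming"),
}
correct_pairs = {
    ("python", "programming"),
    ("java", "programming"),
    ("rust", "programming"),
    ("go", "programming"),
}

p, r, f = precision_recall_f(detected_pairs, correct_pairs)
print(f"precision={p:.4f} recall={r:.4f} F-measure={f:.4f}")
```

Under these definitions, a recall floor such as the "more than 20% recall" in the abstract would correspond to comparing methods only at operating points where `recall` exceeds 0.20.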


Pages: 289-295

Page count: 7
