A Hybrid Method for Extracting Deep Web Information

Cited by: 0

Authors:

Zhang, Yuanpeng [1]

Wang, Li [1]

Jiang, Kui [1]

Qian, Danmin [1]

Dong, Jiancheng [1]

Affiliation:

[1] Nantong Univ, Sch Med, Dept Med Informat, Nantong 226001, Jiangsu, Peoples R China

Source:

PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON AUTOMATION, MECHANICAL CONTROL AND COMPUTATIONAL ENGINEERING | 2015 / Vol. 124

Keywords:

information extraction; clinic expert information; domain model; block importance model; SVM;

DOI:

Not available

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Previous work suggests that more than 60% of the information available on the Web resides in Deep Web databases, where it cannot be directly indexed by search engines. This paper proposes a hybrid method, composed of a domain model and a block importance model, for extracting information from the Deep Web. The domain model classifies forms and identifies whether a form is a Web query interface (WQI). The block importance model filters noisy information out of response pages. Both models are compared with a rule-based method. The experimental results indicate that the domain model yields a precision 6.44% higher than that of the rule-based method, whereas the block importance model yields an F1 measure 10.5% higher than that of the XPath method.

Pages: 777-782

Number of pages: 6
