Design and Implement of Information Extraction System Based on XML

Cited by: 0
Authors
Xuan, Yanyan [1]
Hu, Yan [1]
Affiliation
[1] Wuhan Univ Technol, Dept Comp Sci & Technol, Wuhan 430070, Peoples R China
Keywords
Information Extraction; XML; XPath; XSLT; Extraction Rule;
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
By studying the structure of HTML documents, this paper addresses the problem of web information extraction using standard XML technologies and proposes an XML-based information extraction method. The method first parses the HTML page into an HTML DOM tree to clean the page and generate an XHTML document. It then analyzes the XHTML file with the DOM methods of Xerces-J and constructs an XPath generation algorithm. Finally, it exploits the strengths of XSLT and XPath in data location and conversion to automatically learn and generate information extraction rules, and performs web information extraction according to the generated XPath.
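The pipeline in the abstract (clean XHTML, walk the DOM, generate an XPath, extract with it) can be illustrated with a minimal sketch. This is not the authors' algorithm: it uses Python's standard `xml.etree.ElementTree` as a stand-in for the Xerces-J DOM API, assumes the page has already been cleaned into namespace-free XHTML, and omits the XSLT step entirely. The sample page and the names `build_xpath`, `learn_rule`, and `extract` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, already-cleaned XHTML page (not taken from the paper).
XHTML = """<html>
  <body>
    <div><p>Header</p></div>
    <div>
      <p>Title: Example Book</p>
      <p>Price: 42</p>
    </div>
  </body>
</html>"""


def build_xpath(root, target):
    """Return a positional path from root to target, relative to root."""
    if target is root:
        return "."
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    steps = []
    node = target
    while node is not root:
        parent = parent_of[node]
        # 1-based index among siblings that share the same tag name.
        same_tag = [c for c in parent if c.tag == node.tag]
        steps.append(f"{node.tag}[{same_tag.index(node) + 1}]")
        node = parent
    return "./" + "/".join(reversed(steps))


def learn_rule(root, sample_text):
    """Find the first element whose text contains sample_text; return its path."""
    for elem in root.iter():
        if elem.text and sample_text in elem.text:
            return build_xpath(root, elem)
    return None


def extract(root, xpath):
    """Apply a generated path and return the stripped text of each match."""
    return [e.text.strip() for e in root.findall(xpath) if e.text]


root = ET.fromstring(XHTML)
rule = learn_rule(root, "Price")   # "learn" a rule from one labelled sample
print(rule)
print(extract(root, rule))
```

A rule learned from one sample page could then be applied to other pages that share the same layout, which is the sense in which the generated XPath acts as an extraction rule here.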
Pages: 1400-1404
Number of pages: 5
Related papers
50 in total
  • [31] Integrated Information Framework for Intelligent Cooperative Design Based on Multi-Agent System and XML
    Cao Yan
    Yang Lina
    Yang Yanli
    Chen Hua
    INTERNATIONAL ELECTRONIC CONFERENCE ON COMPUTER SCIENCE, 2008, 1060 : 42 - +
  • [32] XML-based web services technology to implement a prototype command and control system
    Lin, Ching-Show
    Liang, Chia-Hao
    DEFENCE SCIENCE JOURNAL, 2006, 56 (04) : 591 - 597
  • [33] The Implement of a Flooding Information System
    Cheng, Chao-Chung
    Qiu, Qi
    Lee, Guan-Ting
    AUTOMATIC MANUFACTURING SYSTEMS II, PTS 1 AND 2, 2012, 542-543 : 1426 - +
  • [34] Design and Implement of Laboratory Management System based Web
    Li, Zheng-Bo
    Proceedings of the 2016 International Conference on Engineering and Advanced Technology, 2016, 82 : 424 - 429
  • [35] The System Design and Implement Based on SSH Framework Technology
    Pan Bin
    Dang Qing-zhong
    Zhu Tao
    Lai Bo
    EBM 2010: INTERNATIONAL CONFERENCE ON ENGINEERING AND BUSINESS MANAGEMENT, VOLS 1-8, 2010, : 5648 - +
  • [36] The Design and Implement of the Video Monitoring System Based on WinCE
    Yu Yuanhui
    Li Yongmei
    Deng Ying
    SUSTAINABLE ENVIRONMENT AND TRANSPORTATION, PTS 1-4, 2012, 178-181 : 2747 - +
  • [37] Design and Implement a System of Wastewater Treatment Based on Wetlands
    Dominguez-Patino, Martha L.
    Rodriguez-Martinez, Antonio
    Jasso-Castillo, Luis A.
    ICEIC 2011/ IRE&PS 2011: INTERNATIONAL CONFERENCE ON EDUCATION, INFORMATICS, AND CYBERNETICS/ INTERNATIONAL SYMPOSIUM ON INTEGRATING RESEARCH, EDUCATION, AND PROBLEM SOLVING, 2011, : 261 - 264
  • [38] Design and Implement of Meeting Attendance System Based on RFID
    Zhou, Lei
    Liu, Hu
    Zhu, Quanyin
    Yan, Yunyang
    2012 7TH INTERNATIONAL CONFERENCE ON SYSTEM OF SYSTEMS ENGINEERING (SOSE), 2012, : 939 - 942
  • [39] Design and implement of warehouse management system based on AOP
    Luo Cheng
    Xu Didi
    Lai Mingyong
    Wang Yan
    2006 IEEE INTERNATIONAL ENGINEERING MANAGEMENT CONFERENCE, 2006, : 243 - +
  • [40] Design and Implement of Tire Monitoring System Based on ZigBee
    Wang, Zu-Xun
    Xu, Yuan
    Wang, Gui-Juan
    2009 5TH INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, NETWORKING AND MOBILE COMPUTING, VOLS 1-8, 2009, : 3487 - +