A New Hidden Web Crawling Approach

Cited by: 0
Authors
Saoudi, L. [1]
Boukerram, A. [2]
Mhamedi, S. [1]
Affiliations
[1] Mohammed Boudiaf Univ, Dept Comp Sci, Msila, Algeria
[2] Abderrahmane Mira Univ, Dept Comp Sci, Bejaia, Algeria
Keywords
Deep crawler; Hidden Web crawler; SQLI query; form submission; searchable forms
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
Traditional search engines deal with the Surface Web, the set of Web pages directly accessible through hyperlinks, and ignore a large part of the Web called the hidden Web: a great amount of valuable information in online databases that is "hidden" behind query forms. To access this information, a crawler has to fill in the forms with valid data. For this reason, we propose a new approach that uses an SQLI technique to find the most promising keywords of a specific domain for automatic form submission. The effectiveness of the proposed framework has been evaluated through experiments using real web sites, and encouraging preliminary results were obtained.
Pages: 293-297
Number of pages: 5