A New Hidden Web Crawling Approach

Cited by: 0
Authors
Saoudi, L. [1]
Boukerram, A. [2]
Mhamedi, S. [1]
Affiliations
[1] Mohammed Boudiaf Univ, Dept Comp Sci, Msila, Algeria
[2] Abderrahmane Mira Univ, Dept Comp Sci, Bejaia, Algeria
Keywords
Deep crawler; Hidden Web crawler; SQLI query; form submission; searchable forms;
DOI
Not available
CLC classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Traditional search engines deal with the Surface Web, the set of Web pages directly accessible through hyperlinks, and ignore a large part of the Web called the hidden Web: a great amount of valuable information in online databases that is "hidden" behind query forms. To access this information, the crawler has to fill in the forms with valid data. For this reason, we propose a new approach that uses the SQLI technique to find the most promising keywords of a specific domain for automatic form submission. The effectiveness of the proposed framework has been evaluated through experiments on real web sites, and encouraging preliminary results were obtained.
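The form-submission step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the SQLI-based keyword selection is not reproduced, and the keyword list is simply assumed to be given. The names `FormFinder` and `submission_urls` are hypothetical, and the sketch only handles GET forms with a plain text or search field, which is one possible reading of "searchable form". No network request is made; the function only builds the URLs a crawler would fetch.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin


class FormFinder(HTMLParser):
    """Collect each <form> with its action, method and named text inputs."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None  # form being parsed, or None outside a form

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "form":
            self._current = {
                "action": a.get("action") or "",
                "method": (a.get("method") or "get").lower(),
                "text_fields": [],
            }
        elif tag == "input" and self._current is not None:
            # An <input> without a type attribute defaults to a text field.
            kind = (a.get("type") or "text").lower()
            if kind in ("text", "search") and a.get("name"):
                self._current["text_fields"].append(a["name"])

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


def submission_urls(page_url, html, keywords):
    """Build one query URL per keyword for every searchable GET form."""
    finder = FormFinder()
    finder.feed(html)
    urls = []
    for form in finder.forms:
        # Assumption: "searchable" means a GET form with a text field.
        if form["method"] != "get" or not form["text_fields"]:
            continue
        target = urljoin(page_url, form["action"])
        field = form["text_fields"][0]
        for keyword in keywords:
            urls.append(target + "?" + urlencode({field: keyword}))
    return urls
```

A crawler would then fetch each returned URL and keep the keywords whose result pages expose new records. POST forms, select boxes and actions that already carry a query string are deliberately left out of this sketch.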
Pages: 293-297
Page count: 5