OLAWSDS: An Online Arabic Web Spam Detection System

Cited by: 0
Authors
Al-Kabi, Mohammed N. [1]
Wahsheh, Heider A. [2]
Alsmadi, Izzat M. [3]
Affiliations
[1] Zarqa Univ, Fac Sci & IT, Zarqa, Jordan
[2] King Khalid Univ, Coll Comp Sci, Dept Comp Sci, Abha, Saudi Arabia
[3] Prince Sultan Univ, Coll Comp & Informat Sci, Dept Informat Syst, Riyadh 11586, Saudi Arabia
Keywords
Arabic Web spam; content-based; link-based; Information Retrieval;
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
For marketing purposes, some Website designers and administrators use illegal Search Engine Optimization (SEO) techniques to boost the ranking of their Web pages and mislead search engines. Some Arabic Web pages exploit both content and link features to artificially increase their rank in Search Engine Results Pages (SERPs). This study extends previous work in this field. It presents the design and implementation of an online Arabic Web spam detection system, based on algorithms and mathematical foundations, that detects Arabic content and link Web spam using a tree of spam detection conditions together with user feedback collected through a custom Web browser. Users can take part in the decision about any Web page through their feedback, judging whether the Arabic Web pages shown in the browser are relevant to their particular queries. The proposed system uses content and link features extracted from Arabic Web pages to label each page as spam or non-spam. The system also attempts to learn from user feedback to improve its performance automatically. Statistical analysis is used to evaluate the proposed system. The Statistical Package for the Social Sciences (SPSS) is used for this evaluation, treating user feedback as the dependent variable and the Arabic content and link features as the independent variables. SPSS is used to apply a variety of tests, such as analysis of variance (ANOVA). ANOVA is used to show the relationships between the dependent and independent variables in the dataset, which in turn supports problem solving and informed decisions.
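The abstract does not list the actual detection conditions, so the following is only a minimal Python sketch of how a tree of content and link conditions might be combined with user feedback. Every feature name, threshold value, and the majority-vote override rule is a hypothetical assumption made for illustration; none of them is taken from the paper, and the override is a crude stand-in for the learning step the abstract mentions.

```python
from dataclasses import dataclass


@dataclass
class PageFeatures:
    # All feature names are illustrative assumptions, not taken from the paper.
    keyword_density: float   # share of words that are repeated keywords, 0..1
    words_in_title: int      # number of words in the page title
    external_links: int      # links pointing to other hosts
    internal_links: int      # links pointing to the same host


@dataclass
class Thresholds:
    # Hypothetical cut-offs; a real system would tune these on labeled data.
    max_keyword_density: float = 0.30
    max_title_words: int = 20
    max_external_ratio: float = 0.90


def tree_label(page: PageFeatures, t: Thresholds) -> str:
    """Walk a small tree of conditions: content checks first, then link checks."""
    if page.keyword_density > t.max_keyword_density:
        return "spam"        # content branch: keyword stuffing in the body
    if page.words_in_title > t.max_title_words:
        return "spam"        # content branch: stuffed title
    total_links = page.external_links + page.internal_links
    if total_links > 0 and page.external_links / total_links > t.max_external_ratio:
        return "spam"        # link branch: page is mostly outbound links
    return "non-spam"


def final_label(page: PageFeatures, t: Thresholds,
                votes: list[bool], min_votes: int = 5) -> str:
    """Let user feedback override the tree once enough votes are collected.

    Each vote is True when a user judged the page relevant to the query.
    Ties, or too few votes, fall back to the tree's own verdict.
    """
    if len(votes) >= min_votes:
        relevant = sum(votes)
        if relevant * 2 > len(votes):
            return "non-spam"
        if relevant * 2 < len(votes):
            return "spam"
    return tree_label(page, t)
```

For example, a page with `keyword_density=0.5` is labeled spam by `tree_label`, but `final_label` returns non-spam once six users have all marked it relevant. The vote threshold `min_votes` is likewise an assumed parameter.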
Pages: 105-110
Page count: 6