A Proactive Intelligent Decision Support System for Predicting the Popularity of Online News

Cited by: 119
Authors
Fernandes, Kelwin [1]
Vinagre, Pedro [2]
Cortez, Paulo [2]
Affiliations
[1] Univ Porto, INESC TEC Porto, P-4100 Oporto, Portugal
[2] Univ Minho, Algoritmi Res Ctr, Braga, Portugal
Source
PROGRESS IN ARTIFICIAL INTELLIGENCE-BK | 2015 / Vol. 9273
Keywords
Popularity prediction; Online news; Text mining; Classification; Stochastic local search
DOI
10.1007/978-3-319-23485-4_53
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the expansion of the Web, predicting the popularity of online news is becoming a popular research topic. In this paper, we propose a novel and proactive Intelligent Decision Support System (IDSS) that analyzes articles prior to their publication. Using a broad set of extracted features (e.g., keywords, digital media content, earlier popularity of news referenced in the article), the IDSS first predicts whether an article will become popular. Then, it optimizes a subset of the article's features that authors can more easily change, searching for an enhancement of the predicted popularity probability. Using a large and recently collected dataset, with 39,000 articles from the Mashable website, we performed a robust rolling windows evaluation of five state-of-the-art models. The best result was provided by a Random Forest with a discrimination power of 73%. Moreover, several stochastic hill climbing local searches were explored. When optimizing 1000 articles, the best optimization method obtained a mean gain improvement of 15 percentage points in terms of the estimated popularity probability. These results attest to the value of the proposed IDSS as a tool for online news authors.
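The optimization step described in the abstract can be illustrated with a minimal sketch of one simple stochastic hill climbing variant (pick a random neighbour, keep it only if the predicted probability improves). This is not the authors' implementation: the scoring function `predicted_popularity`, its weights, the feature names, and the bounds are all hypothetical stand-ins for a trained classifier and the article features that an author could change.

```python
import math
import random


def predicted_popularity(features):
    """Hypothetical stand-in for a trained classifier's popularity probability."""
    # Toy logistic model; these weights are illustrative, not taken from the paper.
    weights = {"num_keywords": 0.15, "num_images": 0.10, "title_words": -0.05}
    z = -1.0 + sum(weights[name] * features[name] for name in weights)
    return 1.0 / (1.0 + math.exp(-z))


def stochastic_hill_climb(features, editable, bounds, iterations=200, seed=0):
    """Randomly perturb one editable feature per step; keep only improvements."""
    rng = random.Random(seed)
    best = dict(features)  # copy, so the caller's article is left untouched
    best_score = predicted_popularity(best)
    for _ in range(iterations):
        candidate = dict(best)
        name = rng.choice(editable)
        low, high = bounds[name]
        # Move the chosen feature one unit up or down, clamped to its bounds.
        step = rng.choice([-1, 1])
        candidate[name] = min(high, max(low, candidate[name] + step))
        score = predicted_popularity(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


# Hypothetical article and the features its author is assumed able to edit.
article = {"num_keywords": 3, "num_images": 1, "title_words": 12}
editable = ["num_keywords", "num_images", "title_words"]
bounds = {"num_keywords": (1, 10), "num_images": (0, 20), "title_words": (4, 20)}

start = predicted_popularity(article)
improved, score = stochastic_hill_climb(article, editable, bounds)
print(f"before: {start:.3f}  after: {score:.3f}  suggested: {improved}")
```

Restricting the search to the `editable` list reflects the abstract's point that only features an author can readily change are optimized; in a real system the toy scorer would be replaced by the probability output of the trained model.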
Pages: 535-546
Number of pages: 12