Modelling and predicting news popularity

Cited by: 19
Authors
Hensinger, Elena [1]
Flaounas, Ilias [1]
Cristianini, Nello [1]
Affiliations
[1] Univ Bristol, Intelligent Syst Lab, Bristol, Avon, England
Keywords
News popularity; News appeal; Ranking support vector machines; Pattern recognition
DOI
10.1007/s10044-012-0314-6
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We explore the problem of learning to predict the popularity of an article in online news media. By "popular" we mean an article that was among the "most read" articles of a given day in the news outlet that published it. We show that this cannot be modelled simply as the binary classification task of separating popular from unpopular articles, since doing so assumes that popularity is an absolute property. Instead, we propose to view popularity from the perspective of a competitive situation in which the popular articles are those that were the most appealing on that particular day. This leads to the notion of an "appeal" function, which we model as a linear function in the bag-of-words representation. The parameters of this linear function are learnt from a training set formed by pairs of documents, one of which was popular and the other of which appeared on the same page and date without becoming popular. To learn the appeal function we use Ranking Support Vector Machines, with data collected from six different outlets over a period of 1 year. We show that our method can predict which articles will become popular, and can extract the keywords that most affect the appeal function. This also enables us to compare different outlets from the point of view of their readers' preference patterns. Remarkably, this is achieved using very limited information, namely the textual content of the title and description of each article, the page and date of publication, and whether it became popular.
Pages: 623-635
Number of pages: 13
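
The pairwise formulation described in the abstract can be illustrated with a minimal sketch. It assumes a whitespace tokenizer, a tiny invented set of title pairs, and plain subgradient descent on a regularised pairwise hinge loss standing in for a Ranking SVM solver; none of these details come from the paper, and all function names and data below are hypothetical.

```python
import random

def bag_of_words(text, vocab):
    """Count vector of `text` over a fixed vocabulary (whitespace tokens)."""
    tokens = text.lower().split()
    return [float(tokens.count(word)) for word in vocab]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pairwise_differences(pairs, vocab):
    """Turn (popular, unpopular) text pairs into difference vectors."""
    diffs = []
    for popular, unpopular in pairs:
        x_pop = bag_of_words(popular, vocab)
        x_unpop = bag_of_words(unpopular, vocab)
        diffs.append([a - b for a, b in zip(x_pop, x_unpop)])
    return diffs

def train_linear_ranker(diffs, epochs=200, lr=0.1, reg=0.01, seed=0):
    """Subgradient descent on reg/2 * |w|^2 + sum max(0, 1 - w.d).

    This is a simple stand-in for a Ranking SVM solver, not the
    solver used in the paper.
    """
    rng = random.Random(seed)
    w = [0.0] * len(diffs[0])
    for _ in range(epochs):
        order = list(range(len(diffs)))
        rng.shuffle(order)
        for i in order:
            d = diffs[i]
            violated = dot(w, d) < 1.0  # pair not yet ranked with margin 1
            for j in range(len(w)):
                grad = reg * w[j] - (d[j] if violated else 0.0)
                w[j] -= lr * grad
    return w

def appeal(w, text, vocab):
    """Linear appeal score of a text under weight vector w."""
    return dot(w, bag_of_words(text, vocab))

def top_keywords(w, vocab, k=3):
    """Words with the largest positive weight in the appeal function."""
    ranked = sorted(zip(w, vocab), reverse=True)
    return [word for _, word in ranked[:k]]

# Invented toy pairs: (popular title, unpopular title from the same page and date).
pairs = [
    ("royal wedding date announced", "council budget meeting postponed"),
    ("celebrity wedding photos leaked", "budget committee publishes report"),
    ("royal baby name revealed", "council report on parking"),
]
vocab = sorted({word for pair in pairs for text in pair for word in text.split()})
diffs = pairwise_differences(pairs, vocab)
w = train_linear_ranker(diffs)

print(top_keywords(w, vocab))
print(appeal(w, "royal wedding", vocab) > appeal(w, "council budget", vocab))
```

Training on difference vectors rather than on individual articles is what encodes the competitive view of popularity: the model only needs to score the popular article above the one it competed with on the same page and day, not assign either an absolute label.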