Introduction-Topic models: What they are and why they matter

Cited by: 274
Authors
Mohr, John W. [1]
Bogdanov, Petko [2]
Affiliations
[1] Univ Calif Santa Barbara, Dept Sociol, Santa Barbara, CA 93106 USA
[2] Univ Calif Santa Barbara, Dept Comp Sci, Santa Barbara, CA 93106 USA
Keywords
TEXT
DOI
10.1016/j.poetic.2013.10.001
Chinese Library Classification (CLC) number
I [Literature]
Subject classification code
05
Abstract
We provide a brief, non-technical introduction to the text mining methodology known as "topic modeling." We summarize the theory and background of the method and discuss what kinds of things are found by topic models. Using a text corpus composed of the eight articles from the special issue of Poetics on the subject of topic models, we run a topic model on these articles, both as a way to introduce the methodology and to help summarize some of the ways in which social and cultural scientists are using topic models. We review some of the critiques and debates over the use of the method and, finally, link these developments back to some of the original innovations in the field of content analysis that were pioneered by Harold D. Lasswell and colleagues during and just after World War II. © 2013 Published by Elsevier B.V.
Pages: 545-569
Page count: 25