Author Identification Using Latent Dirichlet Allocation

Authors
Calvo, Hiram [1, 2]
Hernandez-Castaneda, Angel [1]
Garcia-Flores, Jorge [2]
Affiliations
[1] IPN, Ctr Comp Res CIC, Ave JD Batiz E MO Mendizabal, Mexico City 07738, DF, Mexico
[2] Univ Paris 13, Lab Informat Paris Nord, CNRS, UMR 7030, Sorbonne Paris Cite, F-93430 Villetaneuse, France
DOI
10.1007/978-3-319-77116-8_22
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We tackle the PAN 2015 author identification task with a Latent Dirichlet Allocation (LDA) model. This method takes into account the vocabulary and the context of words at the same time and, through a statistical process, estimates to what extent the relations between words hold in each document. Processing a set of documents with LDA returns a set of topic distributions; each distribution can be seen as a feature vector and as a fingerprint of its document within the collection. We then applied a Naive Bayes classifier to the obtained patterns, with varying performance. We obtained state-of-the-art performance for English, surpassing the best FS score reported in PAN 2015, while obtaining mixed results for the other languages.
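The pipeline described in the abstract (LDA topic distributions used as per-document feature vectors, followed by a Naive Bayes classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the collapsed Gibbs sampler, the Gaussian form of Naive Bayes, the hyperparameters, the toy documents, and the labels `author_a` and `author_b` are all assumptions made for illustration. It also fits LDA on the labeled and unlabeled documents together, which may differ from the setup used for PAN 2015.

```python
import math
import random
from collections import defaultdict


def lda_topic_vectors(docs, num_topics=2, iters=100, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns one topic distribution per document."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]            # topic counts per document
    topic_word = [defaultdict(int) for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                            # token counts per topic
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = [rng.randrange(num_topics) for _ in doc]
        for w, k in zip(doc, zs):
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assign[d][i]
                # Remove the token, then resample its topic from the conditional.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    # Smoothed, normalized topic counts: the "fingerprint" of each document.
    return [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]


def fit_gaussian_nb(vectors, labels, var_floor=1e-3):
    """Per-class log prior, feature means, and floored feature variances."""
    model = {}
    for label in set(labels):
        rows = [v for v, y in zip(vectors, labels) if y == label]
        cols = list(zip(*rows))
        means = [sum(col) / len(rows) for col in cols]
        variances = [
            sum((x - m) ** 2 for x in col) / len(rows) + var_floor
            for col, m in zip(cols, means)
        ]
        model[label] = (math.log(len(rows) / len(labels)), means, variances)
    return model


def predict(model, vector):
    """Return the label with the highest Gaussian Naive Bayes log posterior."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for x, m, v in zip(vector, means, variances)
        )
    return max(model, key=log_posterior)


# Hypothetical toy corpus: four labeled documents plus one of unknown authorship.
docs = [
    "the ship sailed the sea and the sea was calm".split(),
    "a calm sea and a ship at the harbour".split(),
    "the model learns topics from word counts".split(),
    "topics and word counts train the model".split(),
    "the ship left the harbour for the sea".split(),
]
labels = ["author_a", "author_a", "author_b", "author_b"]

topic_vectors = lda_topic_vectors(docs)
model = fit_gaussian_nb(topic_vectors[:4], labels)
predicted = predict(model, topic_vectors[4])
print(topic_vectors[4], predicted)
```

Each row of `topic_vectors` sums to 1, so it can be read directly as a distribution over topics. With a corpus this small the predicted label is not meaningful; the sketch only shows how the two stages fit together.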
Pages: 303-312
Page count: 10