Author Identification based on Word Distribution in Word Space

Authors
Ganesh, Barathi H. B. [1]
Reshma, U. [1]
Kumar, Anand M. [1]
Affiliations
[1] Amrita Vishwa Vidyapeetham, Ctr Excellence Computat Engn & Networking, Coimbatore 641112, Tamil Nadu, India
Keywords
Author attribution; Random forest tree; Logistic Regression; Support Vector Machine; PAN Author Identification 2014
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Author attribution has become an increasingly challenging area over the past decade. It is now an essential task in many sectors, such as forensic analysis, law, and journalism, because it helps identify the author of a document. Here, unigram/bigram features together with latent semantic features from word space were extracted, and the similarity of a given document was tested using Random forest tree, Logistic Regression, and Support Vector Machine classifiers in order to create a global model. The dataset from the PAN Author Identification shared task 2014 was used. The proposed model was observed to achieve a state-of-the-art accuracy of 80%, which is significantly higher than the results reported for the PAN Author Identification task in 2014.
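As a rough illustration of the unigram/bigram part of the approach described in the abstract, the following minimal sketch counts word unigrams and bigrams for two documents and compares them. It is not the authors' implementation: the function names `ngram_counts` and `cosine_similarity` are hypothetical, the choice of cosine similarity is an assumption (the abstract does not name a similarity measure), and the latent semantic features and the three classifiers are omitted entirely.

```python
from collections import Counter
from math import sqrt


def ngram_counts(text, n_values=(1, 2)):
    """Count word unigrams and bigrams in a lower-cased, whitespace-split text."""
    tokens = text.lower().split()
    counts = Counter()
    for n in n_values:
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors stored as Counters."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy documents (invented for illustration, not taken from the PAN 2014 data).
known = ngram_counts("the old house stood on the hill")
same_style = ngram_counts("the old barn stood on the hill")
different = ngram_counts("quarterly revenue exceeded analyst forecasts")

sim_close = cosine_similarity(known, same_style)
sim_far = cosine_similarity(known, different)
print(sim_close > sim_far)
```

In a verification setting such as the one the abstract describes, a score like this could plausibly serve as one input feature to a classifier that decides whether two documents share an author; that step is not shown here.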
Pages: 1519-1523
Page count: 5