Authorship Classification: A Discriminative Syntactic Tree Mining Approach

Cited by: 0
Authors
Kim, Sangkyum [1]
Kim, Hyungsul [1]
Weninger, Tim [1]
Han, Jiawei [1]
Kim, Hyun Duk [1]
Affiliation
[1] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
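Funding
National Science Foundation (USA)
Keywords
Authorship Attribution; Text Mining; Text Categorization; Authorship Discrimination; Authorship Classification; Attribution; Identification
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract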
Dozens of studies have addressed automatic authorship classification, and many concluded that writing style is one of the best indicators of original authorship. Of the hundreds of features that have been developed, syntactic features best reflect an author's writing style. However, because extracting and computing syntactic features is computationally expensive, only simple variations of basic syntactic features, such as function words, POS (part-of-speech) tags, and rewrite rules, have been considered. In this paper, we propose a new feature set of k-embedded-edge subtree patterns that holds more syntactic information than previous feature sets. We also propose a novel approach that mines these patterns directly from a given set of syntactic trees. We show that this approach reduces the computational burden of using complex syntactic structures as the feature set. Comprehensive experiments on real-world datasets demonstrate that our approach is reliable and more accurate than those of previous studies.
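For orientation, the rewrite-rule features that the abstract names as a prior baseline can be sketched as follows. This is a minimal illustration, not the paper's k-embedded-edge subtree mining method. The nested-tuple tree encoding, the function names `rewrite_rules` and `rule_counts`, and the toy sentence are all assumptions made for this example.

```python
from collections import Counter

# Assumed encoding (not from the paper): a parse tree is a nested tuple
# (label, child, child, ...), and a word is a plain string.

def rewrite_rules(tree):
    """Yield one 'PARENT -> CHILD ...' rule per node that has subtree children."""
    label, *children = tree
    subtrees = [c for c in children if isinstance(c, tuple)]
    if subtrees:  # skip preterminals such as ("DT", "the")
        yield f"{label} -> {' '.join(c[0] for c in subtrees)}"
    for child in subtrees:
        yield from rewrite_rules(child)

def rule_counts(trees):
    """Count rewrite rules over a collection of trees (a bag-of-rules feature vector)."""
    return Counter(rule for tree in trees for rule in rewrite_rules(tree))

# Hypothetical toy parse of "the author wrote a letter".
toy_tree = (
    "S",
    ("NP", ("DT", "the"), ("NN", "author")),
    ("VP", ("VBD", "wrote"), ("NP", ("DT", "a"), ("NN", "letter"))),
)

counts = rule_counts([toy_tree])
for rule, n in sorted(counts.items()):
    print(rule, n)
```

Each rule covers only a parent and its immediate children, which is why the abstract describes such features as simple; a subtree pattern can span several levels of the tree at once.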
Pages: 455-464
Page count: 10