An improved algorithm for unsupervised decomposition of a multi-author document

Cited by: 3
Authors
Giannella, Chris [1 ]
Affiliation
[1] Mitre Corp, Human Language Technol Dept, 7515 Colshire Dr, Mclean, VA 22102 USA
Keywords
natural language processing; machine learning
DOI
10.1002/asi.23375
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This article addresses the problem of unsupervised decomposition of a multi-author text document: identifying the sentences written by each author assuming the number of authors is unknown. An approach, BayesAD, is developed for solving this problem: apply a Bayesian segmentation algorithm, followed by a segment clustering algorithm. Results are presented from an empirical comparison between BayesAD and AK, a modified version of an approach published by Akiva and Koppel in 2013. BayesAD exhibited greater accuracy than AK in all experiments. However, BayesAD has a parameter that needs to be set and which had a nontrivial impact on accuracy. Developing an effective method for eliminating this need would be a fruitful direction for future work. When controlling for topic, the accuracy levels of BayesAD and AK were, in all but one case, worse than a baseline approach wherein one author was assumed to write all sentences in the input text document. Hence, room for improved solutions exists.
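The single-author baseline mentioned in the abstract can be illustrated with a short sketch. This is not the paper's evaluation code: the function names, the toy labels, and the many-to-one (purity-style) scoring rule are assumptions made for illustration, since the abstract does not state which accuracy measure was used. Under that assumed measure, the baseline's accuracy is simply the share of sentences written by the most prolific author.

```python
from collections import Counter, defaultdict


def single_author_baseline_accuracy(true_authors):
    """Score the baseline that attributes every sentence to one author.

    Assumption: accuracy is the fraction of sentences whose true author
    matches the single assumed author, taking the best possible choice
    (the author with the most sentences).
    """
    if not true_authors:
        raise ValueError("need at least one sentence")
    counts = Counter(true_authors)
    return max(counts.values()) / len(true_authors)


def many_to_one_accuracy(true_authors, predicted_clusters):
    """Hypothetical purity-style score for a sentence clustering.

    Each predicted cluster is mapped to the true author who wrote most of
    its sentences; the score is the fraction of sentences that agree with
    that mapping. The paper may use a different measure.
    """
    if len(true_authors) != len(predicted_clusters):
        raise ValueError("label sequences must have the same length")
    if not true_authors:
        raise ValueError("need at least one sentence")
    per_cluster = defaultdict(Counter)
    for author, cluster in zip(true_authors, predicted_clusters):
        per_cluster[cluster][author] += 1
    matched = sum(max(c.values()) for c in per_cluster.values())
    return matched / len(true_authors)


# Toy document: four sentences, three by author "A" and one by author "B".
truth = ["A", "A", "B", "A"]
baseline = single_author_baseline_accuracy(truth)
clustered = many_to_one_accuracy(truth, [0, 0, 1, 0])
```

With an uneven split between authors, the baseline is already high, which is one reason a decomposition method can fall below it.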
Pages: 400-411
Number of pages: 12