Feature Words of Moves in Scientific Abstracts

Cited by: 1
Authors
Hashimoto, Kiyota [1]
Soonklang, Tasanawan [2]
Hirokawa, Sachio [3]
Affiliations
[1] Prince Songkla Univ, Hat Yai, Thailand
[2] Silpakorn Univ, Bangkok, Thailand
[3] Kyushu Univ, Fukuoka 812, Japan
Source
PROCEEDINGS 2016 5TH IIAI INTERNATIONAL CONGRESS ON ADVANCED APPLIED INFORMATICS IIAI-AAI 2016 | 2016
Keywords
ARTICLES;
DOI
10.1109/IIAI-AAI.2016.38
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Extracting structure from text is a key issue in text mining. The rhetorical move structure of scientific articles is useful for assisting reading and writing. In this paper, we classify the move structure of research article abstracts using a small number of characteristic words that determine five moves: background (B), purpose (P), method (M), result (R), and discussion (D). Eleven measures were introduced and used to select feature words for each move. An exhaustive parameter search was conducted to find the optimal combination of measure and number of features. We applied a support vector machine and evaluated it with 10-fold cross-validation. The accuracies with optimal feature selection are 0.9022, 0.8322, 0.8442, 0.8820, and 0.8354 for B, P, M, R, and D, respectively. These are 10% better than the baseline that uses all keywords. Surprisingly, the study found that negative feature words play a central role in improving prediction performance.
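The feature-selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the eleven measures, so the score used here (the difference in document-frequency rate between sentences of the target move and all other sentences) is an assumption chosen only for illustration, as are the function name `select_feature_words` and the toy sentences. The sketch covers only the selection of positive and negative feature words for one move; the support vector machine and 10-fold cross-validation are not reproduced.

```python
from collections import Counter


def select_feature_words(sentences, labels, target, n):
    """Return the n most positive and n most negative words for one move.

    The scoring measure is an assumption for illustration, not one of the
    eleven measures from the paper: rate of sentences containing the word
    in the target move minus the same rate in all other moves.
    """
    pos = [set(s.lower().split()) for s, l in zip(sentences, labels) if l == target]
    neg = [set(s.lower().split()) for s, l in zip(sentences, labels) if l != target]
    if not pos or not neg:
        raise ValueError("need at least one sentence inside and outside the target move")

    # Document frequency: number of sentences in which each word occurs.
    pos_df = Counter(w for doc in pos for w in doc)
    neg_df = Counter(w for doc in neg for w in doc)
    vocab = set(pos_df) | set(neg_df)

    score = {w: pos_df[w] / len(pos) - neg_df[w] / len(neg) for w in vocab}

    # Positive features indicate the target move; negative features indicate
    # that a sentence belongs to some other move. Ties are broken alphabetically.
    positive = [w for w in sorted(vocab, key=lambda w: (-score[w], w)) if score[w] > 0][:n]
    negative = [w for w in sorted(vocab, key=lambda w: (score[w], w)) if score[w] < 0][:n]
    return positive, negative, score


# Hypothetical toy data: two purpose (P) sentences and two result (R) sentences.
sentences = [
    "we propose a method to classify moves",
    "in this paper we propose a classifier",
    "the results show high accuracy",
    "the accuracy was high in the results",
]
labels = ["P", "P", "R", "R"]

positive, negative, score = select_feature_words(sentences, labels, target="P", n=2)
print("positive feature words for P:", positive)
print("negative feature words for P:", negative)
```

In a fuller pipeline, the selected words would define the feature vector passed to the classifier, and the measure and the value of `n` would be chosen by searching over candidates, as the abstract describes.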
Pages: 144-149
Number of pages: 6