Named Entity Recognition for Microblog Text (面向微博文本的命名实体识别)

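Cited by: 12
Authors
Jiang Renhui (姜仁会)
Wang Ting (王挺)
Tang Jintao (唐晋韬)
Affiliation
[1] Department of Computer Science and Technology, College of Computer, National University of Defense Technology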
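Keywords
named entity recognition; microblog; short text
DOI
Not available
Chinese Library Classification (CLC) number
TP391.1 [Text information processing]
Subject classification codes
081203; 0835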
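Abstract
Named entity recognition is an important foundation of text information processing and a key technology in natural language processing. In recent years, microblogs have rapidly become a platform for information exchange, and microblog text has become a new source for named entity extraction. Exploiting the characteristics of microblog content and structure, this paper proposes a named entity recognition method that combines statistics with rules. Microblog posts are short and contain tags, topics, and similar elements. Taking these characteristics into account, the paper computes word frequency statistics over microblog comments and reposts, filters the results with rules, and thereby completes named entity recognition. Experiments on Sina Weibo data show that the method can effectively improve named entity recognition performance on microblogs.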
Pages: 647-651
Page count: 5
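
The abstract describes the approach only at a high level: frequency statistics over comments and reposts, followed by rule-based filtering. The following is a minimal sketch of that general idea, not the paper's actual algorithm. Everything concrete in it is an assumption made for illustration: taking `#topic#` markers as the source of candidates, the `STOPWORDS` set, the `min_freq` and `max_len` thresholds, and all function names.

```python
import re
from collections import Counter

# Assumption: Sina Weibo topics are written as #topic#; used here as candidate entities.
TOPIC_RE = re.compile(r"#([^#]+)#")
# Hypothetical stop list of platform words that should never count as entities.
STOPWORDS = {"转发", "微博"}


def candidate_entities(post):
    """Return the distinct topic strings found in a post, in order of appearance."""
    topics = (t.strip() for t in TOPIC_RE.findall(post))
    return list(dict.fromkeys(t for t in topics if t))


def count_in_context(candidates, context_texts):
    """Count how often each candidate occurs across comments and reposts."""
    counts = Counter()
    for text in context_texts:
        for cand in candidates:
            counts[cand] += text.count(cand)
    return counts


def passes_rules(cand, freq, min_freq=2, max_len=8):
    """Illustrative rules: frequent enough, short enough, not a stop word."""
    return freq >= min_freq and len(cand) <= max_len and cand not in STOPWORDS


def recognize(post, comments, reposts, min_freq=2):
    """Keep the candidates whose frequency in comments and reposts passes the rules."""
    cands = candidate_entities(post)
    counts = count_in_context(cands, list(comments) + list(reposts))
    return [c for c in cands if passes_rules(c, counts[c], min_freq)]


if __name__ == "__main__":
    post = "#姚明# 今天出席了 #转发# 活动"
    comments = ["姚明太高了", "支持姚明"]
    reposts = ["转发 转发", "姚明!"]
    print(recognize(post, comments, reposts))
```

A real system would presumably draw candidates from word segmentation of the post rather than from topic markers alone, and would use a richer rule set. The sketch only shows how counts from comments and reposts can supply evidence that a single short post lacks.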