A step further towards a consensus on linking tweets to Wikipedia

Cited by: 1
Authors
Nait-Hamoud, Mohamed Cherif [1, 2]
Lahfa, Fedoua [1]
Ennaji, Abdellatif [3]
Affiliations
[1] Univ Abou Bekr Belkaid Tlemcen, Dept Sci Comp, BP 13000, Tilimsen, Algeria
[2] Univ Larbi Tebessi, Dept Math & Sci Comp, BP 12000, Tebessa, Algeria
[3] Univ Rouen Normandie, EA 4108, LITIS Lab, Rouen, France
Keywords
Information extraction; Tweet entity linking; Topic extraction; Wikification; Disambiguation
DOI
10.1007/s12065-020-00549-8
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The study of contemporary tweet-based Entity Linking (EL) systems reveals the lack of a standard definition of, and a consensus on, the task. Specifically, identifying what should be annotated in texts remains a recurring question. This prevents proper design and fair evaluation of EL systems. To tackle this issue, the present paper introduces a set of rules intended to define the EL task for tweets. We evaluated the effectiveness of the proposed rules by developing TELS, an end-to-end supervised system that links tweets to Wikipedia. The experiments conducted on five publicly available datasets show that our system outperforms the baselines with an improvement, in terms of overall macro F1-score (micro F1-score), ranging from 25.04% (7.32%) up to 35.36% (42.03%). Moreover, feature analysis reveals that when the annotation is not limited to very few entity types, the proposed rules more efficiently capture annotators' tacit agreements from the datasets. Consequently, the proposed rules constitute a step further towards a consensus on the EL task.
Pages: 1825-1840
Number of pages: 16