Deep Learning As an Aid to Text Mining in the Choice of Texts to Lemmatise for a Comparison Corpus: A Stylistic Study of Peter Damian's Letters

Cited by: 0
Authors
Thon, Valerie [1 ]
Vanni, Laurent [2 ]
Longree, Dominique [3 ]
Affiliations
[1] ULiege, UParis Cite, LASLA, CERILAC, Liege, Belgium
[2] UCA, CNRS, UMR7320, BCL, Nice, France
[3] ULiege, LASLA, Liege, Belgium
Source
NEW FRONTIERS IN TEXTUAL DATA ANALYSIS, JADT 2022 | 2024
Keywords
Prediction; Lemmatisation; Morphosyntax;
DOI
10.1007/978-3-031-55917-4_14
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Lemmatising and morphosyntactically labelling a Latin text is a time-consuming process. Focusing in this contribution on the epistolary corpus of Peter Damian (eleventh century), an ecclesiastical author of 180 Latin letters, we cross intertextual distance calculation (Brunet and Jaccard) and a deep learning model trained on authorship classification on a selection of unlemmatised texts from 39 of his literary predecessors; the idea is to theoretically identify which text(s) share a similar style to Peter, and would therefore be suitable candidates for a precise lemmatisation. A dialogue between both methods seems promising, and the areas of activation in the deep learning model even suggest a recognition of complex linguistic patterns that Peter possibly shares with some of his predecessors.
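As a rough illustration of the Jaccard side of the intertextual distance calculation mentioned in the abstract, the sketch below compares the vocabularies of unlemmatised texts and ranks candidates by their distance to a target text. This record does not describe the authors' implementation, so the tokenisation rule, the function names (`vocabulary`, `jaccard_distance`, `rank_candidates`) and the use of raw word forms are all assumptions made for illustration. The Brunet distance and the deep learning model are not covered here.

```python
import re


def vocabulary(text: str) -> set[str]:
    """Return the set of lower-cased word forms in a text (no lemmatisation)."""
    # Assumption: a "word" is any run of Unicode letters.
    return set(re.findall(r"[^\W\d_]+", text.lower()))


def jaccard_distance(text_a: str, text_b: str) -> float:
    """Jaccard distance between two vocabularies: 1 - |A ∩ B| / |A ∪ B|."""
    vocab_a, vocab_b = vocabulary(text_a), vocabulary(text_b)
    union = vocab_a | vocab_b
    if not union:
        # Two empty texts are treated as identical.
        return 0.0
    return 1.0 - len(vocab_a & vocab_b) / len(union)


def rank_candidates(target: str, candidates: dict[str, str]) -> list[tuple[str, float]]:
    """Sort candidate texts by increasing Jaccard distance to the target text."""
    scores = [(name, jaccard_distance(target, text)) for name, text in candidates.items()]
    return sorted(scores, key=lambda pair: pair[1])
```

Under these assumptions, calling `rank_candidates` with a letter as `target` and a dictionary mapping a label to each predecessor text lists the lexically closest texts first. Such a ranking could then be set against the predictions of an authorship classifier, in the spirit of the dialogue between methods that the abstract describes.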
Pages: 173-184
Number of pages: 12