Adding value to, and extracting of value from, a signed language corpus through secondary processing: implications for annotation schemas and corpus creation

Authors
Johnston, Trevor [1]
Affiliations
[1] Macquarie Univ, Sydney, NSW 2109, Australia
Keywords
DOI
Not available
Chinese Library Classification number
H [Language and Writing];
Discipline classification code
05;
Abstract
A basic signed language (SL) corpus is created through primary processing of video recordings using multi-media annotation software. Primary processing entails the tokenization and identification of SL units. For the purposes of linguistic research a corpus also needs secondary processing. Secondary processing entails appending tags for specific linguistic features to primary annotations. I draw on the experience from the Auslan corpus project to describe how primary and secondary processing can be used in corpus-based SL research. In particular, I show how the tier structure of ELAN can be used to tag SL units in a variety of ways, and how this information can be used to glean new information from the corpus which can then be added as new annotations to the corpus. Value-adding by principled and systematic primary and secondary processing of digital recordings is thus not only essential for corpus creation ('machine-readability'), it also enables further enriching of the corpus so that even more value can be extracted. I conclude by discussing the implications for annotation software and standardized annotation schemas used in the creation of SL corpora.
Pages: A137-A142
Page count: 6