Automatic terminological collocations extraction from large corpus

Authors
Suarez, Octavio Santana [1]
Aguiar, Jose Perez [1]
Berriel, Isabel Sanchez [2]
Rodriguez, Virginia Gutierrez [2]
Affiliations
[1] Univ Las Palmas Gran Canaria, Edificio Dept Informat & Matemat, Las Palmas Gran Canaria 35017, Spain
[2] Univ La Laguna, Edificio Fis & Matemat,Campus Univ Anchieta, San Cristobal la Laguna 38271, Spain
Source
Keywords
automatic extraction of collocations; terminology; computational linguistics; text mining
DOI
Not available
Chinese Library Classification number
H0 [Linguistics]
Discipline classification codes
030303; 0501; 050102
Abstract
Automatic term extraction systems are an important tool for compiling the lexemes that belong to a specific field or specialty. The textual analysis performed by this type of software must include strategies that can detect collocations in the field under study. This work examines the viability of using large text corpora that contain no linguistic information, such as corpora compiled from the Internet. The Internet is used as a source of information for compiling terminological collocations. To that end, the behavior of different frequency-based indicators is analyzed for a collection of economic terms in a Spanish corpus of 300,000 words.
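The abstract does not name the frequency-based indicators it evaluates. As a minimal sketch of what one such indicator might look like, the example below scores adjacent word pairs by pointwise mutual information (PMI), a widely used association measure computed from raw frequencies alone. The choice of PMI, the function name `pmi_scores`, the `min_count` threshold, and the toy Spanish sentence are all assumptions made for illustration; none of them comes from the paper.

```python
import math
from collections import Counter


def pmi_scores(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    Illustrative only: PMI is an assumed example of a frequency-based
    indicator, not necessarily one used by the authors.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)

    scores = {}
    for (w1, w2), count in bigrams.items():
        # Skip rare pairs: PMI is unreliable for low counts.
        if count < min_count:
            continue
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Hypothetical toy sample with no linguistic annotation (not the paper's corpus).
text = ("el tipo de interés sube y el tipo de interés baja "
        "cuando el banco central fija el tipo de cambio")
tokens = text.split()

scores = pmi_scores(tokens)
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
for pair, score in ranked:
    print(" ".join(pair), round(score, 2))
```

Because the input is a plain token list, the sketch needs no part-of-speech tags or parsing, which matches the setting of an unannotated corpus collected from the web. A real system would likely add stop-word filtering and compare several measures rather than rely on PMI alone.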
Pages: 145-152
Page count: 8