Automatic terminological collocations extraction from large corpus

Cited by: 0
Authors
Suarez, Octavio Santana [1]
Aguiar, Jose Perez [1]
Berriel, Isabel Sanchez [2]
Rodriguez, Virginia Gutierrez [2]
Affiliations
[1] Univ Las Palmas Gran Canaria, Edificio Dept Informat & Matemat, Las Palmas Gran Canaria 35017, Spain
[2] Univ La Laguna, Edificio Fis & Matemat, Campus Univ Anchieta, San Cristobal la Laguna 38271, Spain
Keywords
automatic extraction of collocations; terminology; computational linguistics; text mining
DOI
Not available
Chinese Library Classification number
H0 [Linguistics]
Discipline classification codes
030303; 0501; 050102
Abstract
Automatic term extraction systems are an important tool for compiling the lexemes of a specific field or specialty. The textual analysis performed by this type of software must include strategies for detecting the collocations of the field in question. This paper studies the viability of using large text corpora that contain no linguistic annotation, such as those that can be compiled from the Internet, which serves here as a source of information for gathering terminological collocations. To that end, it analyzes the behavior of several frequency-based indicators for a collection of economic terms in a Spanish corpus of 300,000 words.
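The abstract does not say which frequency-based indicators the paper evaluates. As an illustration only, the sketch below computes two commonly used ones, pointwise mutual information (PMI) and the Dice coefficient, for adjacent word pairs in raw, unannotated text. The function names `tokenize` and `collocation_scores`, the `min_count` threshold, and the choice of these two measures are assumptions made for this example, not the authors' method.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split raw text into lowercase alphabetic tokens (accented letters included)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def collocation_scores(tokens, min_count=2):
    """Score adjacent word pairs using only corpus frequencies.

    Hypothetical helper: returns {(w1, w2): {"freq", "pmi", "dice"}} for
    every bigram seen at least `min_count` times. No linguistic annotation
    is used, and bigrams may cross sentence boundaries in this sketch.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_tokens = len(tokens)
    n_bigrams = max(n_tokens - 1, 1)

    scores = {}
    for (w1, w2), f12 in bigrams.items():
        if f12 < min_count:
            continue  # rare pairs give unreliable association scores
        f1, f2 = unigrams[w1], unigrams[w2]
        # PMI: log2 of observed pair probability over the probability
        # expected if the two words occurred independently.
        p12 = f12 / n_bigrams
        p1, p2 = f1 / n_tokens, f2 / n_tokens
        pmi = math.log2(p12 / (p1 * p2))
        # Dice: share of the two words' occurrences accounted for by the pair.
        dice = 2 * f12 / (f1 + f2)
        scores[(w1, w2)] = {"freq": f12, "pmi": pmi, "dice": dice}
    return scores


if __name__ == "__main__":
    sample = ("tipo de interés sube. el tipo de interés baja. "
              "el banco sube el tipo de cambio")
    ranked = sorted(collocation_scores(tokenize(sample)).items(),
                    key=lambda item: item[1]["dice"], reverse=True)
    for pair, stats in ranked:
        print(pair, stats)
```

On a real corpus, candidates would normally be filtered further (for example with a stop-word list) before ranking, since purely frequency-based scores also surface pairs made of function words.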
Pages: 145-152
Page count: 8