Quality of word and concept embeddings in targetted biomedical domains

Cited by: 0
Authors
Giancani, Salvatore [1, 2, 3]
Albertoni, Riccardo [3]
Catalano, Chiara Eva [3]
Affiliations
[1] CNRS, Inst Neurosci Timone, Unite Mixte Rech 7289, 27 Blvd Jean Moulin, F-13385 Marseille 05, France
[2] Aix Marseille Univ, Fac Med, 27 Blvd Jean Moulin, F-13385 Marseille 05, France
[3] CNR, Ist Matemat Applicata & Tecnol Informat, Via Marini 16, I-16149 Genoa, Italy
Keywords
Embedding; Quality; UMLS; Coverage; Chronic obstructive pulmonary disease; SYSTEM; RELATEDNESS; SIMILARITY; TEXT
DOI
10.1016/j.heliyon.2023.e16818
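Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09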
Abstract
Embeddings are fundamental resources often reused for building intelligent systems in the biomedical context. As a result, evaluating the quality of previously trained embeddings and ensuring they cover the desired information is critical for the success of applications. This paper proposes a new evaluation methodology to test the coverage of embeddings against a targeted domain of interest. It defines measures to assess the terminology, similarity, and analogy coverage, which are core aspects of the embeddings. Then, it discusses the experimentation carried out on existing biomedical embeddings in the specific context of pulmonary diseases. The proposed methodology and measures are general and may be applied to any application domain.
Pages: 15
Related papers
50 in total
  • [21] Combining word embeddings to extract chemical and drug entities in biomedical literature
    Lopez-Ubeda, Pilar
    Diaz-Galiano, Manuel Carlos
    Urena-Lopez, L. Alfonso
    Martin-Valdivia, M. Teresa
    BMC BIOINFORMATICS, 2021, 22 (SUPPL 1)
  • [22] Vecsigrafo: Corpus-based word-concept embeddings
    Denaux, Ronald
    Manuel Gomez-Perez, Jose
    SEMANTIC WEB, 2019, 10 (05) : 881 - 908
  • [23] Concept Mover's Distance: measuring concept engagement via word embeddings in texts
    Stoltz, Dustin S.
    Taylor, Marshall A.
    JOURNAL OF COMPUTATIONAL SOCIAL SCIENCE, 2019, 2 (02) : 293 - 313
  • [24] Concept Mover’s Distance: measuring concept engagement via word embeddings in texts
    Dustin S. Stoltz
    Marshall A. Taylor
    Journal of Computational Social Science, 2019, 2 : 293 - 313
  • [25] Biomedical Event Trigger Detection Based on Hybrid Methods Integrating Word Embeddings
    Li, Lishuang
    Qin, Meiyue
    Huang, Degen
    KNOWLEDGE GRAPH AND SEMANTIC COMPUTING: SEMANTIC, KNOWLEDGE, AND LINKED BIG DATA, 2016, 650 : 67 - 79
  • [26] Augmenting Word Embeddings through External Knowledge-base for Biomedical Application
    Jha, Kishlay
    Xun, Guangxu
    Gopalakrishnan, Vishrawas
    Zhang, Aidong
    2017 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2017, : 1965 - 1974
  • [27] Word embeddings and external resources for answer processing in biomedical factoid question answering
    Dimitriadis, Dimitris
    Tsoumakas, Grigorios
    JOURNAL OF BIOMEDICAL INFORMATICS, 2019, 92
  • [28] Clinical Concept Normalization on Medical Records Using Word Embeddings and Heuristics
    Silva, Joao Figueira
    Antunes, Rui
    Rafael Almeida, Joao
    Matos, Sergio
    DIGITAL PERSONALIZED HEALTH AND MEDICINE, 2020, 270 : 93 - 97
  • [29] Refined Global Word Embeddings Based on Sentiment Concept for Sentiment Analysis
    Wang, Yabing
    Huang, Guimin
    Li, Jun
    Li, Hui
    Zhou, Ya
    Jiang, Hua
    IEEE ACCESS, 2021, 9 : 37075 - 37085
  • [30] Measuring Gender Bias in Word Embeddings across Domains and Discovering New Gender Bias Word Categories
    Chaloner, Kaytlin
    Maldonado, Alfredo
    GENDER BIAS IN NATURAL LANGUAGE PROCESSING (GEBNLP 2019), 2019, : 25 - 32