DeepMeSH: deep semantic representation for improving large-scale MeSH indexing

Cited by: 84
Authors
Peng, Shengwen [1, 2]
You, Ronghui [1, 2]
Wang, Hongning [3]
Zhai, Chengxiang [4]
Mamitsuka, Hiroshi [5, 6]
Zhu, Shanfeng [1, 2, 7]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai 200433, Peoples R China
[2] Fudan Univ, Shanghai Key Lab Intelligent Informat Proc, Shanghai 200433, Peoples R China
[3] Univ Virginia, Dept Comp Sci, Charlottesville, VA 22904 USA
[4] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
[5] Kyoto Univ, Inst Chem Res, Bioinformat Ctr, Uji 6110011, Japan
[6] Aalto Univ, Dept Comp Sci, Espoo, Finland
[7] Fudan Univ, Ctr Computat Syst Biol, Shanghai 200433, Peoples R China
Funding
US National Science Foundation; National Natural Science Foundation of China; US National Institutes of Health;
Keywords
LIBRARY;
DOI
10.1093/bioinformatics/btw294
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Motivation: Medical Subject Headings (MeSH) indexing, the task of assigning a set of MeSH main headings to each citation, is crucial for many important tasks in biomedical text mining and information retrieval. Large-scale MeSH indexing is challenging on two sides: the citation side and the MeSH side. On the citation side, all existing methods, including the Medical Text Indexer (MTI) of the National Library of Medicine and the state-of-the-art method, MeSHLabeler, represent text as a bag of words, which cannot capture semantic and context-dependent information well.
Methods: We propose DeepMeSH, which incorporates deep semantic information for large-scale MeSH indexing and addresses the challenges on both the citation side and the MeSH side. The citation-side challenge is addressed by a new deep semantic representation, D2V-TFIDF, which concatenates sparse and dense semantic representations. The MeSH-side challenge is addressed by the 'learning to rank' framework of MeSHLabeler, which integrates various types of evidence generated from the new semantic representation.
Results: On BioASQ3 challenge data with 6000 citations, DeepMeSH achieved a Micro F-measure of 0.6323, about 2% higher than the 0.6218 of MeSHLabeler and 12% higher than the 0.5637 of MTI.
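For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of a micro-averaged F-measure over multi-label predictions, together with the relative-gain arithmetic behind the reported percentages. The function names (`micro_f_measure`, `relative_gain`) and the toy label sets are illustrative assumptions, not taken from the paper or from the BioASQ evaluation code, which may differ in detail.

```python
from typing import Iterable, Set


def micro_f_measure(gold: Iterable[Set[str]], predicted: Iterable[Set[str]]) -> float:
    """Micro-averaged F-measure: pool true/false positives and false
    negatives over all citations, then compute one precision and recall."""
    tp = fp = fn = 0
    for gold_labels, pred_labels in zip(gold, predicted):
        tp += len(gold_labels & pred_labels)   # headings correctly assigned
        fp += len(pred_labels - gold_labels)   # headings assigned but not in gold
        fn += len(gold_labels - pred_labels)   # gold headings that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def relative_gain(new: float, baseline: float) -> float:
    """Relative improvement of `new` over `baseline`, as a fraction."""
    return (new - baseline) / baseline


# Toy data (hypothetical): two citations with gold and predicted headings.
gold = [{"Humans", "Neoplasms"}, {"Mice", "Algorithms", "Humans"}]
pred = [{"Humans", "Neoplasms", "Female"}, {"Mice", "Humans"}]
toy_score = micro_f_measure(gold, pred)

# Scores quoted in the abstract.
gain_over_meshlabeler = relative_gain(0.6323, 0.6218)
gain_over_mti = relative_gain(0.6323, 0.5637)
```

Under this reading, the reported percentages are relative gains: roughly 1.7% over MeSHLabeler (rounded to 2% in the abstract) and roughly 12.2% over MTI.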
Pages: 70-79
Number of pages: 10