Different approaches for identifying important concepts in probabilistic biomedical text summarization

Cited by: 46
Authors
Moradi, Milad [1, 2]
Ghadiri, Nasser [1]
Affiliations
[1] Isfahan Univ Technol, Dept Elect & Comp Engn, Esfahan 8415683111, Iran
[2] Tech Univ Denmark, Dept Appl Math & Comp Sci, DK-2800 Lyngby, Denmark
Keywords
Medical text mining; Data mining; Bayesian classification; Feature selection; UMLS concept; Sentence classifications; Medical documents; Knowledge; Domain
DOI
10.1016/j.artmed.2017.11.004
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Automatic text summarization tools help users in the biomedical domain acquire the information they need from various textual resources more efficiently. Some biomedical text summarization systems base their sentence selection on the frequency of concepts extracted from the input text. However, exploring measures other than raw frequency for identifying valuable content within an input document, or considering the correlations that exist between concepts, may be more useful for this type of summarization. In this paper, we describe a Bayesian summarization method for biomedical text documents. The Bayesian summarizer first maps the input text to Unified Medical Language System (UMLS) concepts; it then selects the important ones to be used as classification features. We introduce six different feature selection approaches to identify the most important concepts in the text and to select the most informative content according to the distribution of these concepts. We show that, with an appropriate feature selection approach, the Bayesian summarizer can improve the performance of biomedical summarization. Using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) toolkit, we perform extensive evaluations on a corpus of scientific papers in the biomedical domain. The results show that when the Bayesian summarizer uses feature selection methods that do not rely on raw frequency, it can outperform biomedical summarizers that rely on concept frequency, as well as domain-independent and baseline methods. (C) 2017 Elsevier B.V. All rights reserved.
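The pipeline outlined in the abstract (map text to concepts, select important concepts as features, classify sentences) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the classifier details, so the sketch assumes a Bernoulli naive Bayes model with add-one smoothing trained on hand-labeled toy sentences. The TOY_CONCEPTS lookup is a hypothetical stand-in for a real UMLS mapper, the concept IDs are invented, and the sentence-count threshold in select_features is only the simplest frequency baseline, not one of the six approaches proposed in the paper.

```python
import math
from collections import Counter

# Hypothetical word-to-concept lookup standing in for a real UMLS mapper.
# The concept IDs are invented for illustration only.
TOY_CONCEPTS = {
    "aspirin": "C_ASPIRIN",
    "headache": "C_HEADACHE",
    "placebo": "C_PLACEBO",
    "funding": "C_FUNDING",
}


def extract_concepts(sentence):
    """Return the set of toy concept IDs whose trigger word occurs in the sentence."""
    words = set(sentence.lower().replace(".", "").replace(",", "").split())
    return {cid for word, cid in TOY_CONCEPTS.items() if word in words}


def select_features(sentences, min_count=2):
    """Keep concepts found in at least min_count sentences (assumed baseline rule)."""
    counts = Counter(c for s in sentences for c in extract_concepts(s))
    return sorted(c for c, n in counts.items() if n >= min_count)


def train(labeled, features):
    """Estimate Bernoulli naive Bayes parameters with add-one smoothing.

    labeled is a list of (sentence, is_summary) pairs and must contain
    at least one example of each class.
    """
    n = {True: 0, False: 0}
    present = {True: Counter(), False: Counter()}
    for sentence, label in labeled:
        n[label] += 1
        for c in extract_concepts(sentence) & set(features):
            present[label][c] += 1
    prior = {y: n[y] / len(labeled) for y in (True, False)}
    likelihood = {
        y: {c: (present[y][c] + 1) / (n[y] + 2) for c in features}
        for y in (True, False)
    }
    return prior, likelihood


def summary_posterior(sentence, features, prior, likelihood):
    """Return P(summary | presence or absence of each selected concept)."""
    concepts = extract_concepts(sentence)
    log_score = {}
    for y in (True, False):
        total = math.log(prior[y])
        for c in features:
            p = likelihood[y][c]
            total += math.log(p if c in concepts else 1.0 - p)
        log_score[y] = total
    # Normalize in log space to avoid underflow with many features.
    top = max(log_score.values())
    weights = {y: math.exp(v - top) for y, v in log_score.items()}
    return weights[True] / (weights[True] + weights[False])


# Toy training data (invented); True marks a summary-worthy sentence.
labeled = [
    ("Aspirin reduced headache severity compared with placebo.", True),
    ("Aspirin relieved headache within two hours.", True),
    ("Funding was provided by a national agency.", False),
    ("The placebo group received funding information.", False),
]
features = select_features([s for s, _ in labeled])
prior, likelihood = train(labeled, features)

for candidate in ("Aspirin is effective for headache.", "Funding details are listed."):
    score = summary_posterior(candidate, features, prior, likelihood)
    print(f"{score:.3f}  {candidate}")
```

In a summarizer built along these lines, the sentences with the highest posterior would be kept until the target summary length is reached. Swapping select_features for a measure that does not depend on raw frequency is the step the abstract reports as making the difference.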
Pages: 101-116
Page count: 16