Enhancing random forest classification with NLP in DAMEH: A system for DAta Management in eHealth Domain

Cited by: 8
Authors
Amato, Flora [1]
Coppolino, Luigi [2]
Cozzolino, Giovanni [1]
Mazzeo, Giovanni [1]
Moscato, Francesco [3]
Nardone, Roberto [4]
Affiliations
[1] Univ Naples Federico II, DIETI, Naples, Italy
[2] Univ Naples Parthenope, DI, Naples, Italy
[3] Univ Salerno, DIEM, Fisciano, Italy
[4] Univ Mediterranea Reggio Calabria, DIIES, Reggio Di Calabria, Italy
Keywords
Big data processing; E-health; Machine learning; Random forests; Multi-classification schema; Feature selection; Architecture
DOI
10.1016/j.neucom.2020.08.091
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The use of pervasive IoT devices in Smart Cities has increased the volume of data produced in many fields. Useful applications are growing in number in the E-health domain, where smart devices are used to manage huge amounts of data in highly distributed environments and to provide smart services that collect data to fill patients' medical records. The problem here is to gather data, to produce records, and to analyze medical records depending on their contents. Since data gathering involves very different devices (not only wearable medical sensors, but also environmental smart devices, such as weather, pollution, and other sensors), it is very difficult to classify data according to their contents in order to enable better management of patients. Data from smart devices are coupled with medical records written in natural language: we describe an architecture that is able to determine the best features for classification, depending on existing medical records. The architecture is based on a pre-filtering phase built on Natural Language Processing, which enhances Machine Learning classification based on Random Forests. We carried out experiments on about 5000 medical records from real (anonymized) case studies from various health-care organizations in Italy. We show the accuracy of the presented approach in terms of Accuracy-Rejection curves. © 2021 Elsevier B.V. All rights reserved.
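The abstract reports results as Accuracy-Rejection curves. As a minimal sketch of what such a curve measures (not the authors' implementation), the function below assumes each prediction comes with a confidence score and a flag saying whether it was correct; the name `accuracy_rejection_curve` and its inputs are hypothetical. It rejects the least confident predictions first and reports the accuracy on what remains.

```python
from typing import List, Sequence, Tuple


def accuracy_rejection_curve(
    confidences: Sequence[float],
    correct: Sequence[bool],
) -> List[Tuple[float, float]]:
    """Return (rejection_rate, accuracy) points, ordered by rising rejection rate.

    Assumption: confidences[i] is the classifier's confidence in prediction i
    (for example, the top class probability) and correct[i] says whether that
    prediction matched the true label.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        return []

    # Most confident predictions first, so the top-k are the ones kept
    # when the n - k least confident predictions are rejected.
    order = sorted(range(n), key=lambda i: confidences[i], reverse=True)

    points: List[Tuple[float, float]] = []
    hits = 0
    for kept, idx in enumerate(order, start=1):
        if correct[idx]:
            hits += 1
        rejection_rate = (n - kept) / n
        points.append((rejection_rate, hits / kept))

    # Built from high rejection to low; reverse so the curve starts at 0.0.
    points.reverse()
    return points
```

Calling `accuracy_rejection_curve([0.9, 0.8, 0.6, 0.55], [True, True, False, True])` yields four points, starting with the accuracy when nothing is rejected. A curve that rises as the rejection rate grows suggests the confidence scores are informative about which predictions are likely to be wrong.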
Pages: 79-91
Number of pages: 13
Related papers
50 in total
  • [1] Enhancing The Data Fusion Based Forest Fire Preserving And Management System
    Elmas, Cetin
    Sonmez, Yusuf
    JOURNAL OF POLYTECHNIC-POLITEKNIK DERGISI, 2008, 11 (02) : 99 - 108
  • [2] Random forest algorithm for classification of multiwavelength data
    Gao, Dan
    Zhang, Yan-Xia
    Zhao, Yong-Heng
    RESEARCH IN ASTRONOMY AND ASTROPHYSICS, 2009, 9 (02) : 220 - 226
  • [3] Random forest algorithm for classification of multiwavelength data
    Gao, Dan
    Graduate University of Chinese Academy of Sciences
    RESEARCH IN ASTRONOMY AND ASTROPHYSICS, 2009, 9 (02) : 220 - 226
  • [4] User and Service Classification in Power Grid Management System based on Random Forest
    Qu, Yansheng
    Cheng, Xingfang
    Wang, Yunxiao
    Sheng, Hua
    Zhang, Wenbin
    Hou, Lu
    Meng, Jian
    Wang, Xie
    PROCEEDINGS OF 2021 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE AND INTELLIGENT AGENT TECHNOLOGY WORKSHOPS AND SPECIAL SESSIONS: (WI-IAT WORKSHOP/SPECIAL SESSION 2021), 2021, : 1 - 4
  • [5] Frequency Bands Selection for Seizure Classification and Forecasting Using NLP, Random Forest and SVM Models
    Wang, Ziwei
    Mengoni, Paolo
    BRAIN INFORMATICS, BI 2021, 2021, 12960 : 310 - 320
  • [6] Random forest for gene selection and microarray data classification
    Moorthy, Kohbalan
    Mohamad, Mohd Saberi
    BIOINFORMATION, 2011, 7 (03) : 142 - 146
  • [7] Investigation of the random forest framework for classification of hyperspectral data
    Ham, J
    Chen, YC
    Crawford, MM
    Ghosh, J
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2005, 43 (03) : 492 - 501
  • [8] Random Forest for Gene Selection and Microarray Data Classification
    Moorthy, Kohbalan
    Mohamad, Mohd Saberi
    KNOWLEDGE TECHNOLOGY, 2012, 295 : 174 - 183
  • [9] Ovarian Cancer Data Classification Using Bagging and Random Forest
    Arfiani, A.
    Rustam, Z.
    PROCEEDINGS OF THE 4TH INTERNATIONAL SYMPOSIUM ON CURRENT PROGRESS IN MATHEMATICS AND SCIENCES (ISCPMS2018), 2019, 2168
  • [10] Random forest classification of multisource remote sensing and geographic data
    Gislason, PO
    Benediktsson, JA
    Sveinsson, JR
    IGARSS 2004: IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM PROCEEDINGS, VOLS 1-7: SCIENCE FOR SOCIETY: EXPLORING AND MANAGING A CHANGING PLANET, 2004, : 1049 - 1052