Enhancing random forest classification with NLP in DAMEH: A system for DAta Management in eHealth Domain

Cited by: 8
Authors
Amato, Flora [1]
Coppolino, Luigi [2]
Cozzolino, Giovanni [1]
Mazzeo, Giovanni [1]
Moscato, Francesco [3]
Nardone, Roberto [4]
Affiliations
[1] Univ Naples Federico II, DIETI, Naples, Italy
[2] Univ Naples Parthenope, DI, Naples, Italy
[3] Univ Salerno, DIEM, Fisciano, Italy
[4] Univ Mediterranea Reggio Calabria, DIIES, Reggio Di Calabria, Italy
Keywords
Big data processing; E-health; Machine learning; Random forests; Multi-classification schema; Feature selection; Architecture
DOI
10.1016/j.neucom.2020.08.091
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The use of pervasive IoT devices in Smart Cities has increased the volume of data produced in many fields. Useful applications are growing in number in the e-health domain, where smart devices are used to manage huge amounts of data in highly distributed environments and to provide smart services that collect data to fill patients' medical records. The problem is to gather data, produce records, and analyze medical records according to their contents. Because data gathering involves very different devices (not only wearable medical sensors but also environmental smart devices such as weather, pollution, and other sensors), it is very difficult to classify data by content in order to enable better management of patients. Data from smart devices are coupled with medical records written in natural language: we describe an architecture that determines the best features for classification based on existing medical records. The architecture relies on a pre-filtering phase based on Natural Language Processing, which enhances machine learning classification based on Random Forests. We carried out experiments on about 5000 medical records from real (anonymized) case studies from various health-care organizations in Italy. We report the accuracy of the presented approach using Accuracy-Rejection curves. (c) 2021 Elsevier B.V. All rights reserved.
Pages: 79-91
Number of pages: 13
Related papers
50 in total
  • [21] Forest type identification by random forest classification combined with SPOT and multitemporal SAR data
    Ying Yu
    Mingze Li
    Yu Fu
    Journal of Forestry Research, 2018, 29 (05) : 1407 - 1414
  • [22] Forest type identification by random forest classification combined with SPOT and multitemporal SAR data
    Ying Yu
    Mingze Li
    Yu Fu
    Journal of Forestry Research, 2018, 29 : 1407 - 1414
  • [23] Towards an Ontological Framework for Integrating Domain Expert Knowledge with Random Forest Classification
    Beden, Sadeer
    Beckmann, Arnold
    2023 IEEE 17TH INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING, ICSC, 2023, : 221 - 224
  • [24] Phasma: An Automatic Modulation Classification System Based on Random Forest
    Triantafyllakis, Kostis
    Surligas, Manolis
    Vardakis, George
    Papadakis, Stefanos
    2017 IEEE INTERNATIONAL SYMPOSIUM ON DYNAMIC SPECTRUM ACCESS NETWORKS (IEEE DYSPAN), 2017,
  • [25] GA-optimized random forest classification for high dimensional data
    Pan, Jingchang
    Wei, Peng
    Guo, Qiang
    Zhang, Caiming
    Luo, Ali
    ICIC Express Letters, 2011, 5 (05): : 1529 - 1534
  • [26] A novel Random Forest integrated model for imbalanced data classification problem
    Gu, Qinghua
    Tian, Jingni
    Li, Xuexian
    Jiang, Song
    KNOWLEDGE-BASED SYSTEMS, 2022, 250
  • [27] A Random Forest Model for Peptide Classification Based on Virtual Docking Data
    Feng, Hua
    Wang, Fangyu
    Li, Ning
    Xu, Qian
    Zheng, Guanming
    Sun, Xuefeng
    Hu, Man
    Xing, Guangxu
    Zhang, Gaiping
    INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, 2023, 24 (14)
  • [28] Random forest rock type classification with integration of geochemical and photographic data
    Trott, McLean
    Leybourne, Matthew
    Hall, Lindsay
    Layton-Matthews, Daniel
    APPLIED COMPUTING AND GEOSCIENCES, 2022, 15
  • [29] Classification of Travel Data with Multiple Sensor Information using Random Forest
    Shafique, Muhammad Awais
    Hato, Eiji
    19TH EURO WORKING GROUP ON TRANSPORTATION MEETING (EWGT2016), 2017, 22 : 144 - 153
  • [30] Imbalanced educational data classification: an effective approach with resampling and random forest
    Vo Thi Ngoc Chau
    Nguyen Hua Phung
    PROCEEDINGS OF 2013 IEEE RIVF INTERNATIONAL CONFERENCE ON COMPUTING AND COMMUNICATION TECHNOLOGIES: RESEARCH, INNOVATION, AND VISION FOR THE FUTURE (RIVF), 2013, : 135 - 140