Deep neural networks ensemble for detecting medication mentions in tweets

Cited by: 20
Authors
Weissenbacher, Davy [1]
Sarker, Abeed [1]
Klein, Ari [1]
O'Connor, Karen [1]
Magge, Arjun [2]
Gonzalez-Hernandez, Graciela [1]
Affiliations
[1] Univ Penn, Dept Biostat Epidemiol & Informat, Perelman Sch Med, 480-492-0477, 404 Blockley Hall, 423 Guardian Dr, Philadelphia, PA 19104 USA
[2] Arizona State Univ, Biodesign Ctr Environm Hlth Engn, Tempe, AZ USA
Keywords
social media; pharmacovigilance; drug name detection; ensemble learning; text classification; TWITTER; MODELS;
DOI
10.1093/jamia/ocz156
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Objective: Twitter posts are now recognized as an important source of patient-generated data, providing unique insights into population health. A fundamental step toward incorporating Twitter data in pharmacoepidemiologic research is to automatically recognize medication mentions in tweets. Because lexical searches for medication names suffer from low recall due to misspellings or ambiguity with common words, we propose a more advanced method to recognize them.

Materials and Methods: We present Kusuri, an ensemble learning classifier able to identify tweets mentioning drug products and dietary supplements. Kusuri ("medication" in Japanese) is composed of 2 modules: first, 4 different classifiers (lexicon based, spelling variant based, pattern based, and a weakly trained neural network) are applied in parallel to discover tweets potentially containing medication names; second, an ensemble of deep neural networks encoding morphological, semantic, and long-range dependencies of important words in the tweets makes the final decision.

Results: On a class-balanced (50-50) corpus of 15 005 tweets, Kusuri demonstrated performance close to that of human annotators, with an F1 score of 93.7%, the best score achieved thus far on this corpus. On a corpus comprising all tweets posted by 112 Twitter users (98 959 tweets, with only 0.26% mentioning medications), Kusuri obtained an F1 score of 78.8%. To the best of our knowledge, Kusuri is the first system to achieve this score on such an extremely imbalanced dataset.

Conclusions: The system identifies tweets mentioning drug names with performance high enough to ensure its usefulness, and is ready to be integrated in pharmacovigilance, toxicovigilance, or more generally, public health pipelines that depend on medication name mentions.
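
The two-module design described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy lexicon, the regular-expression pattern, the stand-in scorers, the score averaging, and the 0.5 threshold are all assumptions made for illustration, and the abstract does not state how Kusuri combines its filters or aggregates its neural networks. The sketch assumes a tweet becomes a candidate if any first-stage filter fires, and that a second-stage ensemble then decides by averaging hypothetical scores.

```python
import re
from typing import Callable, Sequence

# Hypothetical type aliases: a filter flags a tweet, a scorer returns a probability.
Filter = Callable[[str], bool]
Scorer = Callable[[str], float]

# Toy lexicon; a real system would use a large drug-name resource (assumption).
LEXICON = {"ibuprofen", "tylenol"}


def lexicon_filter(tweet: str) -> bool:
    """Flag tweets containing a token found in the toy lexicon."""
    return any(tok.strip(".,!?").lower() in LEXICON for tok in tweet.split())


def pattern_filter(tweet: str) -> bool:
    """Flag tweets containing a dosage-like pattern such as '20 mg'."""
    return re.search(r"\b\d+\s?mg\b", tweet.lower()) is not None


def is_candidate(tweet: str, filters: Sequence[Filter]) -> bool:
    """Module 1 (assumed union): keep the tweet if any filter fires."""
    return any(f(tweet) for f in filters)


def ensemble_decision(tweet: str, scorers: Sequence[Scorer],
                      threshold: float = 0.5) -> bool:
    """Module 2 (assumed averaging): mean score compared to a threshold."""
    mean_score = sum(s(tweet) for s in scorers) / len(scorers)
    return mean_score >= threshold


def classify(tweet: str, filters: Sequence[Filter],
             scorers: Sequence[Scorer]) -> bool:
    """Run the candidate filters first, then the ensemble on candidates only."""
    return is_candidate(tweet, filters) and ensemble_decision(tweet, scorers)


FILTERS = [lexicon_filter, pattern_filter]
# Stand-in scorers replacing trained neural networks (purely illustrative).
SCORERS = [
    lambda t: 0.9 if "took" in t.lower() else 0.3,
    lambda t: 0.8 if "ibuprofen" in t.lower() else 0.4,
]

if __name__ == "__main__":
    print(classify("I took ibuprofen for my headache", FILTERS, SCORERS))
    print(classify("Lovely weather today", FILTERS, SCORERS))
```

Running the filters before the ensemble means the more expensive second stage only sees tweets that at least one cheap filter flagged, which matters when, as in the second corpus, only a very small fraction of tweets mention medications.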
Pages: 1618 - 1626
Page count: 9
Related papers
50 in total
  • [31] An ensemble framework of deep neural networks for colorectal polyp classification
    Farah Younas
    Muhammad Usman
    Wei Qi Yan
    Multimedia Tools and Applications, 2023, 82 : 18925 - 18946
  • [32] Ensemble Lung Segmentation System Using Deep Neural Networks
    Ali, Redha
    Hardie, Russell C.
    Ragb, Hussin K.
    2020 IEEE APPLIED IMAGERY PATTERN RECOGNITION WORKSHOP (AIPR): TRUSTED COMPUTING, PRIVACY, AND SECURING MULTIMEDIA, 2020
  • [33] Efficient Diversity-Driven Ensemble for Deep Neural Networks
    Zhang, Wentao
    Jiang, Jiawei
    Shao, Yingxia
    Cui, Bin
    2020 IEEE 36TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2020), 2020, : 73 - 84
  • [34] Swarm Intelligence Based Ensemble Learning of Deep Neural Networks
    Li, Tao
    Ma, Jinwen
    NEURAL INFORMATION PROCESSING (ICONIP 2019), PT IV, 2019, 1142 : 256 - 264
  • [35] Sequential Deep Neural Networks Ensemble for Speech Bandwidth Extension
    Lee, Bong-Ki
    Noh, Kyounjin
    Chang, Joon-Hyuk
    Choo, Kihyun
    Oh, Eunmi
    IEEE ACCESS, 2018, 6 : 27039 - 27047
  • [36] Snapshot boosting: a fast ensemble framework for deep neural networks
    Wentao Zhang
    Jiawei Jiang
    Yingxia Shao
    Bin Cui
    Science China Information Sciences, 2020, 63
  • [37] Deep ensemble with Neural Networks to model power curve uncertainty
    Perez-Sanjines, F.
    Verstraeten, T.
    Nowe, A.
    Helsen, J.
    EERA DEEPWIND OFFSHORE WIND R&D CONFERENCE, DEEPWIND 2022, 2022, 2362
  • [38] Snapshot boosting: a fast ensemble framework for deep neural networks
    Wentao ZHANG
    Jiawei JIANG
    Yingxia SHAO
    Bin CUI
    Science China(Information Sciences), 2020, 63 (01) : 77 - 88
  • [39] Ensemble Learning on Deep Neural Networks for Image Caption Generation
    Katpally, Harshitha
    Bansal, Ajay
    2020 IEEE 14TH INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC 2020), 2020, : 61 - 68
  • [40] Ensemble of Deep Convolutional Neural Networks for Prognosis of Ischemic Stroke
    Choi, Youngwon
    Kwon, Yongchan
    Lee, Hanbyul
    Kim, Beom Joon
    Paik, Myunghee Cho
    Won, Joong-Ho
    BRAINLESION: GLIOMA, MULTIPLE SCLEROSIS, STROKE AND TRAUMATIC BRAIN INJURIES, 2016, 2016, 10154 : 231 - 243