Deep neural networks ensemble for detecting medication mentions in tweets

Cited by: 20
Authors
Weissenbacher, Davy [1]
Sarker, Abeed [1]
Klein, Ari [1]
O'Connor, Karen [1]
Magge, Arjun [2]
Gonzalez-Hernandez, Graciela [1]
Affiliations
[1] Univ Penn, Dept Biostat Epidemiol & Informat, Perelman Sch Med, 480-492-0477, 404 Blockley Hall, 423 Guardian Dr, Philadelphia, PA 19104 USA
[2] Arizona State Univ, Biodesign Ctr Environm Hlth Engn, Tempe, AZ USA
Keywords
social media; pharmacovigilance; drug name detection; ensemble learning; text classification; TWITTER; MODELS
DOI
10.1093/jamia/ocz156
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Objective: Twitter posts are now recognized as an important source of patient-generated data, providing unique insights into population health. A fundamental step toward incorporating Twitter data in pharmacoepidemiologic research is to automatically recognize medication mentions in tweets. Because lexical searches for medication names suffer from low recall due to misspellings or ambiguity with common words, we propose a more advanced method to recognize them.
Materials and Methods: We present Kusuri, an ensemble learning classifier able to identify tweets mentioning drug products and dietary supplements. Kusuri ("medication" in Japanese) is composed of 2 modules: first, 4 different classifiers (lexicon based, spelling variant based, pattern based, and a weakly trained neural network) are applied in parallel to discover tweets potentially containing medication names; second, an ensemble of deep neural networks encoding morphological, semantic, and long-range dependencies of important words in the tweets makes the final decision.
Results: On a class-balanced (50-50) corpus of 15 005 tweets, Kusuri demonstrated performance close to that of human annotators, with an F1 score of 93.7%, the best score achieved thus far on this corpus. On a corpus made of all tweets posted by 112 Twitter users (98 959 tweets, with only 0.26% mentioning medications), Kusuri obtained an F1 score of 78.8%. To the best of our knowledge, Kusuri is the first system to achieve this score on such an extremely imbalanced dataset.
Conclusions: The system identifies tweets mentioning drug names with performance high enough to ensure its usefulness, and is ready to be integrated in pharmacovigilance, toxicovigilance, or more generally, public health pipelines that depend on medication name mentions.
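The two-module design described in the abstract can be illustrated with a minimal Python sketch. This is not the Kusuri implementation: the lexicon, spelling variants, regular expression, and scoring functions below are hypothetical toy stand-ins, the weakly trained neural filter is omitted, and two combination rules are assumptions made only for illustration (a tweet becomes a candidate if any filter fires, and the ensemble averages its members' scores against a 0.5 threshold).

```python
import re
from typing import Callable, List, Sequence

# Hypothetical toy resources; the real lexicon and variants are not given in the abstract.
TOY_LEXICON = {"ibuprofen", "prozac", "melatonin"}
TOY_VARIANTS = {"ibuprofin", "prozack"}
TOY_PATTERN = re.compile(r"\b(?:took|taking|prescribed)\s+\d+\s*mg\b")


def tokens(tweet: str) -> List[str]:
    """Lowercase the tweet and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", tweet.lower())


# Module 1: high-recall filters (stand-ins for the lexicon-, variant-, and pattern-based classifiers).
def lexicon_filter(tweet: str) -> bool:
    return any(t in TOY_LEXICON for t in tokens(tweet))


def variant_filter(tweet: str) -> bool:
    return any(t in TOY_VARIANTS for t in tokens(tweet))


def pattern_filter(tweet: str) -> bool:
    return TOY_PATTERN.search(tweet.lower()) is not None


# Module 2: toy scorers standing in for the deep neural networks of the ensemble.
def context_scorer(tweet: str) -> float:
    cues = {"took", "taking", "prescribed", "mg", "dose"}
    return 0.9 if any(t in cues for t in tokens(tweet)) else 0.2


def name_scorer(tweet: str) -> float:
    names = TOY_LEXICON | TOY_VARIANTS
    return 0.9 if any(t in names for t in tokens(tweet)) else 0.1


def ambiguity_scorer(tweet: str) -> float:
    non_medical = {"band", "song", "concert"}
    return 0.2 if any(t in non_medical for t in tokens(tweet)) else 0.7


FILTERS: Sequence[Callable[[str], bool]] = (lexicon_filter, variant_filter, pattern_filter)
SCORERS: Sequence[Callable[[str], float]] = (context_scorer, name_scorer, ambiguity_scorer)


def is_candidate(tweet: str) -> bool:
    """Assumed rule: a tweet is a candidate if at least one filter fires."""
    return any(f(tweet) for f in FILTERS)


def ensemble_decision(tweet: str, threshold: float = 0.5) -> bool:
    """Assumed rule: average the member scores and compare to a threshold."""
    mean_score = sum(s(tweet) for s in SCORERS) / len(SCORERS)
    return mean_score >= threshold


def mentions_medication(tweet: str) -> bool:
    """Run module 1, then let module 2 decide only on the surviving candidates."""
    return is_candidate(tweet) and ensemble_decision(tweet)
```

In this sketch, `mentions_medication("Great weather today")` is rejected by the first module because no filter fires, while `mentions_medication("prozac is my favorite band")` passes the lexicon filter but is rejected by the ensemble. That second case mirrors the ambiguity with common words that the abstract cites as a weakness of purely lexical search.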
Pages: 1618-1626
Page count: 9
Related papers
50 in total
  • [1] Detecting Informative Tweets during Disaster using Deep Neural Networks
    Madichetty, Sreenivasulu
    Sridevi, M.
    2019 11TH INTERNATIONAL CONFERENCE ON COMMUNICATION SYSTEMS & NETWORKS (COMSNETS), 2019: 709-713
  • [2] A Semantic BI Process for Detecting and Analyzing Mentions of Interest for a Domain in Tweets
    Pereira Junior, Vilmar Cesar
    Fileto, Renato
    de Souza, Willian Santos
    Wittwer, Matthias
    Reinhold, Olaf
    Alt, Rainer
    WEBMEDIA'18: PROCEEDINGS OF THE 24TH BRAZILIAN SYMPOSIUM ON MULTIMEDIA AND THE WEB, 2018: 197-204
  • [3] Active neural networks to detect mentions of changes to medication treatment in social media
    Weissenbacher, Davy
    Ge, Suyu
    Klein, Ari
    O'Connor, Karen
    Gross, Robert
    Hennessy, Sean
    Gonzalez-Hernandez, Graciela
    JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2021, 28(12): 2551-2561
  • [4] Detecting Psychological Stress from Speech using Deep Neural Networks and Ensemble Classifiers
    Mihalache, Serban
    Burileanu, Dragos
    Burileanu, Corneliu
    2021 INTERNATIONAL CONFERENCE ON SPEECH TECHNOLOGY AND HUMAN-COMPUTER DIALOGUE (SPED), 2021: 74-79
  • [5] Neural Networks Ensemble Approach for Detecting Attacks in Computer Networks
    Bukhtoyarov, Vladimir
    Semenkin, Eugene
    2012 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2012
  • [6] Identifying Personal Health Experience Tweets with Deep Neural Networks
    Jiang, Keyuan
    Gupta, Ravish
    Gupta, Matrika
    Calix, Ricardo A.
    Bernard, Gordon R.
    2017 39TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2017: 1174-1177
  • [7] An Ensemble of Deep Recurrent Neural Networks for Detecting IoT Cyber Attacks Using Network Traffic
    Saharkhizan, Mahdis
    Azmoodeh, Amin
    Dehghantanha, Ali
    Choo, Kim-Kwang Raymond
    Parizi, Reza M.
    IEEE INTERNET OF THINGS JOURNAL, 2020, 7(9): 8852-8859
  • [8] Affect Classification in Tweets using Multitask Deep Neural Networks
    Nagar, Seema
    Shankhdhar, Achintya
    Barbhuiya, Ferdous Ahmed
    Dey, Kuntal
    WEB CONFERENCE 2021: COMPANION OF THE WORLD WIDE WEB CONFERENCE (WWW 2021), 2021: 516-520
  • [9] AN ENSEMBLE OF DEEP NEURAL NETWORKS FOR OBJECT TRACKING
    Zhou, Xiangzeng
    Xie, Lei
    Zhang, Peng
    Zhang, Yanning
    2014 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2014: 843-847
  • [10] DETECTING HATE SPEECH IN TWEETS USING DIFFERENT DEEP NEURAL NETWORK ARCHITECTURES
    Amrutha, B. R.
    Bindu, K. R.
    PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICCS), 2019: 923-926