Healthcare Provider Summary Data for Fraud Classification

Cited by: 1
Authors
Johnson, Justin M. [1]
Khoshgoftaar, Taghi M. [1]
Affiliations
[1] Florida Atlantic Univ, Coll Engn & Comp Sci, Boca Raton, FL 33431 USA
Keywords
Healthcare; Medicare; Medical Providers; Fraud Detection; Big Data; Machine Learning; Feature Engineering
DOI
10.1109/IRI54793.2022.00060
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Fraud, waste, and abuse are spreading throughout the healthcare industry and costing patients and taxpayers billions of dollars. Fortunately, electronic medical records and publicly available data sources like the Centers for Medicare & Medicaid Services (CMS) have enabled data mining and machine learning techniques that can help automate the detection of healthcare fraud. In this study, we explore the application of healthcare provider summary data for the purpose of fraud detection. We leverage the latest CMS Part B Summary by Provider big data sets to curate two new labeled data sets for supervised learning. The two new data sets are compared to a popular baseline data set from related works using six runs of cross validation with two popular ensemble learners, multiple complementary performance metrics, and statistical tests. Classification results show that the proposed provider summary features are good indicators of healthcare fraud. A two-way analysis of variance test and 95% confidence intervals show that the new features yield significantly better performance on the fraud detection task when used to enrich existing data sets. Finally, feature contributions are measured with Shapley values to illustrate the top 20 features that contribute to fraud estimation.
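The abstract's last step, ranking features by Shapley value, can be illustrated with a minimal sketch. The abstract does not say how the Shapley values were computed, so this is not the authors' procedure. It shows the exact definition on a toy problem: each feature's average marginal contribution over all feature orderings. The feature names and the `toy_value` scoring function are hypothetical and chosen only for illustration.

```python
from itertools import permutations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley values: average marginal contribution over all orderings.

    `value` maps a frozenset of feature names to a number (for example, a
    model score obtained when only those features are available).
    """
    totals = {f: 0.0 for f in features}
    for order in permutations(features):
        coalition = frozenset()
        for f in order:
            with_f = coalition | {f}
            totals[f] += value(with_f) - value(coalition)
            coalition = with_f
    n_orders = factorial(len(features))
    return {f: total / n_orders for f, total in totals.items()}


def toy_value(coalition):
    """Hypothetical score function (an assumption, not taken from the paper)."""
    score = 0.0
    if "total_payment" in coalition:
        score += 0.30
    if "beneficiary_count" in coalition:
        score += 0.10
    # Interaction term: the two features are worth more together.
    if {"total_payment", "beneficiary_count"} <= coalition:
        score += 0.20
    return score


# Hypothetical feature names; "provider_type" contributes nothing here.
FEATURES = ["total_payment", "beneficiary_count", "provider_type"]
phi = shapley_values(FEATURES, toy_value)

# Rank features by contribution, largest first.
ranking = sorted(phi, key=phi.get, reverse=True)
```

In this sketch the interaction bonus is split evenly between the two interacting features, and the contributions sum to the score of the full feature set. Exact enumeration grows factorially with the number of features, so it is only practical for a handful; at the scale of real data sets, approximation methods are normally used instead.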
Pages: 236-242
Page count: 7