A MULTIVARIATE FREQUENCY-SEVERITY FRAMEWORK FOR HEALTHCARE DATA BREACHES

Times cited: 0
Authors
Sun, Hong [1]
Xu, Maochao [2]
Zhao, Peng [3]
Affiliations
[1] Lanzhou Univ, Sch Math & Stat, Lanzhou, Peoples R China
[2] Illinois State Univ, Dept Math, Normal, IL USA
[3] Jiangsu Normal Univ, Sch Math & Stat, Jiangsu Prov Key Lab Educ Big Data Sci & Engn, RIMS, Xuzhou, Peoples R China
Source
ANNALS OF APPLIED STATISTICS | 2023 / Vol. 17 / No. 01
Funding
National Natural Science Foundation of China;
Keywords
Copula; data breach; heavy tail; multivariate dependence; score; COUNT DATA; REGRESSION; MODELS; RULES; COSTS; RISK;
DOI
10.1214/22-AOAS1625
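Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification code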
020208 ; 070103 ; 0714 ;
Abstract
Data breaches in healthcare have become a substantial concern in recent years and cause millions of dollars in financial losses each year. It is fundamental for government regulators, insurance companies, and stakeholders to understand the breach frequency and the number of affected individuals in each state, as these are directly related to the federal Health Insurance Portability and Accountability Act (HIPAA) and state data breach laws. However, an obstacle to studying data breaches in healthcare is the lack of suitable statistical approaches. We develop a novel multivariate frequency-severity framework to analyze breach frequency and the number of affected individuals at the state level. A mixed effects model is developed for the square-root-transformed frequency, and the log-gamma distribution is proposed to capture the skewness and heavy tail exhibited by the distribution of the number of affected individuals. We further discover a positive nonlinear dependence between the transformed frequency and the log-transformed number of affected individuals (i.e., severity). In particular, we propose to use a D-vine copula to capture the multivariate dependence among the conditional severities given the frequencies, owing to its inherent temporal structure and rich bivariate copula families. A rejection sampling technique is developed to simulate the predictive distributions. Both in-sample and out-of-sample studies show that the proposed multivariate frequency-severity model, which accommodates nonlinear dependence, achieves satisfactory fit and predictive performance.
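As a rough illustration of why a log-gamma distribution suits skewed, heavy-tailed severities, the sketch below draws values X = exp(Y) with Y following a gamma distribution, which is one common definition of the log-gamma law. The parameterization, the parameter values, and the helper names `sample_log_gamma` and `empirical_quantile` are assumptions made for illustration; they are not taken from the paper and may differ from the authors' specification.

```python
import math
import random


def sample_log_gamma(shape, rate, n, seed=0):
    """Draw n values X = exp(Y), where Y ~ Gamma(shape, rate).

    Assumed parameterization (not from the paper): Y has density
    proportional to y**(shape - 1) * exp(-rate * y) for y > 0.
    """
    rng = random.Random(seed)
    # random.gammavariate expects (alpha, beta) with beta as the SCALE,
    # so the rate is converted to a scale of 1 / rate.
    return [math.exp(rng.gammavariate(shape, 1.0 / rate)) for _ in range(n)]


def empirical_quantile(values, q):
    """Return a simple order-statistic estimate of the q-quantile."""
    ordered = sorted(values)
    idx = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[idx]


# Hypothetical parameter values chosen only to make the skew visible.
draws = sample_log_gamma(shape=4.0, rate=1.5, n=10_000)
median = empirical_quantile(draws, 0.5)
q99 = empirical_quantile(draws, 0.99)
mean = sum(draws) / len(draws)

print(f"median={median:.1f}  mean={mean:.1f}  99% quantile={q99:.1f}")
```

Under these assumptions the sample mean sits well above the median and the upper quantiles grow quickly, which is the right-skewed, heavy-tailed shape the abstract attributes to the number of affected individuals.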
Pages: 240 - 268
Page count: 29
Related papers
50 in total
  • [1] Multivariate Frequency-Severity Regression Models in Insurance
    Frees, Edward W.
    Lee, Gee
    Yang, Lu
    RISKS, 2016, 4 (01)
  • [2] The frequency-severity indeterminacy
    Hauer, E
    ACCIDENT ANALYSIS AND PREVENTION, 2006, 38 (01) : 78 - 83
  • [3] Static Risk Measures in a Frequency-Severity Framework with Systematic Risk: Application in Reinsurance
    Assa, Hirbod
    NORTH AMERICAN ACTUARIAL JOURNAL, 2025, 29 (01) : 94 - 118
  • [4] Dependent frequency-severity modeling of insurance claims
    Shi, Peng
    Feng, Xiaoping
    Ivantsova, Anastasia
    INSURANCE MATHEMATICS & ECONOMICS, 2015, 64 : 417 - 428
  • [5] Measuring Healthcare Data Breaches
    Alkinoon, Mohammed
    Choi, Sung J.
    Mohaisen, David
    INFORMATION SECURITY APPLICATIONS, 2021, 13009 : 265 - 277
  • [6] Healthcare provider data breaches - framework for crisis communication and support of patients and healthcare workers in mental healthcare
    Looi, Jeffrey C. L.
    Allison, Stephen
    Bastiampillai, Tarun
    Maguire, Paul A.
    Kisely, Steve
    Looi, Richard C. H.
    AUSTRALASIAN PSYCHIATRY, 2024, 32 (04) : 319 - 322
  • [7] Stochastic gradient boosting frequency-severity model of insurance claims
    Su, Xiaoshan
    Bai, Manying
    PLOS ONE, 2020, 15 (08)
  • [8] A dependent frequency-severity approach to modeling longitudinal insurance claims
    Lee, Gee Y.
    Shi, Peng
    INSURANCE MATHEMATICS & ECONOMICS, 2019, 87 : 115 - 129
  • [9] Policy Framework for Data Breaches
    Telang, Rahul
    IEEE SECURITY & PRIVACY, 2015, 13 (01) : 77 - 79
  • [10] Healthcare Data Breaches: Insights and Implications
    Seh, Adil Hussain
    Zarour, Mohammad
    Alenezi, Mamdouh
    Sarkar, Amal Krishna
    Agrawal, Alka
    Kumar, Rajeev
    Ahmad Khan, Raees
    HEALTHCARE, 2020, 8 (02)