Using the Benford's Law as a First Step to Assess the Quality of the Cancer Registry Data

Cited by: 13
Authors
Crocetti, Emanuele [1 ]
Randi, Giorgia [1 ]
Affiliations
[1] European Commiss, JRC, Directorate Hlth Consumers & Reference Mat F, Hlth Soc Unit, Ispra, Italy
Keywords
cancer registry; incidence; data quality; Benford; methodology; FRAUD;
DOI
10.3389/fpubh.2016.00225
Chinese Library Classification number
R1 [Preventive medicine, hygiene];
Discipline classification code
1004 ; 120402 ;
Abstract
Background: Benford's law states that the distribution of the first digit different from 0 [first significant digit (FSD)] in many collections of numbers is not uniform. The aim of this study is to evaluate whether population-based cancer incidence rates follow Benford's law, and whether this can be used in their data quality check process. Methods: We sampled 43 population-based cancer registry populations (CRPs) from the Cancer Incidence in 5 Continents-volume X (CI5-X). The distribution of cancer incidence rate FSD was evaluated overall, by sex, and by CRP. Several statistics, including Pearson's coefficient of correlation and distance measures, were applied to check adherence to Benford's law. Results: In the whole dataset (146,590 incidence rates) and for each sex (70,722 male and 75,868 female incidence rates), the FSD distributions were Benford-like. The coefficient of correlation between observed and expected FSD distributions was extremely high (0.999), and the distance measures were low. Considering single CRPs (from 933 to 7,222 incidence rates each), the results were in agreement with Benford's law, and only a few CRPs showed possible discrepancies from it. Conclusion: This study demonstrated for the first time that cancer incidence rates follow Benford's law. This characteristic can be used as a new, simple, and objective tool in data quality evaluation. The analyzed data had already been checked for publication in CI5-X; therefore, their quality was expected to be good. In fact, for only a few CRPs were several statistics consistent with possible violations.
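The comparison described in the abstract can be illustrated with a minimal Python sketch. It assumes the standard Benford expectation P(d) = log10(1 + 1/d) for d = 1..9, and it uses Euclidean distance as one possible distance measure, since the abstract does not say which measures the authors applied. The function names and the synthetic rates below are hypothetical and are not CI5-X data.

```python
import math
import random


def first_significant_digit(x):
    """Return the first non-zero digit of a non-zero number."""
    # Scientific notation always starts with the first significant digit,
    # e.g. 0.0034 is formatted as '3.4000...e-03'.
    return int(f"{abs(x):.16e}"[0])


def benford_expected():
    """Expected FSD proportions under Benford's law, for digits 1..9."""
    return [math.log10(1 + 1 / d) for d in range(1, 10)]


def observed_fsd(values):
    """Observed FSD proportions for digits 1..9, ignoring zeros."""
    digits = [first_significant_digit(v) for v in values if v != 0]
    return [digits.count(d) / len(digits) for d in range(1, 10)]


def pearson_r(xs, ys):
    """Pearson's coefficient of correlation between two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def euclidean_distance(xs, ys):
    """Euclidean distance between two proportion vectors (an assumed choice)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)))


# Synthetic stand-in for incidence rates: log-uniform over whole decades,
# which is Benford-like by construction. Real registry rates would go here.
random.seed(0)
rates = [10 ** random.uniform(-1, 3) for _ in range(20000)]

expected = benford_expected()
observed = observed_fsd(rates)
r = pearson_r(observed, expected)
dist = euclidean_distance(observed, expected)
print(f"Pearson r = {r:.4f}, Euclidean distance = {dist:.4f}")
```

A correlation close to 1 together with a small distance would be read as Benford-like behavior. Any threshold for flagging a registry is a judgment call that this sketch does not set.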
Pages: 5
Related papers
50 in total
  • [1] Using Benford's law to assess the quality of COVID-19 register data in Brazil
    Silva, Lucas
    Figueiredo Filho, Dalson
    JOURNAL OF PUBLIC HEALTH, 2021, 43 (01) : 107 - 110
  • [2] Altmetric data quality analysis using Benford's law
    Gupta, Solanki
    Singh, Vivek Kumar
    Banshal, Sumit Kumar
    SCIENTOMETRICS, 2024, 129 (07) : 4597 - 4621
  • [3] On the Use of Benford's Law to Assess the Quality of the Data Provided by Lightning Locating Systems
    Mansouri, Ehsan
    Mostajabi, Amirhosein
    Schulz, Wolfgang
    Diendorfer, Gerhard
    Rubinstein, Marcos
    Rachidi, Farhad
    ATMOSPHERE, 2022, 13 (04)
  • [4] Not the first digit! Using Benford's law to detect fraudulent scientific data
    Diekmann, Andreas
    JOURNAL OF APPLIED STATISTICS, 2007, 34 (03) : 321 - 329
  • [5] Benford's Law and the Quality of Occupational Hygiene Data REPLY
    de Vocht, Frank
    Kromhout, Hans
    ANNALS OF OCCUPATIONAL HYGIENE, 2014, 58 (03) : 400 - 401
  • [6] The use of Zipf's law in the screening of analytical data: a step beyond Benford
    Brown, Richard J. C.
    ANALYST, 2007, 132 (04) : 344 - 349
  • [7] Application of Benford's Law for the analysis of the reliability of production quality data
    Rajda-Tasior, Angelina
    33RD INTERNATIONAL CONFERENCE MATHEMATICAL METHODS IN ECONOMICS (MME 2015), 2015 : 695 - 700
  • [8] Detecting fraud in data sets using Benford's Law
    Geyer, CL
    Williamson, PP
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2004, 33 (01) : 229 - 246
  • [9] BENFORD'S LAW AND LIGHTNING DATA
    Manoochehrnia, P.
    Rachidi, F.
    Rubinstein, M.
    Schulz, W.
    Diendorfer, G.
    2010 30TH INTERNATIONAL CONFERENCE ON LIGHTNING PROTECTION (ICLP), 2010
  • [10] Detecting Problems in Survey Data Using Benford's Law
    Judge, George
    Schechter, Laura
    JOURNAL OF HUMAN RESOURCES, 2009, 44 (01) : 1 - 24