Identifying low-quality patterns in accident reports from textual data

Cited by: 5
Authors
Macedo, July B. [1, 2]
Ramos, Plinio M. S. [1, 2]
Maior, Caio B. S. [1, 3]
Moura, Marcio J. C. [1, 2]
Lins, Isis D. [1, 2]
Vilela, Romulo F. T. [4]
Affiliations
[1] Univ Fed Pernambuco, CEERMA Ctr Risk Anal Reliabil Engn & Environm Mod, Recife, PE, Brazil
[2] Univ Fed Pernambuco, Dept Prod Engn, Recife, PE, Brazil
[3] Univ Fed Pernambuco, Technol Ctr, Recife, PE, Brazil
[4] Companhia Hidrelect Sao Francisco CHESF, Ico, Brazil
Keywords
occupational safety; automatic classification; natural language processing; machine learning; topic modeling; safety culture; accident analysis; SUPPORT VECTOR MACHINES; DECISION-SUPPORT; INJURY; RELIABILITY; MANAGEMENT; SYSTEM
DOI
10.1080/10803548.2022.2111847
Chinese Library Classification (CLC) number
TB18 [Ergonomics]
Discipline classification code
1201
Abstract
Accident investigation reports provide useful knowledge that helps companies propose preventive and mitigative measures. However, accident report databases are typically large and complex, contain errors, and have missing and/or redundant data. In this article, we propose text mining and natural language processing techniques to investigate low-quality accident reports. We adopted machine learning (ML) to detect and investigate inconsistencies in accident reports. The methodology was applied to 626 documents collected from an actual hydroelectric power company. The initial ML performance indicated data divergences and concerns related to the report structure. The accident database was then restructured into a more suitable form, confirming the supposition about the quality of the reports investigated. The proposed approach can be used as a diagnostic tool to improve the design of accident investigation reports so that they provide a more useful source of knowledge to support safety decisions.
Pages: 1088-1100
Number of pages: 13