Identifying low-quality patterns in accident reports from textual data

Cited by: 5
Authors
Macedo, July B. [1, 2]
Ramos, Plinio M. S. [1, 2]
Maior, Caio B. S. [1, 3]
Moura, Marcio J. C. [1, 2]
Lins, Isis D. [1, 2]
Vilela, Romulo F. T. [4]
Affiliations
[1] Univ Fed Pernambuco, CEERMA Ctr Risk Anal Reliabil Engn & Environm Mod, Recife, PE, Brazil
[2] Univ Fed Pernambuco, Dept Prod Engn, Recife, PE, Brazil
[3] Univ Fed Pernambuco, Technol Ctr, Recife, PE, Brazil
[4] Companhia Hidrelect Sao Francisco CHESF, Ico, Brazil
Keywords
occupational safety; automatic classification; natural language processing; machine learning; topic modeling; safety culture; accident analysis; SUPPORT VECTOR MACHINES; DECISION-SUPPORT; INJURY; RELIABILITY; MANAGEMENT; SYSTEM
DOI
10.1080/10803548.2022.2111847
Chinese Library Classification number
TB18 [Ergonomics]
Discipline classification code
1201
Abstract
Accident investigation reports provide useful knowledge that helps companies propose preventive and mitigative measures. However, accident report databases are typically large and complex, contain errors, and have missing and/or redundant data. In this article, we propose text mining and natural language processing techniques to investigate low-quality accident reports. We adopted machine learning (ML) to detect and investigate inconsistencies in accident reports. The methodology was applied to 626 documents collected from an actual hydroelectric power company. The initial ML performance indicated data divergences and concerns related to the report structure. The accident database was then restructured into a more suitable form, confirming the supposition about the quality of the reports investigated. The proposed approach can be used as a diagnostic tool to improve the design of accident investigation reports so that they provide a more useful source of knowledge to support decisions in the safety context.
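The idea of using a classifier to surface inconsistent reports can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes a small multinomial Naive Bayes text classifier (swapped in for brevity; the record's keywords mention support vector machines) and a leave-one-out check that flags any report whose recorded category disagrees with the prediction. The function names, the toy reports, and the category labels are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase, split on whitespace, keep purely alphabetic tokens."""
    return [t for t in text.lower().split() if t.isalpha()]


def train_nb(docs, labels):
    """Collect per-class word counts for multinomial Naive Bayes."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    vocab = set()
    for doc, label in zip(docs, labels):
        tokens = tokenize(doc)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, class_counts, vocab


def predict_nb(model, doc):
    """Return the class with the highest log-posterior (add-one smoothing)."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        n_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total)
        for tok in tokenize(doc):
            if tok in vocab:  # words unseen in training are ignored
                score += math.log(
                    (word_counts[label][tok] + 1) / (n_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def flag_inconsistent(docs, labels):
    """Leave-one-out: indices whose recorded label differs from the prediction."""
    flagged = []
    for i in range(len(docs)):
        model = train_nb(docs[:i] + docs[i + 1:], labels[:i] + labels[i + 1:])
        if predict_nb(model, docs[i]) != labels[i]:
            flagged.append(i)
    return flagged


# Hypothetical toy data; the last report is deliberately mislabeled.
reports = [
    "worker fell from ladder during maintenance",
    "operator fell from scaffold near turbine",
    "technician fell from platform at dam",
    "electrician received electric shock from panel",
    "worker received electric shock from cable",
    "operator received electric shock near transformer",
    "technician fell from ladder at substation",
]
categories = ["fall"] * 3 + ["electrical"] * 4

flagged = flag_inconsistent(reports, categories)
print("Reports to review:", flagged)
```

Flagged indices are only candidates for manual review. On a data set this small, a single mislabeled report can also distort the predictions for correctly labeled ones, so a disagreement may point at the report, the label scheme, or the model.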
Pages: 1088-1100
Number of pages: 13
Related papers
50 in total
  • [11] Harnessing the information contained in low-quality data sources
    Couso, Ines
    Sanchez, Luciano
    INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 2014, 55 (07) : 1485 - 1486
  • [12] Editorial: Special issue on mining low-quality data
    Zhu, Xingquan
    Khoshgoftaar, Taghi M.
    Davidson, Ian
    Zhang, Shichao
    KNOWLEDGE AND INFORMATION SYSTEMS, 2007, 11 (02) : 131 - 136
  • [13] Editorial: Special issue on mining low-quality data
    Xingquan Zhu
    Taghi M. Khoshgoftaar
    Ian Davidson
    Shichao Zhang
    Knowledge and Information Systems, 2007, 11 : 131 - 136
  • [14] RECOVERY OF ALUMINA FROM THE LOW-QUALITY ORES
    SUTYRIN, IE
    DOKLADY AKADEMII NAUK SSSR, 1981, 256 (04) : 920 - 922
  • [15] Lithium extraction from low-quality brines
    Yang, Sixie
    Wang, Yigang
    Pan, Hui
    He, Ping
    Zhou, Haoshen
    NATURE, 2024, 636 (8042) : 309 - 321
  • [16] Ask It Right! Identifying Low-Quality questions on Community Question Answering Services
    Arora, Udit
    Goyal, Nidhi
    Goel, Anmol
    Sachdeva, Niharika
    Kumaraguru, Ponnurangam
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022
  • [17] Low-Quality DanMu Detection via Eye-Tracking Patterns
    Liu, Xiangyang
    He, Weidong
    Xu, Tong
    Chen, Enhong
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, KSEM 2022, PT III, 2022, 13370 : 247 - 259
  • [18] Binarization for low-quality ESPI fringe patterns based on preprocessing and clustering
    Chen, Lei
    Tang, Chen
    Xu, Min
    Lei, Zhenkun
    APPLIED OPTICS, 2021, 60 (31) : 9866 - 9874
  • [19] A Hopfield Neural Network Based Algorithm for Haplotype Assembly from Low-quality Data
    Chen, Xiao
    Peng, Qinke
    Han, Libin
    Wang, Xiao
    PROCEEDINGS OF THE 2014 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2014, : 1328 - 1333
  • [20] A Low-quality Data User Identification Method Based on Blockchain
    Wei, Jiayong
    Zhang, Hua
    Chen, Yuebu
    Xu, Yanxin
    2023 INTERNATIONAL CONFERENCE ON DATA SECURITY AND PRIVACY PROTECTION, DSPP, 2023, : 136 - 142