Cost-sensitive learning for semi-supervised hit-and-run analysis

Cited by: 14
Authors
Zhu, Siying [1]
Wan, Jianwu [1]
Affiliations
[1] Nanyang Technol Univ, Sch Civil & Environm Engn, Singapore, Singapore
Source
Keywords
Hit-and-run; Cost-sensitive; Semi-supervised learning; Imbalanced dataset; Unlabelled data; CRASHES; ACCIDENTS; VEHICLE; BARRIERS; NETWORK; MODEL; ROAD
DOI
10.1016/j.aap.2021.106199
Chinese Library Classification number
TB18 [Ergonomics]
Discipline classification code
1201
Abstract
Hit-and-run crashes not only erode public morality but also delay the medical services provided to victims. Analysing them faces a class imbalance problem, because hit-and-run crashes are far fewer than non-hit-and-run crashes. A missing label problem also arises in crash analysis, for reasons such as data barriers, so the information hidden in unlabelled samples has not been effectively utilised. This paper proposes a cost-sensitive semi-supervised logistic regression (CS3LR) model for hit-and-run analysis that tackles both the class-imbalanced data distribution and the missing label problem, based on the crash dataset of Victoria, Australia (2013-2019). The model performs label estimation with logistic regression, jointly using labelled data and unlabelled data with pseudo labels in a cost-sensitive semi-supervised maximum likelihood framework, and can thereby obtain an unbiased likelihood parameter for hit-and-run prediction and analysis. In experiments against two logistic regression models and seven machine learning methods, the CS3LR model performs better. The most significant contributing factors to hit-and-run crashes extracted by CS3LR with only 10% labelled data show a high degree of consistency with the true contributing factors obtained by supervised cost-sensitive logistic regression with complete hit-and-run labels. The effects of the class-weighted ratio and the hyper-parameter lambda on the performance of the hit-and-run crash prediction model are also analysed. The results can inform policies and countermeasures for preventing hit-and-run collisions and crimes. The proposed methodology can also be applied to crash data with other types of missing labels, such as crash severity.
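The general idea in the abstract (class-weighted logistic regression that also learns from pseudo-labelled unlabelled samples) can be illustrated with a minimal sketch. This is not the authors' CS3LR formulation, which the abstract does not spell out; it is a generic self-training loop under stated assumptions. The names `fit_weighted_lr`, `cost_sensitive_self_training`, `pos_cost` and `lam` are hypothetical, and the role given to `lam` here (down-weighting the pseudo-labelled term) is an assumption, not necessarily the role of the paper's lambda.

```python
import numpy as np

def sigmoid(z):
    # Clip to avoid overflow in exp for extreme scores.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def add_bias(X):
    # Append a constant column so the last coefficient acts as the intercept.
    return np.hstack([X, np.ones((X.shape[0], 1))])

def fit_weighted_lr(X, y, w, n_iter=500, lr=0.5, l2=1e-3):
    """Gradient descent on a sample-weighted, L2-regularised logistic loss."""
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ theta)
        grad = X.T @ (w * (p - y)) / w.sum() + l2 * theta
        theta -= lr * grad
    return theta

def cost_sensitive_self_training(X_lab, y_lab, X_unl,
                                 pos_cost=5.0, lam=0.5, n_rounds=5):
    """Illustrative only: pos_cost up-weights the rare positive class,
    lam (assumed role) scales the contribution of pseudo-labelled samples."""
    Xl, Xu = add_bias(X_lab), add_bias(X_unl)
    w_lab = np.where(y_lab == 1, pos_cost, 1.0)
    # Start from a cost-sensitive fit on the labelled samples alone.
    theta = fit_weighted_lr(Xl, y_lab, w_lab)
    for _ in range(n_rounds):
        # Estimate pseudo labels for the unlabelled samples with the current model.
        y_pseudo = (sigmoid(Xu @ theta) >= 0.5).astype(float)
        w_unl = lam * np.where(y_pseudo == 1, pos_cost, 1.0)
        # Refit on labelled and pseudo-labelled samples together.
        theta = fit_weighted_lr(np.vstack([Xl, Xu]),
                                np.concatenate([y_lab, y_pseudo]),
                                np.concatenate([w_lab, w_unl]))
    return theta

def predict(theta, X):
    return (sigmoid(add_bias(X) @ theta) >= 0.5).astype(int)

# Synthetic, imbalanced toy data (not crash data): about 10% of labels kept.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 2))
y = (X[:, 0] + 0.5 * X[:, 1] > 1.2).astype(float)
lab = np.arange(600) < 60
theta = cost_sensitive_self_training(X[lab], y[lab], X[~lab])
preds = predict(theta, X)
```

Raising `pos_cost` pushes the decision boundary toward predicting the rare class more often, which is the usual motivation for class weighting on imbalanced data; setting `lam=0` reduces the refit to the labelled samples alone.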
Pages: 14
Related papers
  • [11] Cost-sensitive semi-supervised classification using CS-EM
    Qin, Zhenxing
    Zhang, Shichao
    Liu, Li
    Wang, Tao
    2008 IEEE 8TH INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY, VOLS 1 AND 2, 2008, : 131 - +
  • [12] Cost-sensitive semi-supervised ensemble model for customer churn prediction
    Xiao, Jin
    Huang, Lan
    Xie, Ling
    2018 15TH INTERNATIONAL CONFERENCE ON SERVICE SYSTEMS AND SERVICE MANAGEMENT (ICSSSM), 2018,
  • [13] Cost-sensitive semi-supervised selective ensemble model for customer credit scoring
    Xiao, Jin
    Zhou, Xu
    Zhong, Yu
    Xie, Ling
    Gu, Xin
    Liu, Dunhu
    KNOWLEDGE-BASED SYSTEMS, 2020, 189
  • [14] Cost-sensitive semi-supervised deep learning to assess driving risk by application of naturalistic vehicle trajectories
    Hu, Hongyu
    Wang, Qi
    Cheng, Ming
    Gao, Zhenhai
    EXPERT SYSTEMS WITH APPLICATIONS, 2021, 178
  • [15] Cost-sensitive semi-supervised deep learning to assess driving risk by application of naturalistic vehicle trajectories
    Hu, Hongyu
    Wang, Qi
    Cheng, Ming
    Gao, Zhenhai
    Expert Systems with Applications, 2021, 178
  • [16] COSNet: A Cost Sensitive Neural Network for Semi-supervised Learning in Graphs
    Bertoni, Alberto
    Frasca, Marco
    Valentini, Giorgio
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PT I, 2011, 6911 : 219 - 234
  • [17] A Theoretical Analysis of Semi-supervised Learning
    Fujii, Takashi
    Ito, Hidetaka
    Miyoshi, Seiji
    NEURAL INFORMATION PROCESSING, ICONIP 2016, PT II, 2016, 9948 : 28 - 36
  • [18] Analysis of active semi-supervised learning
    Berton, Lilian
    Mitsuishi, Felipe Baz
    Vega-Oliveros, Didier A.
    38TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, SAC 2023, 2023, : 1122 - 1129
  • [19] Cost-Sensitive Learning
    Zhou, Zhi-Hua
    MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE, MDAI 2011, 2011, 6820 : 17 - 18
  • [20] Cost sensitive semi-supervised Laplacian support vector machine
    Wan, Jian-Wu
    Yang, Ming
    Chen, Yin-Juan
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2012, 40 (07): : 1410 - 1415