Taxi drivers' traffic violations detection using random forest algorithm: A case study in China

Cited by: 4
Authors
Wan, Ming [1]
Wu, Qian [1]
Yan, Lixin [1]
Guo, Junhua [1]
Li, Wenxia [1]
Lin, Wei [2]
Lu, Shan [3]
Affiliations
[1] East China Jiaotong Univ, Sch Transportat Engn, 808 Shuanggang East St,Nanchang Econ Dev Zone, Nanchang 330013, Jiangxi, Peoples R China
[2] Traff Adm Bur Nanchang Publ Secur Bur, Nanchang, Jiangxi, Peoples R China
[3] Shenzhen Polytech, Inst Intelligence Sci & Engn, Shenzhen, Peoples R China
Keywords
Taxi drivers' traffic violations; impact factors; imbalanced dataset; Random Forest; SHAP; SAFETY; CLASSIFICATION; EXPERIENCE; TIME;
DOI
10.1080/15389588.2023.2191286
Chinese Library Classification number
R1 [Preventive Medicine, Hygiene];
Subject classification number
1004 ; 120402 ;
Abstract
Objective: To explore the effects of several key factors on taxi drivers' traffic violations and to give traffic management departments an evidence base for decisions that reduce traffic fatalities and injuries.
Methods: A total of 43,458 electronic enforcement records of taxi drivers' traffic violations in Nanchang City, Jiangxi Province, China, from July 1, 2020, to June 30, 2021, were used to characterize the violations. A random forest algorithm was used to predict the severity of taxi drivers' traffic violations, and 11 factors affecting traffic violations, including time, road conditions, environment, and taxi company, were analyzed using the SHapley Additive exPlanations (SHAP) framework.
Results: First, the ensemble method Balanced Bagging Classifier (BBC) was applied to balance the dataset; the imbalance ratio (IR) of the original imbalanced dataset was reduced from 6.61% to 2.60%. Second, a Random Forest model was built to predict the severity of taxi drivers' traffic violations; its accuracy, m_F1, m_G-mean, m_AUC, and m_AP were 0.877, 0.849, 0.599, 0.976, and 0.957, respectively. Compared with Decision Tree, XGBoost, AdaBoost, and Neural Network models, the Random Forest model performed best on these measures. Finally, the SHAP framework was used to improve the interpretability of the model and to identify the important factors affecting taxi drivers' traffic violations. Functional district, location of the violation, and road grade had the highest impact on the probability of traffic violations, with mean SHAP values of 0.39, 0.36, and 0.26, respectively.
Conclusions: The findings may help clarify the relationship between the influencing factors and the severity of traffic violations, and provide a theoretical basis for reducing taxi drivers' traffic violations and improving road safety management.
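The evaluation measures named in the abstract can be illustrated with a short standard-library Python sketch. It is not the authors' code. It rests on assumptions the abstract does not confirm: that the "m_" prefix denotes macro averaging over severity classes, that the per-class G-mean is the square root of sensitivity times specificity, and that the imbalance ratio is the majority-class count divided by the minority-class count. The function names and the toy labels are hypothetical.

```python
from collections import Counter
from math import sqrt


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count (assumed definition)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def one_vs_rest_stats(y_true, y_pred):
    """Return {class: (recall, precision, specificity)}, treating each class as positive in turn."""
    n = len(y_true)
    stats = {}
    for c in sorted(set(y_true)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        tn = n - tp - fn - fp
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        stats[c] = (recall, precision, specificity)
    return stats


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed meaning of m_F1)."""
    stats = one_vs_rest_stats(y_true, y_pred)
    f1s = [2 * r * p / (r + p) if r + p else 0.0 for r, p, _ in stats.values()]
    return sum(f1s) / len(f1s)


def macro_g_mean(y_true, y_pred):
    """Unweighted mean of per-class sqrt(recall * specificity) (assumed meaning of m_G-mean)."""
    stats = one_vs_rest_stats(y_true, y_pred)
    gs = [sqrt(r * s) for r, _, s in stats.values()]
    return sum(gs) / len(gs)


# Toy labels for three hypothetical severity classes (not data from the study).
y_true = [0, 0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2]
print(imbalance_ratio(y_true), accuracy(y_true, y_pred))
print(macro_f1(y_true, y_pred), macro_g_mean(y_true, y_pred))
```

Macro averaging weights every class equally, so a model that neglects a rare severity class is penalized even when overall accuracy is high. That would be one possible reading of why a G-mean can sit well below accuracy on imbalanced data.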
Pages: 362 - 370
Number of pages: 9