Taxi drivers' traffic violations detection using random forest algorithm: A case study in China

Cited by: 4
Authors
Wan, Ming [1]
Wu, Qian [1]
Yan, Lixin [1]
Guo, Junhua [1]
Li, Wenxia [1]
Lin, Wei [2]
Lu, Shan [3]
Affiliations
[1] East China Jiaotong Univ, Sch Transportat Engn, 808 Shuanggang East St,Nanchang Econ Dev Zone, Nanchang 330013, Jiangxi, Peoples R China
[2] Traff Adm Bur Nanchang Publ Secur Bur, Nanchang, Jiangxi, Peoples R China
[3] Shenzhen Polytech, Inst Intelligence Sci & Engn, Shenzhen, Peoples R China
Keywords
Taxi drivers' traffic violations; impact factors; imbalanced dataset; Random Forest; SHAP; SAFETY; CLASSIFICATION; EXPERIENCE; TIME
DOI
10.1080/15389588.2023.2191286
Chinese Library Classification number
R1 [Preventive Medicine, Hygiene]
Discipline classification code
1004; 120402
Abstract
Objective: To explore the impacts of several key factors on taxi drivers' traffic violations and to give traffic management departments a scientific basis for decisions that reduce traffic fatalities and injuries.
Methods: A total of 43,458 electronic enforcement records of taxi drivers' traffic violations in Nanchang City, Jiangxi Province, China, from July 1, 2020, to June 30, 2021, were used to explore the characteristics of traffic violations. A random forest algorithm was used to predict the severity of taxi drivers' traffic violations, and 11 factors affecting traffic violations, including time, road conditions, environment, and taxi companies, were analyzed using the SHapley Additive exPlanations (SHAP) framework.
Results: First, the ensemble method Balanced Bagging Classifier (BBC) was applied to balance the dataset; the imbalance ratio (IR) of the original imbalanced dataset was reduced from 6.61% to 2.60%. Second, a prediction model for the severity of taxi drivers' traffic violations was established using Random Forest, which achieved accuracy, m_F1, m_G-mean, m_AUC, and m_AP of 0.877, 0.849, 0.599, 0.976, and 0.957, respectively. Compared with the Decision Tree, XGBoost, AdaBoost, and Neural Network algorithms, the Random Forest model had the best performance measures. Finally, the SHAP framework was used to improve the interpretability of the model and identify important factors affecting taxi drivers' traffic violations. Functional district, location of the violation, and road grade had a high impact on the probability of traffic violations, with mean SHAP values of 0.39, 0.36, and 0.26, respectively.
Conclusions: The findings of this paper may help to reveal the relationship between the influencing factors and the severity of traffic violations, and provide a theoretical basis for reducing taxi drivers' traffic violations and improving road safety management.
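The abstract reports an imbalance ratio and the macro-style measures m_F1 and m_G-mean without defining them. The following is a minimal sketch of how such quantities are commonly computed, not the authors' implementation. It assumes that IR is the majority-class count divided by the minority-class count, that the `m_` prefix denotes an unweighted average over classes, and that the per-class G-mean is the geometric mean of sensitivity and specificity. The function names and the toy labels `minor` and `serious` are hypothetical.

```python
from collections import Counter
from math import sqrt


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count (assumed definition)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def one_vs_rest_counts(y_true, y_pred, cls):
    """Return TP, FP, FN, TN for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == cls and p == cls for t, p in pairs)
    fp = sum(t != cls and p == cls for t, p in pairs)
    fn = sum(t == cls and p != cls for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    return tp, fp, fn, tn


def macro_f1_and_gmean(y_true, y_pred):
    """Unweighted mean over classes of per-class F1 and per-class G-mean."""
    classes = sorted(set(y_true))
    f1_scores, g_means = [], []
    for cls in classes:
        tp, fp, fn, tn = one_vs_rest_counts(y_true, y_pred, cls)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
        g_means.append(sqrt(recall * specificity))
    return sum(f1_scores) / len(classes), sum(g_means) / len(classes)


# Hypothetical toy data: six "minor" and two "serious" violations.
y_true = ["minor"] * 6 + ["serious"] * 2
y_pred = ["minor"] * 5 + ["serious"] + ["serious", "minor"]

ir = imbalance_ratio(y_true)
m_f1, m_gmean = macro_f1_and_gmean(y_true, y_pred)
print(f"IR={ir:.2f}  m_F1={m_f1:.3f}  m_G-mean={m_gmean:.3f}")
```

Because both averages weight every class equally, a model that neglects the minority class is penalized even when overall accuracy is high, which is why such measures are usually preferred on imbalanced data.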
Pages: 362-370
Number of pages: 9
Related papers
50 in total
  • [21] Environmental Fire Hazard Detection and Prediction using Random Forest Algorithm
    Thakkar, Ranak
    Abhyankar, Varad
    Reddy, Polaka Divya
    Prakash, Surya
    2022 International Conference for Advancement in Technology, ICONAT 2022, 2022
  • [22] The Random Forest based Detection of Shadowsock's Traffic
    Deng, Ziye
    Liu, Zihan
    Chen, Zhouguo
    Guo, Yubin
    2017 NINTH INTERNATIONAL CONFERENCE ON INTELLIGENT HUMAN-MACHINE SYSTEMS AND CYBERNETICS (IHMSC 2017), VOL 2, 2017, : 75 - 78
  • [23] Forecasting road traffic conditions using a context-based random forest algorithm
    Evans, Jonny
    Waterson, Ben
    Hamilton, Andrew
    TRANSPORTATION PLANNING AND TECHNOLOGY, 2019, 42 (06) : 554 - 572
  • [24] Taxi Drivers and Taxidars: A Case Study of Uber and Ola in Delhi
    Kashyap, Rina
    Bhatia, Anjali
    JOURNAL OF DEVELOPING SOCIETIES, 2018, 34 (02) : 169 - 194
  • [25] Traffic Rule Violations of Private Bus Drivers and Bus Crashes in Sri Lanka: A Case-Control Study
    Jayatilleke, Achala Upendra
    Poudel, Krishna C.
    Nakahara, Shinji
    Dharmaratne, Samath D.
    Jayatilleke, Achini Chinthika
    Jimba, Masamine
    TRAFFIC INJURY PREVENTION, 2010, 11 (03) : 263 - 269
  • [26] Identifying Forest Fire Driving Factors and Related Impacts in China Using Random Forest Algorithm
    Ma, Wenyuan
    Feng, Zhongke
    Cheng, Zhuxin
    Chen, Shilin
    Wang, Fengge
    FORESTS, 2020, 11 (05)
  • [27] Application of Random Forest Algorithm on Tornado Detection
    Zeng, Qiangyu
    Qing, Zhipeng
    Zhu, Ming
    Zhang, Fugui
    Wang, Hao
    Liu, Yin
    Shi, Zhao
    Yu, Qiu
    REMOTE SENSING, 2022, 14 (19)
  • [28] An Improved LandTrendr Algorithm for Forest Disturbance Detection Using Optimized Temporal Trajectories of the Spectrum: A Case Study in Yunnan Province, China
    He, Li
    Hong, Liang
    Zhu, A-Xing
    FORESTS, 2024, 15 (09)
  • [29] Identifying older drivers at risk of traffic violations by using a driving simulator: A 3-year longitudinal study
    Lee, HC
    Lee, AH
    AMERICAN JOURNAL OF OCCUPATIONAL THERAPY, 2005, 59 (01) : 97 - 100
  • [30] Intelligence Detection and Identification of Traffic Rule Violations Using a Drone
    Kim N.
    Lee K.
    Journal of Institute of Control, Robotics and Systems, 2022, 28 (12) : 1127 - 1132