An integrated data- and theory-driven crash severity model

Times cited: 2
Authors
Liu, Dongjie [1]
Li, Dawei [1, 2, 3]
Sze, N. N. [4]
Ding, Hongliang [4, 5]
Song, Yuchen [1]
Affiliations
[1] Southeast Univ, Sch Transportat, Nanjing 211189, Jiangsu, Peoples R China
[2] Southeast Univ, Jiangsu Key Lab Urban ITS, Nanjing 211189, Jiangsu, Peoples R China
[3] Jiangsu Prov Collaborat Innovat Ctr Modern Urban T, Nanjing 211189, Jiangsu, Peoples R China
[4] Hong Kong Polytech Univ, Dept Civil & Environm Engn, Hong Kong, Peoples R China
[5] Southwest Jiaotong Univ, Inst Smart City & Intelligent Transportta, Inst Urban Rail Transportat, Chengdu 611756, Sichuan, Peoples R China
Source
Funding
National Natural Science Foundation of China;
Keywords
Logit model; Crash severity; Embedding representations; Data- and theory-driven model; Interpretable machine learning; DISCRETE-CHOICE MODELS; DEEP NEURAL-NETWORKS; INJURY SEVERITY; MULTINOMIAL LOGIT; ACCIDENT SEVERITY; ROLLOVER CRASHES; VEHICLE; REPRESENTATION; HIGHWAYS;
DOI
10.1016/j.aap.2023.107282
Chinese Library Classification number
TB18 [Ergonomics];
Discipline classification code
1201;
Abstract
For crash severity modeling, researchers typically view theory-driven models and data-driven models as different or even conflicting approaches. The reason is that data-driven machine-learning models offer good predictive performance but weak interpretability, while theory-driven models offer robust interpretability but only moderate predictive performance. To alleviate the tension between them, this study proposes an integrated data- and theory-driven crash-severity model, the Embedded Fusion model based on Text Vector Representations (TVR-EF), which leverages the complementary strengths of both. The model specification consists of two parts. (i) The data-driven component mitigates a deficiency of traditional econometric models, in which the commonly used one-hot encoding makes it impossible to observe semantic relatedness between variable categories, and also enhances the interpretability of the relationship between crash severity and potential influencing factors through the learned embedding weight matrix. (ii) In the theory-driven component, the multinomial logit model is implemented as a 2D Convolutional Neural Network (2D-CNN) to increase flexibility and decrease dependency on prior knowledge for different crash-severity outcomes. A crash dataset from Guangdong Province, China, is used to estimate the TVR-EF model, which is then benchmarked against two traditional econometric models and three widely used machine-learning models. Results indicate that the TVR-EF model not only improves predictive performance but is also easier to interpret.
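The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' TVR-EF implementation: the abstract does not give the network layout, so the code only shows (a) that an embedding lookup is the same operation as multiplying a one-hot vector by a weight matrix, and (b) a plain multinomial logit written as a linear layer followed by a softmax. All names and numbers (`EMBEDDING`, `BETAS`, `ASC`, the three road-type categories, the three severity outcomes) are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical embedding matrix: 3 road-type categories, 2 dimensions each.
# In a trained model these weights would be learned; here they are made up.
EMBEDDING = [
    [0.20, -0.10],   # category 0
    [0.25, -0.05],   # category 1 (placed near category 0: "similar" categories)
    [-0.60, 0.90],   # category 2
]

def embed(category_index, embedding_matrix):
    """Look up the dense vector for one category."""
    return list(embedding_matrix[category_index])

def one_hot_times_matrix(category_index, embedding_matrix):
    """Same result as embed(), computed as one-hot vector times matrix."""
    n_categories = len(embedding_matrix)
    dim = len(embedding_matrix[0])
    one_hot = [1.0 if i == category_index else 0.0 for i in range(n_categories)]
    return [
        sum(one_hot[i] * embedding_matrix[i][d] for i in range(n_categories))
        for d in range(dim)
    ]

def softmax(utilities):
    """Numerically stable softmax over a list of utilities."""
    shift = max(utilities)
    exps = [math.exp(u - shift) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

def severity_probabilities(features, betas, constants):
    """Multinomial logit: V_j = constants[j] + betas[j] . features, P = softmax(V)."""
    utilities = [
        c + sum(b * x for b, x in zip(beta_j, features))
        for beta_j, c in zip(betas, constants)
    ]
    return softmax(utilities)

# Hypothetical outcome-specific coefficients for 3 severity outcomes;
# the first outcome is the reference level with all-zero coefficients.
BETAS = [[0.0, 0.0], [0.5, 1.0], [-1.0, 2.0]]
ASC = [0.0, -0.5, -1.0]

x = embed(2, EMBEDDING)
probs = severity_probabilities(x, BETAS, ASC)
```

Unlike one-hot vectors, which are all equally distant from one another, the rows of an embedding matrix can lie close together for related categories, which is the property the abstract refers to as semantic relatedness.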
Pages: 20
Related papers
50 in total
  • [1] Data- and theory-driven design of metastable materials for energy conversion
    Holder, Aaron
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 2019, 257
  • [2] Data- and theory-driven approaches for understanding paths of epithelial-mesenchymal transition
    Hong, Tian
    Xing, Jianhua
    GENESIS, 2024, 62 (02)
  • [3] A theory-driven model of handshape similarity
    Keane, Jonathan
    Sehyr, Zed Sevcikova
    Emmorey, Karen
    Brentari, Diane
    PHONOLOGY, 2017, 34 (02) : 221 - 241
  • [4] Data- and model-driven multiresolution processing
    Califano, A
    Kjeldsen, R
    Bolle, RM
    COMPUTER VISION AND IMAGE UNDERSTANDING, 1996, 63 (01) : 27 - 49
  • [5] Discovering data: A neglected virtue of theory-driven research
    Stark, R
    SOCIOLOGICAL THEORY AND METHODS, 2001, 16 (01) : 19 - 29
  • [6] Toward Diabetes Device Development That Is Mindful to the Needs of Young People Living With Type 1 Diabetes: A Data- and Theory-Driven Qualitative Study
    Brew-Sam, Nicola
    Parkinson, Anne
    Chhabra, Madhur
    Henschke, Adam
    Brown, Ellen
    Pedley, Lachlan
    Pedley, Elizabeth
    Hannan, Kristal
    Brown, Karen
    Wright, Kristine
    Phillips, Christine
    Tricoli, Antonio
    Nolan, Christopher J.
    Suominen, Hanna
    Desborough, Jane
    JMIR DIABETES, 2023, 8
  • [7] A data- and model-driven approach for cancer treatment
    Schade, Sophia
    Ogilvie, Lesley A.
    Kessler, Thomas
    Schuette, Moritz
    Wierling, Christoph
    Lange, Bodo M.
    Lehrach, Hans
    Yaspo, Marie-Laure
    ONKOLOGE, 2019, 25 (Suppl 2) : 132 - 137
  • [8] A data- and model-driven approach for cancer treatment
    Sophia Schade
    Lesley A. Ogilvie
    Thomas Kessler
    Moritz Schütte
    Christoph Wierling
    Bodo M. Lange
    Hans Lehrach
    Marie-Laure Yaspo
    Der Onkologe, 2019, 25 : 132 - 137
  • [9] A THEORY-DRIVEN MODEL OF COMMUNITY COLLEGE STUDENT ENGAGEMENT
    Schuetz, Pam
    COMMUNITY COLLEGE JOURNAL OF RESEARCH AND PRACTICE, 2008, 32 (4-6) : 305 - 324
  • [10] Data-Driven Pedestrian Model: From OpenCV to NetLogo
    Prochazka, Jan
    Olsevicova, Kamila
    COMPUTATIONAL COLLECTIVE INTELLIGENCE: TECHNOLOGIES AND APPLICATIONS, ICCCI 2014, 2014, 8733 : 322 - 331