A Lightweight and Efficient Distracted Driver Detection Model Fusing Convolutional Neural Network and Vision Transformer

Cited by: 0
Authors
Li, Zhao [1]
Zhao, Xia [2]
Wu, Fuwei [1]
Chen, Dan [3]
Wang, Chang [1]
Affiliations
[1] Changan Univ, Sch Automobile, Xian 710061, Peoples R China
[2] Jiangsu Univ, Automot Engn Res Inst, Zhenjiang 212013, Peoples R China
[3] Changan Univ, Sch Elect & Control Engn, Xian 710061, Peoples R China
Keywords
Feature extraction; Vehicles; Transformers; Convolution; Computational modeling; Accuracy; Convolutional neural networks; Traffic safety; distracted driver recognition; convolutional neural network; vision transformer; lightweight model; multi-scale feature; RECOGNITION; FRAMEWORK;
DOI
10.1109/TITS.2024.3447041
Chinese Library Classification (CLC) number
TU [Building Science];
Discipline classification code
0813;
Abstract
Identifying distracted drivers is crucial for enhancing driving safety and advancing intelligent driver assistance systems. Recently, researchers have applied Convolutional Neural Network (CNN) and Vision Transformer (ViT) models to driver state recognition. However, both kinds of model often suffer from issues such as large parameter counts and low detection efficiency. To address these challenges, this study proposes the Convolution Vision Transformer (CoViT) model for distracted driver identification, leveraging techniques such as the Low Complexity Attention Mechanism (LCAM), Multi-scale Dilation Convolution (MSDC), and Depth Separable Convolution (DSC). Moreover, the CoViT model features a typical "pyramid" structure, enabling effective feature extraction across different scales. The proposed system is trained and evaluated using the publicly available driving behavior datasets SFD2 and 100-Driver, as well as real-world road experiments. Experimental results show that the CoViT model yields high recognition performance, with mean Accuracy (mAcc) scores of 95.17%, 97.89%, and 93.54% on the recorded dataset, SFD2 dataset, and 100-Driver dataset, respectively. These scores surpass those obtained by similar lightweight models. Furthermore, ablation experiments reveal that deep and dilated convolution significantly enhance model performance. In addition, the CoViT model demonstrates its applicability to real-time driving behavior detection tasks, with a parameter count of just 1.24M (a reduction of 2.67M compared to MobileNetV3) and an online inference speed of 159.13 Frames Per Second (FPS).
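As a rough illustration of why the techniques named in the abstract tend to shrink a model, the sketch below counts weights for a standard convolution versus a depthwise separable one and shows how dilation widens a kernel's span. The layer sizes (64 input channels, 128 output channels, 3 x 3 kernel) and the function names are hypothetical and chosen only for illustration; they are not taken from the paper, and this is not the CoViT architecture itself. The last two lines only restate arithmetic implied by the figures quoted above.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def dsc_params(c_in, c_out, k):
    """Weight count of a depthwise separable convolution (bias omitted):
    one k x k filter per input channel, then a 1 x 1 pointwise convolution."""
    depthwise = c_in * k * k
    pointwise = c_in * c_out
    return depthwise + pointwise


def effective_kernel(k, dilation):
    """Span covered by a k-tap kernel at the given dilation rate."""
    return dilation * (k - 1) + 1


if __name__ == "__main__":
    # Hypothetical layer sizes, not from the paper.
    c_in, c_out, k = 64, 128, 3
    print("standard conv weights:", conv_params(c_in, c_out, k))
    print("separable conv weights:", dsc_params(c_in, c_out, k))

    # Same 3-tap kernel, wider span, no extra weights.
    for d in (1, 2, 3):
        print("dilation", d, "-> span", effective_kernel(k, d))

    # Arithmetic implied by the abstract's figures.
    print("implied MobileNetV3 size (M):", round(1.24 + 2.67, 2))
    print("time per frame (ms):", round(1000 / 159.13, 2))
```

With these assumed sizes, the separable layer needs 8,768 weights against 73,728 for the standard one, and the quoted 159.13 FPS corresponds to roughly 6.28 ms per frame.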
Pages: 19962-19978
Page count: 17