A Lightweight and Efficient Distracted Driver Detection Model Fusing Convolutional Neural Network and Vision Transformer

Cited by: 0
Authors
Li, Zhao [1]
Zhao, Xia [2]
Wu, Fuwei [1]
Chen, Dan [3]
Wang, Chang [1]
Affiliations
[1] Changan Univ, Sch Automobile, Xian 710061, Peoples R China
[2] Jiangsu Univ, Automot Engn Res Inst, Zhenjiang 212013, Peoples R China
[3] Changan Univ, Sch Elect & Control Engn, Xian 710061, Peoples R China
Keywords
Feature extraction; Vehicles; Transformers; Convolution; Computational modeling; Accuracy; Convolutional neural networks; Traffic safety; distracted driver recognition; convolutional neural network; vision transformer; lightweight model; multi-scale feature; RECOGNITION; FRAMEWORK;
DOI
10.1109/TITS.2024.3447041
Chinese Library Classification number
TU [Building Science];
Discipline classification code
0813;
Abstract
Identifying distracted drivers is crucial for enhancing driving safety and advancing intelligent driver assistance systems. Recently, researchers have applied Convolutional Neural Network (CNN) and Vision Transformer (ViT) models for driver state decision. However, both models often suffer from several issues such as numerous parameters and low detection efficiency. To address these challenges, this study proposes the Convolution Vision Transformer (CoViT) model for distracted driver identification, leveraging techniques such as Low Complexity Attention Mechanism (LCAM), Multi-scale Dilation Convolution (MSDC), and Depth Separable Convolution (DSC). Moreover, the CoViT model features a typical "pyramid" structure, enabling effective feature extraction across different scales. Subsequently, the proposed system is trained and evaluated using the publicly available driving behavior datasets SFD2 and 100-Driver, as well as real-world road experiments. Experimental results show that the CoViT model yields high recognition performance, with mean Accuracy (mAcc) scores of 95.17%, 97.89%, and 93.54% on the recorded dataset, SFD2 dataset, and 100-Driver dataset, respectively. These scores surpass those obtained by similar lightweight models. Furthermore, ablation experiments reveal that deep and dilated convolution significantly enhance model performance. In addition, the CoViT model demonstrates its applicability to real-time driving behavior detection tasks, with a parameter count of just 1.24M - a reduction of 2.67M compared to MobileNetV3 - and an online inference Frames Per Second (FPS) of 159.13.
Pages: 19962-19978
Page count: 17
Related papers
50 in total
  • [41] An efficient intrusion detection model based on convolutional spiking neural network
    Wang, Zhen
    Ghaleb, Fuad A.
    Zainal, Anazida
    Siraj, Maheyzah Md
    Lu, Xing
    SCIENTIFIC REPORTS, 2024, 14 (01)
  • [42] A lightweight convolutional transformer neural network for EEG-based depression recognition
    Hou, Pengfei
    Li, Xiaowei
    Zhu, Jing
    Hu, Bin
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2025, 100
  • [43] CloudViT: A Lightweight Vision Transformer Network for Remote Sensing Cloud Detection
    Zhang, Bin
    Zhang, Yongjun
    Li, Yansheng
    Wan, Yi
    Yao, Yongxiang
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2023, 20
  • [45] Efficient Pupil Detection with a Convolutional Neural Network
    Miron, Casian
    Pasarica, Alexandra
    Bozomitu, Radu Gabriel
    Manta, Vasile
    Timofte, Radu
    Ciucu, Radu
    2019 E-HEALTH AND BIOENGINEERING CONFERENCE (EHB), 2019
  • [46] A Deep Learning-Based Intrusion Detection Model Integrating Convolutional Neural Network and Vision Transformer for Network Traffic Attack in the Internet of Things
    Du, Chunlai
    Guo, Yanhui
    Zhang, Yuhang
    ELECTRONICS, 2024, 13 (14)
  • [47] Designing a Lightweight Convolutional Neural Network for Camouflaged Object Detection
    Gonzales, Mark Edward M.
    Ibrahim, Hans Oswald A.
    Ong, Elyssia Barrie H.
    Laguna, Ann Franchesca B.
    2024 IEEE 48TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE, COMPSAC 2024, 2024, : 179 - 187
  • [48] Driver Distraction Detection using Single Convolutional Neural Network
    Kim, Whui
    Choi, Hyun-Kyun
    Jang, Byung-Tae
    Lim, Jinsu
    2017 INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGY CONVERGENCE (ICTC), 2017, : 1203 - 1205
  • [49] A LIGHTWEIGHT CONVOLUTIONAL NEURAL NETWORK FOR BITEMPORAL IMAGE CHANGE DETECTION
    Wang, Rongfang
    Ding, Fan
    Chen, Jia-Wei
    Jiao, Licheng
    Wang, Liang
    IGARSS 2020 - 2020 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2020, : 2551 - 2554
  • [50] FlameNet: a lightweight convolutional neural network for flame detection and localisation
    Hu, Xing
    Li, Mei
    Zhang, Dawei
    INTERNATIONAL JOURNAL OF VEHICLE DESIGN, 2023, 91 (1-3) : 87 - 106