A Lightweight and Efficient Distracted Driver Detection Model Fusing Convolutional Neural Network and Vision Transformer

Times cited: 0
Authors
Li, Zhao [1]
Zhao, Xia [2]
Wu, Fuwei [1]
Chen, Dan [3]
Wang, Chang [1]
Affiliations
[1] Changan Univ, Sch Automobile, Xian 710061, Peoples R China
[2] Jiangsu Univ, Automot Engn Res Inst, Zhenjiang 212013, Peoples R China
[3] Changan Univ, Sch Elect & Control Engn, Xian 710061, Peoples R China
Keywords
Feature extraction; Vehicles; Transformers; Convolution; Computational modeling; Accuracy; Convolutional neural networks; Traffic safety; distracted driver recognition; convolutional neural network; vision transformer; lightweight model; multi-scale feature; RECOGNITION; FRAMEWORK;
DOI
10.1109/TITS.2024.3447041
Chinese Library Classification (CLC) number
TU [Building Science];
Discipline classification code
0813;
Abstract
Identifying distracted drivers is crucial for enhancing driving safety and advancing intelligent driver assistance systems. Recently, researchers have applied Convolutional Neural Network (CNN) and Vision Transformer (ViT) models to driver state recognition. However, both model families often suffer from large parameter counts and low detection efficiency. To address these challenges, this study proposes the Convolution Vision Transformer (CoViT) model for distracted driver identification, which combines a Low Complexity Attention Mechanism (LCAM), Multi-scale Dilation Convolution (MSDC), and Depth Separable Convolution (DSC). The CoViT model also has a typical "pyramid" structure, enabling effective feature extraction across different scales. The proposed model is trained and evaluated on the publicly available driving behavior datasets SFD2 and 100-Driver, as well as on data from real-world road experiments. Experimental results show that the CoViT model achieves high recognition performance, with mean Accuracy (mAcc) scores of 95.17%, 97.89%, and 93.54% on the recorded dataset, the SFD2 dataset, and the 100-Driver dataset, respectively. These scores surpass those obtained by similar lightweight models. Furthermore, ablation experiments reveal that deep and dilated convolution significantly enhance model performance. In addition, the CoViT model is applicable to real-time driving behavior detection, with a parameter count of just 1.24M (a reduction of 2.67M compared to MobileNetV3) and an online inference speed of 159.13 Frames Per Second (FPS).
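The abstract attributes the small parameter count to depth separable and dilated convolutions. The following is a minimal sketch of the general arithmetic behind those two techniques, not the CoViT implementation: the function names, the channel sizes (64 in, 128 out), and the dilation rates (1, 2, 3) are assumptions chosen for illustration and do not come from the paper. The figures quoted in the abstract also imply a MobileNetV3 baseline of about 1.24M + 2.67M = 3.91M parameters.

```python
# Illustrative parameter arithmetic only; layer sizes are hypothetical,
# not taken from the CoViT paper.

def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def dilated_kernel_extent(k: int, dilation: int) -> int:
    """Spatial extent covered by a k-tap kernel at a given dilation rate."""
    return dilation * (k - 1) + 1


if __name__ == "__main__":
    c_in, c_out, k = 64, 128, 3  # hypothetical layer shape
    print("standard conv weights:", standard_conv_params(c_in, c_out, k))
    print("depthwise separable weights:", depthwise_separable_params(c_in, c_out, k))
    # Parallel branches with different dilation rates see different scales
    # while each branch keeps the same number of weights.
    for d in (1, 2, 3):
        print(f"dilation {d}: extent {dilated_kernel_extent(k, d)}")
```

For this hypothetical layer the separable form needs 8,768 weights against 73,728 for the standard convolution, and a 3-tap kernel covers 3, 5, and 7 pixels at dilation rates 1, 2, and 3, which is the general idea behind extracting multi-scale features without adding weights per branch.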
Pages: 19962-19978
Page count: 17
Related papers
50 in total
  • [21] Towards Computationally Efficient and Realtime Distracted Driver Detection With MobileVGG Network
    Baheti, Bhakti
    Talbar, Sanjay
    Gajre, Suhas
    IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2020, 5 (04): 565-574
  • [22] Transformer with difference convolutional network for lightweight universal boundary detection
    Li, Mingchun
    Liu, Yang
    Chen, Dali
    Chen, Liangsheng
    Liu, Shixin
    PLOS ONE, 2024, 19 (04)
  • [23] CVTrack: Combined Convolutional Neural Network and Vision Transformer Fusion Model for Visual Tracking
    Wang, Jian
    Song, Yueming
    Song, Ce
    Tian, Haonan
    Zhang, Shuai
    Sun, Jinghui
    SENSORS, 2024, 24 (01)
  • [24] Lightweight Object Detection Network Based on Convolutional Neural Network
    Cheng Yequn
    Yan, Wang
    Fan Yuying
    Li Baoqing
    LASER & OPTOELECTRONICS PROGRESS, 2021, 58 (16)
  • [25] Driver distraction detection using semi-supervised lightweight vision transformer
    Mohammed, Adam A. Q.
    Geng, Xin
    Wang, Jing
    Ali, Zafar
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 129
  • [26] Res-MGCA-SE: a lightweight convolutional neural network based on vision transformer for medical image classification
    Soleimani-Fard S.
    Ko S.-B.
    Neural Computing and Applications, 2024, 36 (28): 17631-17644
  • [27] Efficient J Peak Detection From Ballistocardiogram Using Lightweight Convolutional Neural Network
    Huang, Yongfeng
    Jin, Tianchen
    Sun, Chenxi
    Li, Xueyang
    Yang, Shuchen
    Zhang, Zhiming
    2021 43RD ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE & BIOLOGY SOCIETY (EMBC), 2021: 269-272
  • [28] Lane Detection Based on a Lightweight Convolutional Neural Network
    Hu Jie
    Xiong Zongquan
    Xu Wencai
    Cao Kai
    Lu Ruoyu
    LASER & OPTOELECTRONICS PROGRESS, 2022, 59 (10)
  • [29] Lightweight Convolutional Neural Network Model for Human Face Detection in Risk Situations
    Wieczorek, Michal
    Silka, Jakub
    Wozniak, Marcin
    Garg, Sahil
    Hassan, Mohammad Mehedi
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2022, 18 (07): 4820-4829
  • [30] A Lightweight Convolutional Neural Network Flame Detection Algorithm
    Li, Wenzheng
    Yu, Zongyang
    PROCEEDINGS OF 2021 IEEE 11TH INTERNATIONAL CONFERENCE ON ELECTRONICS INFORMATION AND EMERGENCY COMMUNICATION (ICEIEC 2021), 2021: 83-86