A Lightweight and Efficient Distracted Driver Detection Model Fusing Convolutional Neural Network and Vision Transformer

Cited by: 0
Authors
Li, Zhao [1]
Zhao, Xia [2]
Wu, Fuwei [1]
Chen, Dan [3]
Wang, Chang [1]
Affiliations
[1] Changan Univ, Sch Automobile, Xian 710061, Peoples R China
[2] Jiangsu Univ, Automot Engn Res Inst, Zhenjiang 212013, Peoples R China
[3] Changan Univ, Sch Elect & Control Engn, Xian 710061, Peoples R China
Keywords
Feature extraction; Vehicles; Transformers; Convolution; Computational modeling; Accuracy; Convolutional neural networks; Traffic safety; distracted driver recognition; convolutional neural network; vision transformer; lightweight model; multi-scale feature; RECOGNITION; FRAMEWORK;
DOI
10.1109/TITS.2024.3447041
Chinese Library Classification
TU [Architectural Science]
Discipline Classification Code
0813
Abstract
Identifying distracted drivers is crucial for enhancing driving safety and advancing intelligent driver assistance systems. Recently, researchers have applied Convolutional Neural Network (CNN) and Vision Transformer (ViT) models for driver state decision. However, both models often suffer from several issues such as numerous parameters and low detection efficiency. To address these challenges, this study proposes the Convolution Vision Transformer (CoViT) model for distracted driver identification, leveraging techniques such as Low Complexity Attention Mechanism (LCAM), Multi-scale Dilation Convolution (MSDC), and Depth Separable Convolution (DSC). Moreover, the CoViT model features a typical "pyramid" structure, enabling effective feature extraction across different scales. Subsequently, the proposed system is trained and evaluated using the publicly available driving behavior datasets SFD2 and 100-Driver, as well as real-world road experiments. Experimental results show that the CoViT model yields high recognition performance, with mean Accuracy (mAcc) scores of 95.17%, 97.89%, and 93.54% on the recorded dataset, SFD2 dataset, and 100-Driver dataset, respectively. These scores surpass those obtained by similar lightweight models. Furthermore, ablation experiments reveal that deep and dilated convolution significantly enhance model performance. In addition, the CoViT model demonstrates its applicability to real-time driving behavior detection tasks, with a parameter count of just 1.24M - a reduction of 2.67M compared to MobileNetV3 - and an online inference speed of 159.13 Frames Per Second (FPS).
Pages: 19962-19978
Page count: 17
Related papers
50 in total
  • [1] Convolutional Neural Network or Vision Transformer? Benchmarking Various Machine Learning Models for Distracted Driver Detection
    Koay, Hong Vin
    Chuah, Joon Huang
    Chow, Chee-Onn
    2021 IEEE REGION 10 CONFERENCE (TENCON 2021), 2021, : 417 - 422
  • [2] Detection of Distracted Driver using Convolutional Neural Network
    Baheti, Bhakti
    Gajre, Suhas
    Talbar, Sanjay
    PROCEEDINGS 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW), 2018, : 1145 - 1151
  • [3] Distracted driver detection using compressed energy efficient convolutional neural network
    Alzubi, Jafar A.
    Jain, Rachna
    Alzubi, Omar
    Thareja, Anuj
    Upadhyay, Yash
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2022, 42 (02) : 1253 - 1265
  • [4] A lightweight model combining convolutional neural network and Transformer for driver distraction recognition
    Tang, Xuexi
    Chen, Yan
    Ma, Yifan
    Yang, Wenxuan
    Zhou, Houpan
    Huang, Jingzhou
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 132
  • [5] Plug-and-play adapter for fusing convolutional neural network with vision transformer
    Chen, Bin
    Fan, Xianlian
    Wu, Shiqian
    JOURNAL OF ELECTRONIC IMAGING, 2024, 33 (05)
  • [6] Distracted driver detection using convolutional neural networks based segmentation model
    Khellal, Atmane
    Boulahmar, Mehrez
    Bahi, Abdelhak
    Nemra, Abdelkrim
    PROGRAM OF THE 2ND INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING AND AUTOMATIC CONTROL, ICEEAC 2024, 2024,
  • [7] Driver Fatigue and Distracted Driving Detection Using Random Forest and Convolutional Neural Network
    Dong, Bing-Ting
    Lin, Huei-Yung
    Chang, Chin-Chen
    APPLIED SCIENCES-BASEL, 2022, 12 (17):
  • [8] A lightweight model for distracted driver detection based on neural architecture search and coordinate attention
    Sun, Haibin
    Zhang, Mengting
    COMPUTERS & ELECTRICAL ENGINEERING, 2025, 123
  • [9] Shifted-Window Hierarchical Vision Transformer for Distracted Driver Detection
    Koay, Hong Vin
    Chuah, Joon Huang
    Chow, Chee-Onn
    2021 IEEE REGION 10 SYMPOSIUM (TENSYMP), 2021,
  • [10] A forest fire smoke detection model combining convolutional neural network and vision transformer
    Zheng, Ying
    Zhang, Gui
    Tan, Sanqing
    Yang, Zhigao
    Wen, Dongxin
    Xiao, Huashun
    FRONTIERS IN FORESTS AND GLOBAL CHANGE, 2023, 6