A Lightweight and Efficient Distracted Driver Detection Model Fusing Convolutional Neural Network and Vision Transformer

Cited by: 0
Authors
Li, Zhao [1]
Zhao, Xia [2]
Wu, Fuwei [1]
Chen, Dan [3]
Wang, Chang [1]
Affiliations
[1] Changan Univ, Sch Automobile, Xian 710061, Peoples R China
[2] Jiangsu Univ, Automot Engn Res Inst, Zhenjiang 212013, Peoples R China
[3] Changan Univ, Sch Elect & Control Engn, Xian 710061, Peoples R China
Keywords
Feature extraction; Vehicles; Transformers; Convolution; Computational modeling; Accuracy; Convolutional neural networks; Traffic safety; distracted driver recognition; convolutional neural network; vision transformer; lightweight model; multi-scale feature; RECOGNITION; FRAMEWORK;
DOI
10.1109/TITS.2024.3447041
Chinese Library Classification (CLC) number
TU [Building Science];
Discipline classification code
0813;
Abstract
Identifying distracted drivers is crucial for enhancing driving safety and advancing intelligent driver assistance systems. Recently, researchers have applied Convolutional Neural Network (CNN) and Vision Transformer (ViT) models to driver state recognition. However, both kinds of model often suffer from issues such as large parameter counts and low detection efficiency. To address these challenges, this study proposes the Convolution Vision Transformer (CoViT) model for distracted driver identification, leveraging techniques such as a Low Complexity Attention Mechanism (LCAM), Multi-scale Dilation Convolution (MSDC), and Depth Separable Convolution (DSC). Moreover, the CoViT model features a typical "pyramid" structure, enabling effective feature extraction across different scales. The proposed system is trained and evaluated using the publicly available driving behavior datasets SFD2 and 100-Driver, as well as real-world road experiments. Experimental results show that the CoViT model yields high recognition performance, with mean Accuracy (mAcc) scores of 95.17%, 97.89%, and 93.54% on the recorded dataset, SFD2 dataset, and 100-Driver dataset, respectively. These scores surpass those obtained by similar lightweight models. Furthermore, ablation experiments reveal that deep and dilated convolution significantly enhance model performance. In addition, the CoViT model demonstrates its applicability to real-time driving behavior detection tasks, with a parameter count of just 1.24M (2.67M fewer than MobileNetV3) and an online inference speed of 159.13 Frames Per Second (FPS).
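The abstract credits depth separable convolution and dilated convolution for the small model size and multi-scale features. As a rough illustration of why those two techniques help, the sketch below counts weights for a standard versus a depthwise separable convolution and computes the span covered by a dilated kernel. The function names and the layer sizes (64 input channels, 128 output channels, 3 x 3 kernel) are hypothetical and chosen only for illustration; they are not taken from the CoViT architecture.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias terms omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k convolution (one filter per input
    channel) followed by a 1 x 1 pointwise convolution (bias omitted)."""
    depthwise = c_in * k * k
    pointwise = c_in * c_out
    return depthwise + pointwise


def effective_kernel_size(k: int, dilation: int) -> int:
    """Span (in pixels, per axis) covered by a k-tap kernel at a given
    dilation rate; dilation adds no weights, only spacing."""
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    # Illustrative layer sizes (assumption, not from the paper).
    c_in, c_out, k = 64, 128, 3

    std = standard_conv_params(c_in, c_out, k)
    dsc = depthwise_separable_params(c_in, c_out, k)
    print(f"standard conv weights:            {std}")
    print(f"depthwise separable conv weights: {dsc}")
    print(f"reduction factor:                 {std / dsc:.2f}x")

    # Parallel branches with different dilation rates see different scales
    # while keeping the same 3 x 3 weight count per branch.
    for d in (1, 2, 3):
        print(f"dilation {d}: effective kernel {effective_kernel_size(k, d)}")
```

For these illustrative sizes the separable form needs 8,768 weights against 73,728 for the standard layer. Separately, the abstract's own figures imply a MobileNetV3 baseline of about 1.24M + 2.67M = 3.91M parameters; the abstract does not say which MobileNetV3 variant was used.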
Pages: 19962-19978
Page count: 17
Related papers
50 in total
  • [31] A Lightweight Convolutional Neural Network for Salient Object Detection
    Fei, Fengchang
    Liu, Wei
    Shu, Lei
    TEHNICKI VJESNIK-TECHNICAL GAZETTE, 2024, 31 (04): 1402-1410
  • [32] Multispectral Plant Disease Detection with Vision Transformer-Convolutional Neural Network Hybrid Approaches
    De Silva, Malithi
    Brown, Dane
    SENSORS, 2023, 23 (20)
  • [33] A multi-feature fusion algorithm for driver fatigue detection based on a lightweight convolutional neural network
    Cheng, Wangfeng
    Wang, Xuanyao
    Mao, Bangguo
    VISUAL COMPUTER, 2024, 40 (04): 2419-2441
  • [34] A multi-feature fusion algorithm for driver fatigue detection based on a lightweight convolutional neural network
    Wangfeng Cheng
    Xuanyao Wang
    Bangguo Mao
    The Visual Computer, 2024, 40: 2419-2441
  • [35] Efficient knowledge distillation for hybrid models: A vision transformer-convolutional neural network to convolutional neural network approach for classifying remote sensing images
    Song, Huaxiang
    Yuan, Yuxuan
    Ouyang, Zhiwei
    Yang, Yu
    Xiang, Hui
    IET CYBER-SYSTEMS AND ROBOTICS, 2024, 6 (03)
  • [36] Bio-Mechanical Distracted Driver Recognition Based on Stacked Autoencoder and Convolutional Neural Network
    Assefa, Addis Abebe
    Tian Wenhong
    2019 2ND IEEE INTERNATIONAL CONFERENCE ON INFORMATION COMMUNICATION AND SIGNAL PROCESSING (ICICSP), 2019: 449-453
  • [37] LD-Net: An Efficient Lightweight Denoising Model Based on Convolutional Neural Network
    Le, Trung-Hieu
    Lin, Po-Hsiung
    Huang, Shih-Chia
    IEEE OPEN JOURNAL OF THE COMPUTER SOCIETY, 2020, 1: 173-181
  • [38] Lightweight and efficient octave convolutional neural network for fire recognition
    Ayala, Angel
    Lima, Estanislau
    Fernandes, Bruno
    Bezerra, Byron L. D.
    Cruz, Francisco
    2019 IEEE LATIN AMERICAN CONFERENCE ON COMPUTATIONAL INTELLIGENCE (LA-CCI), 2019: 87-92
  • [39] Vision transformer meets convolutional neural network for plant disease classification
    Thakur, Poornima Singh
    Chaturvedi, Shubhangi
    Khanna, Pritee
    Sheorey, Tanuja
    Ojha, Aparajita
    ECOLOGICAL INFORMATICS, 2023, 77
  • [40] An efficient intrusion detection model based on convolutional spiking neural network
    Zhen Wang
    Fuad A. Ghaleb
    Anazida Zainal
    Maheyzah Md Siraj
    Xing Lu
    Scientific Reports, 14