YOLO-MMS for aerial object detection model based on hybrid feature extractor and improved multi-scale prediction

Cited by: 1
Authors
Junos, Mohamad Haniff [1]
Khairuddin, Anis Salwa Mohd [2]
Affiliations
[1] Univ Sains Malaysia, Sch Aerosp Engn, Engn Campus, Nibong Tebal 14300, Penang, Malaysia
[2] Univ Malaya, Fac Engn, Dept Elect Engn, Kuala Lumpur 50603, Malaysia
Source
Keywords
Lightweight YOLO; MixMBConv; Aerial object detection; Deep learning; NETWORK
DOI
10.1007/s00371-024-03689-5
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Object detection in aerial images has become an important research subject due to the widespread use of aerial platforms, including satellites and unmanned aerial vehicles. However, the task is challenging because aerial scenes combine complex backgrounds, large numbers of small objects, and densely distributed objects, all of which degrade detection accuracy. Moreover, despite their excellent detection accuracy, existing one-stage object detection methods have complex structures that demand substantial computational power, carry large parameter counts, and exhibit slow inference speed, which makes them unsuitable for edge-device applications. To address these issues, this paper proposes an accurate and lightweight object detection model named YOLO-MMS. The model incorporates several improvements, notably a hybrid backbone that integrates a novel Mix-Mobile inverted bottleneck module to improve efficiency by reducing the number of parameters. Additionally, the multi-scale prediction stage employs small efficient layer aggregation network and spatial pyramid pooling modules to improve feature extraction across multiple scales. Finally, the model includes an additional detection head and uses the Swish activation function to enhance detection accuracy. Evaluation results on the VisDrone and VEDAI datasets show that the proposed YOLO-MMS model achieves higher accuracy than other lightweight YOLO-based models. Furthermore, the proposed model reduces model size by 41.77% compared to the original YOLOv4-tiny model. These findings indicate that the proposed model offers a favorable trade-off between accuracy and efficiency, making it well suited to real-time applications on embedded systems. Our code is available at: https://github.com/hanifjunos/YOLO-MMS.
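For readers unfamiliar with the Swish activation mentioned in the abstract, the following is a minimal sketch of its standard definition, swish(x) = x * sigmoid(beta * x). It is not taken from the authors' repository; the function names `stable_sigmoid` and `swish` and the `beta` parameter default are assumptions chosen for illustration, and the abstract does not state which variant or framework the model uses.

```python
import math


def stable_sigmoid(z: float) -> float:
    """Logistic sigmoid computed without overflowing for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # z < 0, so exp(z) stays in (0, 1)
    return e / (1.0 + e)


def swish(x: float, beta: float = 1.0) -> float:
    """Swish activation: x * sigmoid(beta * x).

    beta = 1.0 is the commonly used form (also called SiLU);
    it is an assumed default here, not a value from the paper.
    """
    return x * stable_sigmoid(beta * x)


if __name__ == "__main__":
    # Print a few sample points to show the smooth, non-monotonic shape.
    for value in (-4.0, -1.0, 0.0, 1.0, 4.0):
        print(f"swish({value:+.1f}) = {swish(value):+.6f}")
```

Unlike ReLU, Swish is smooth and lets small negative values pass through, which is one reason it is often chosen for lightweight detectors.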
Pages: 20