YOLO-MMS for aerial object detection model based on hybrid feature extractor and improved multi-scale prediction

Cited by: 1

Authors:

Junos, Mohamad Haniff [1]

Khairuddin, Anis Salwa Mohd [2]

Affiliations:

[1] Univ Sains Malaysia, Sch Aerosp Engn, Engn Campus, Nibong Tebal 14300, Penang, Malaysia

[2] Univ Malaya, Fac Engn, Dept Elect Engn, Kuala Lumpur 50603, Malaysia

Source:

VISUAL COMPUTER | 2024

Keywords:

Lightweight YOLO; MixMBConv; Aerial object detection; Deep learning; NETWORK;

DOI:

10.1007/s00371-024-03689-5

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202; 0835;

Abstract:

Object detection in aerial images has become an important research subject due to the widespread use of aerial platforms, including satellites and unmanned aerial vehicles. However, the task is challenging because aerial scenes involve complex backgrounds, a large number of small objects, and densely distributed objects, all of which lead to poor detection accuracy. Moreover, despite their excellent detection accuracy, existing one-stage object detection methods have complex structures that demand substantial computational power, carry large parameter counts, and exhibit slow inference speed, which makes them unsuitable for edge device applications. To address these issues, this paper proposes an accurate and lightweight object detection model named YOLO-MMS. The model incorporates several improvements, notably a hybrid backbone structure that integrates a novel Mix-Mobile inverted bottleneck module to improve efficiency by reducing the number of parameters. Additionally, the multi-scale prediction employs a small efficient layer aggregation network and spatial pyramid pooling modules to improve feature extraction across multiple scales. Finally, the model includes an additional detection head and uses the Swish activation function to enhance detection accuracy. Evaluation results on the VisDrone and VEDAI datasets demonstrate that the proposed YOLO-MMS model achieves superior accuracy compared to other lightweight YOLO-based models. Furthermore, the proposed model reduces model size by 41.77% compared to the original YOLOv4-tiny model on which it is based. These findings indicate that the proposed model offers a strong trade-off between accuracy and efficiency, making it well suited to real-time applications on embedded systems. Our code is available at: https://github.com/hanifjunos/YOLO-MMS.
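
The abstract names two quantities that are easy to make concrete: the Swish activation and a relative model-size reduction. The sketch below is illustrative only and is not taken from the authors' repository. It assumes the common form Swish(x) = x * sigmoid(beta * x) with beta = 1 (the abstract does not state beta), and the function names `swish` and `size_reduction_percent` as well as the example sizes are hypothetical.

```python
import math


def _sigmoid(z: float) -> float:
    """Numerically stable logistic sigmoid."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def swish(x: float, beta: float = 1.0) -> float:
    """Swish activation: x * sigmoid(beta * x).

    beta = 1.0 is an assumption; the abstract does not specify it.
    """
    return x * _sigmoid(beta * x)


def size_reduction_percent(baseline_size: float, new_size: float) -> float:
    """Relative reduction of new_size with respect to baseline_size, in percent."""
    if baseline_size <= 0:
        raise ValueError("baseline_size must be positive")
    return 100.0 * (baseline_size - new_size) / baseline_size


if __name__ == "__main__":
    # Swish is smooth and slightly negative for small negative inputs.
    for x in (-2.0, 0.0, 2.0):
        print(f"swish({x:+.1f}) = {swish(x):.4f}")

    # Hypothetical sizes, chosen only to illustrate the formula; the
    # abstract reports the percentage, not the absolute model sizes.
    print(f"reduction = {size_reduction_percent(100.0, 58.23):.2f}%")
```

Under this formula, a 41.77% reduction means the new model occupies 58.23% of the baseline size, whatever the absolute sizes are.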


Pages: 20
