YOLOv5 Algorithm Based on Attention and Multistage Feature Fusion

Journal of Zhengzhou University (Engineering Science) [ISSN:1671-6833/CN:41-1339/T]

Volume:: 45
Issue:: 2024, No. 03

Pages:: 38-45

Publication date:: 2024-04-20

Article Info

Title:: YOLOv5 Algorithm Based on Attention and Multistage Feature Fusion

Article ID:: 1671-6833(2024)03-0038-08

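Author(s):: WANG Yu; BI Yu; SHI Jiantong; XIAO Hongbing; SUN Mei; School of Computer and Artificial Intelligence, Beijing Technology and Business University, Beijing 100048, China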

Keywords:: deep learning; YOLOv5s; object detection; multistage feature fusion; attention mechanism

CLC number:: TP391

DOI:: 10.13705/j.issn.1671-6833.2023.06.009

Document code:: A

Abstract:: To address the low accuracy of object detection and recognition in complex scenes, this study proposes a YOLOv5 object detection and recognition algorithm based on attention and multistage feature fusion (AMFF). A dual space directions pyramid split attention (DSD-PSA) mechanism is added to the backbone network of the traditional YOLOv5s model to strengthen learning of the spatial and channel information in the feature maps. A multistage feature fusion (MFF) structure is adopted in the bottleneck network to fuse the features of different branches, which enriches the features and improves the ability to cope with complex scenes. In addition, the C3Ghost module and depthwise separable convolution replace the C3 module and standard convolution, respectively, to reduce the number of parameters and the complexity of the network. In experiments against the traditional YOLOv5s algorithm, the proposed algorithm reached a mean average precision (mAP) of 85% on the VOC2007+2012 dataset and 97.2% on the smart retail cabinet commodity recognition dataset, which verifies its effectiveness and feasibility.
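
As a rough illustration of why replacing standard convolution with depthwise separable convolution lowers the parameter count, the sketch below compares the weight counts of a single layer. The layer sizes (`c_in`, `c_out`, `k`) are hypothetical values chosen for illustration and are not taken from the paper; biases are ignored, and the function names are assumptions made for this example.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k conv followed by a pointwise 1 x 1 conv (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes for illustration only; not from the paper.
c_in, c_out, k = 128, 256, 3
std = standard_conv_params(c_in, c_out, k)
dws = depthwise_separable_params(c_in, c_out, k)

# The ratio equals 1 / c_out + 1 / k**2 regardless of c_in.
print(f"standard: {std}, depthwise separable: {dws}, ratio: {dws / std:.4f}")
```

For a 3 x 3 kernel the ratio is dominated by the 1 / k² term, so the separable layer needs roughly one ninth of the weights plus a small channel-mixing overhead. This covers only the convolution swap; the savings from the C3Ghost module are not modeled here.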

Last Update: 2024-04-29
