Open Access
ARTICLE
OPTIMIZING YOLOV8N FOR ENHANCED PRECISION IN SMALL OBJECT DETECTION ON CUSTOM DATASETS
Issue: Vol. 1 No. 01 (2024) | Section: Articles
Abstract
Object detection, a fundamental task in computer vision, has advanced substantially with deep learning. Although state-of-the-art models such as the YOLO series perform well across many applications, accurate detection of small objects remains a persistent challenge. This article presents a study on enhancing YOLOv8n, the smallest variant of the YOLOv8 family, for improved small object recognition on custom datasets. We explore architectural modifications, advanced loss functions, and refined training strategies. Experimental results on a simulated custom dataset, representative of scenarios in which small targets are prevalent, show that the refined YOLOv8n outperforms the baseline YOLOv8n, particularly in mean Average Precision (mAP) for small objects. These findings indicate that targeted enhancements to off-the-shelf models can benefit specialized object detection tasks.
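The abstract reports gains "for small objects" without stating how an object is classed as small. As a minimal sketch, assuming the widely used COCO convention (a box is small when its area is below 32 x 32 pixels), the following Python shows how ground-truth boxes could be filtered by size and matched to predictions by IoU. All names here (`is_small`, `iou`, `small_object_recall`) are hypothetical helpers for illustration, and the metric computed is a simple recall at one IoU threshold, not the full mAP the article reports.

```python
from typing import List, Tuple

# A box is (x1, y1, x2, y2) in pixels; this layout is an assumption for the sketch.
Box = Tuple[float, float, float, float]

# Assumed threshold: COCO treats objects with area < 32^2 pixels as "small".
SMALL_AREA_MAX = 32 * 32


def box_area(box: Box) -> float:
    """Area of a box, clamped to zero for degenerate boxes."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def is_small(box: Box, max_area: float = SMALL_AREA_MAX) -> bool:
    """True when the box area is strictly below the small-object threshold."""
    return box_area(box) < max_area


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def small_object_recall(gts: List[Box], preds: List[Box], iou_thr: float = 0.5) -> float:
    """Fraction of small ground-truth boxes matched by some prediction.

    Each prediction may match at most one ground-truth box (greedy matching).
    """
    small_gts = [g for g in gts if is_small(g)]
    if not small_gts:
        return 0.0
    used = set()  # indices of predictions already matched
    hits = 0
    for g in small_gts:
        best_j, best_iou = -1, iou_thr
        for j, p in enumerate(preds):
            if j in used:
                continue
            overlap = iou(g, p)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0:
            used.add(best_j)
            hits += 1
    return hits / len(small_gts)


if __name__ == "__main__":
    ground_truth = [(0, 0, 10, 10), (0, 0, 100, 100), (50, 50, 60, 60)]
    predictions = [(1, 1, 11, 11), (200, 200, 210, 210)]
    print(small_object_recall(ground_truth, predictions))
```

A size-restricted metric of this kind isolates the behaviour the abstract highlights: a model can score well overall while still missing most of the small targets, so reporting the small-object subset separately makes the comparison with the baseline meaningful.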