Underwater small target detection under YOLOv8-LA model

Journal: Scientific Reports, Volume: 14, Issue: 1
DOI: https://doi.org/10.1038/s41598-024-66950-w
PMID: https://pubmed.ncbi.nlm.nih.gov/38997415
Publication Date: 2024-07-12
Author(s): Shenming Qu et al.
Primary Topic: Underwater Acoustics Research

Overview

In marine environmental engineering, detecting underwater targets is critical, and recent approaches based on Convolutional Neural Networks (CNNs) have shown promise. However, conventional deep neural networks often struggle to balance processing speed and accuracy, particularly for small and closely spaced targets. To address this, the authors propose YOLOv8-LA, a neural network model that incorporates a Lightweight Efficient Partial Convolution (LEPC) module to enhance spatial feature extraction while reducing computational redundancy. They also introduce the AP-FasterNet architecture, which improves small-target detection by integrating depthwise separable convolutions with varying dilation rates.
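To make the efficiency argument concrete, the following sketch counts weights for a standard convolution, a depthwise separable convolution, and a partial convolution, and shows how the dilation (expansion) rate widens the span a kernel covers. It is a back-of-the-envelope illustration, not the authors' LEPC or AP-FasterNet definition: the function names, the 64-channel layer, and the 1/4 channel ratio are all assumptions chosen for the example.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a k x k standard convolution (bias ignored)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """One k x k filter per input channel, then a 1 x 1 pointwise mix."""
    return c_in * k * k + c_in * c_out


def partial_conv_params(c, k, ratio=0.25):
    """k x k convolution applied to only a fraction of the channels.

    The remaining channels pass through unchanged. The default ratio is an
    illustrative assumption, not a value reported for YOLOv8-LA.
    """
    c_p = int(c * ratio)
    return c_p * c_p * k * k


def effective_kernel_size(k, dilation):
    """Span covered by a k-tap kernel whose taps are `dilation` apart."""
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    c, k = 64, 3  # hypothetical layer: 64 channels in and out, 3 x 3 kernel
    print(standard_conv_params(c, c, k))        # 36864
    print(depthwise_separable_params(c, c, k))  # 4672
    print(partial_conv_params(c, k))            # 2304
    print([effective_kernel_size(k, d) for d in (1, 2, 3)])  # [3, 5, 7]
```

Under these assumptions both lightweight variants need roughly an order of magnitude fewer weights than the standard layer, and raising the dilation rate enlarges the receptive field without adding any weights.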

Evaluation on the URPC2021 dataset indicates that YOLOv8-LA achieves a mean Average Precision (mAP) of 84.7% and runs at 189.3 frames per second (FPS), surpassing existing state-of-the-art methods. The design improves detection accuracy while retaining real-time processing, which suits it to challenging underwater environments. The authors acknowledge one limitation: the feature extraction process is more complex, which may affect speed. Future research will focus on optimizing this process and on exploring advanced network architectures to broaden the model’s applicability across tasks and platforms.
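For readers unfamiliar with the metric, here is a minimal sketch of how an average precision (AP) value can be computed for one class and averaged into mAP. It uses all-point interpolation of the precision-recall curve and assumes each detection has already been marked as a true or false positive by an upstream IoU matching step; the summary does not state which IoU threshold or interpolation scheme the authors used, and the class names below are placeholders.

```python
def average_precision(scored_hits, n_ground_truth):
    """AP for one class using all-point interpolation.

    scored_hits: list of (confidence, is_true_positive), one per detection.
    n_ground_truth: number of annotated objects of this class.
    """
    if n_ground_truth <= 0:
        raise ValueError("n_ground_truth must be positive")
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for _, hit in ranked:
        if hit:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / n_ground_truth)
        precisions.append(tp / (tp + fp))
    # Replace each precision by the best precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Area under the resulting step curve.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap):
    """mAP is the unweighted mean of the per-class AP values."""
    return sum(per_class_ap.values()) / len(per_class_ap)


if __name__ == "__main__":
    detections = [(0.9, True), (0.8, False), (0.7, True)]  # toy data
    print(average_precision(detections, n_ground_truth=2))
    print(mean_average_precision({"class_a": 0.8, "class_b": 0.9}))
```

In the toy data, two of three detections are correct and both annotated objects are found, giving an AP of 5/6.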

Methods

In this section, the authors detail the experimental setup. The hardware comprises 14 vCPUs of an Intel Xeon Gold 6330 CPU running at 2.00 GHz, paired with an NVIDIA GeForce RTX 3090 GPU with 24 GB of graphics memory. The software environment uses CUDA 11.3, cuDNN 8.2.2, and Python 3.8. Hyperparameter settings for the model are provided in Table 1 of the paper.
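When trying to reproduce a setup like this, it can help to record the versions actually in use. The sketch below is one possible check, assuming the model is trained with PyTorch (the summary does not name the framework, although YOLOv8 is distributed as a PyTorch implementation). The function name `describe_environment` is hypothetical, and the PyTorch fields stay `None` when the library is not installed.

```python
import platform


def describe_environment():
    """Collect version info; PyTorch fields stay None if torch is absent."""
    info = {
        "python": platform.python_version(),
        "torch": None,
        "cuda": None,
        "cudnn": None,
        "gpu": None,
    }
    try:
        import torch  # assumption: PyTorch is the training framework
    except ImportError:
        return info
    info["torch"] = torch.__version__
    info["cuda"] = torch.version.cuda               # CUDA version torch was built with
    info["cudnn"] = torch.backends.cudnn.version()  # integer-encoded cuDNN version
    if torch.cuda.is_available():
        info["gpu"] = torch.cuda.get_device_name(0)
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Comparing this output against the reported CUDA 11.3, cuDNN 8.2.2, and Python 3.8 gives a quick indication of how closely a local machine matches the described environment.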

The experiments use two datasets, the URPC dataset and the Zhanjiang Underwater Target Detection Competition dataset, each of which is described in detail. The section concludes with the experimental results and their analysis, although the specific findings are not included in the provided text.

Discussion

In this study, the authors present YOLOv8-LA, an optimized version of the YOLOv8 framework tailored to underwater object detection. The architecture incorporates several new modules, including Lightweight Efficient Partial Convolution (LEPC) and AP-FasterNet, aimed at improving both detection accuracy and computational efficiency. The LEPC module reduces the parameter count and improves processing speed by employing selective convolution, while the AP-FasterNet module strengthens small-target detection through depthwise separable convolutions and dilated convolutions. Together, these modifications address the challenges posed by complex underwater environments, particularly the detection of small and densely packed targets.
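The two building blocks named here can be illustrated in one dimension with plain Python. The sketch below shows a dilated filter, a depthwise step in which each channel has its own filter, and a partial convolution that mixes only the first few channels and passes the rest through untouched. It is a toy illustration of the general techniques under simplifying assumptions (1-D signals, zero padding, odd kernel lengths, hypothetical function names), not a reconstruction of the LEPC or AP-FasterNet modules.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Zero-padded 1-D correlation; taps are `dilation` samples apart.

    The output has the same length as the input (odd kernel length assumed).
    """
    pad = (len(kernel) - 1) * dilation // 2
    padded = [0] * pad + list(signal) + [0] * pad
    return [
        sum(w * padded[i + j * dilation] for j, w in enumerate(kernel))
        for i in range(len(signal))
    ]


def depthwise_conv1d(channels, kernels, dilation=1):
    """Depthwise step: channel i is filtered only by kernels[i]."""
    return [dilated_conv1d(ch, k, dilation) for ch, k in zip(channels, kernels)]


def partial_conv1d(channels, weights, dilation=1):
    """Convolve only the first len(weights) channels; pass the rest through.

    weights[o][i] is the kernel from selected input channel i to output
    channel o, so the selected channels are mixed as in a regular convolution.
    """
    n_conv = len(weights)
    selected = channels[:n_conv]
    mixed = []
    for out_kernels in weights:
        per_input = [
            dilated_conv1d(ch, k, dilation)
            for ch, k in zip(selected, out_kernels)
        ]
        mixed.append([sum(vals) for vals in zip(*per_input)])
    return mixed + [list(ch) for ch in channels[n_conv:]]


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    print(dilated_conv1d(impulse, [1, 1, 1], dilation=1))  # [0, 0, 1, 1, 1, 0, 0]
    print(dilated_conv1d(impulse, [1, 1, 1], dilation=2))  # [0, 1, 0, 1, 0, 1, 0]
    channels = [[1, 2, 3, 4, 5], [7, 7, 7, 7, 7]]
    print(partial_conv1d(channels, [[[1, 1, 1]]]))  # only channel 0 is filtered
```

The impulse responses show the same three weights covering a span of three samples at dilation 1 and five samples at dilation 2, while the partial convolution leaves the second channel exactly as it was.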

The experimental results show that YOLOv8-LA outperforms existing models, achieving a mean Average Precision (mAP) of 84.7% on the URPC dataset while maintaining a frame rate of 189.3 FPS, which is essential for real-time applications. Comparative analyses further validate the model against traditional two-stage detection algorithms such as Faster R-CNN, which, despite the higher accuracy generally associated with two-stage detectors, suffer from slower processing speeds. Although YOLOv8-LA shows clear improvements, it also faces increased computational demands during feature extraction. Future work will focus on optimizing these processes and exploring advanced architectures to broaden the model’s applicability across tasks and platforms.
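As a rough guide to what the reported frame rate means in practice, the arithmetic below converts FPS into per-frame latency and compares it with the time budget of a camera stream. The 30 FPS camera rate is an assumed example, and the reported 189.3 FPS was presumably measured on the RTX 3090 described in the Methods, so slower embedded hardware would leave a smaller margin.

```python
def frame_latency_ms(fps):
    """Average time spent per frame, in milliseconds."""
    return 1000.0 / fps


def meets_budget(detector_fps, camera_fps):
    """True if a frame is processed before the next one arrives."""
    return frame_latency_ms(detector_fps) <= frame_latency_ms(camera_fps)


if __name__ == "__main__":
    print(round(frame_latency_ms(189.3), 2))  # 5.28 ms per frame
    print(round(frame_latency_ms(30.0), 2))   # 33.33 ms budget at 30 FPS
    print(meets_budget(189.3, 30.0))          # True
```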