Multiscale diffusion-enhanced attention network for steel surface defect detection in polysilicon production

Journal: Scientific Reports, Volume: 16, Issue: 1
DOI: https://doi.org/10.1038/s41598-026-35913-8
PMID: https://pubmed.ncbi.nlm.nih.gov/41545545
Publication Date: 2026-01-16
Author(s): Yiwei Duan et al.
Primary Topic: Advanced Neural Network Applications

Overview

Detecting surface defects in steel components is essential for quality assurance in polysilicon production, yet it is difficult because the defects are small, irregularly shaped, low in contrast, and set against complex backgrounds. To tackle these challenges, the authors introduce MSEOD-DDFusionNet (Multi-Scale and Effective Object-Detection Diffusion Fusion Network), a novel architecture built around a multi-scale diffusion-enhanced attention mechanism. The network comprises four key modules: MTECAAttention (Multi-Scale Texture Enhancement Channel-Aware Attention) for effective multi-scale feature fusion, ODConv (Omni-Dimensional Dynamic Convolution) for adapting to irregular geometries, LMDP (Local Multi-Scale Discriminative Perception) for noise suppression and micro-defect enhancement, and DDFusion (Diffusion-Driven Feature Fusion) for modeling scene-aware noise.

The proposed model demonstrates state-of-the-art performance on the specialized DDTE dataset, achieving a mean Average Precision (mAP) of 82.6% at IoU 0.50 and 61.6% at IoU 0.50:0.95, while maintaining a high inference speed of 193.5 frames per second (FPS) with only 8.46 million parameters. Additionally, MSEOD-DDFusionNet exhibits strong generalization capabilities across various datasets, including NEU-DET and GC10-DET, as well as in cross-domain applications, making it a robust and efficient solution for industrial defect inspection.
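The two accuracy figures above depend on how box overlap is scored. The following is a minimal sketch of the intersection-over-union (IoU) test behind them, assuming the usual COCO-style reading of "IoU 0.50:0.95" as ten thresholds from 0.50 to 0.95 in steps of 0.05; the summary does not spell out the step size, and the function names and box format here are illustrative rather than taken from the paper.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred, truth):
    """Thresholds at which a prediction would count as a true positive."""
    iou = box_iou(pred, truth)
    return [t for t in IOU_THRESHOLDS if iou >= t]


if __name__ == "__main__":
    pred, truth = (10, 10, 50, 50), (12, 12, 52, 52)
    print(box_iou(pred, truth))
    print(thresholds_passed(pred, truth))
```

A prediction that is only slightly offset from the ground truth passes the loose 0.50 threshold easily but fails the strictest ones, which is why the averaged 0.50:0.95 score is typically well below the score at 0.50 alone.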

Introduction

The introduction stresses the importance of surface integrity in steel components used in polysilicon production for photovoltaics. Surface defects such as microcracks (Cr), silicon deposits (SD), pits (PT), and impurity spots (IS) can arise from manufacturing or operational stresses and may lead to catastrophic failures. The authors underscore the need for automated, precise defect detection and note that current deep learning-based solutions struggle in real-world photovoltaic inspections. They identify three limitations: (1) limited multi-scale discriminability, (2) insufficient geometric adaptability of static convolutional kernels, and (3) a robustness-sensitivity trade-off that affects the detection of low-contrast defects.

To overcome these challenges, the authors propose MSEOD-DDFusionNet, a novel integrated defect detection framework. This framework introduces a lossless multi-scale fusion principle that preserves micro-defect signatures, a multi-dimensional dynamic adaptation mechanism for capturing irregular defect geometries, and a decoupled noise robustness strategy to enhance feature sensitivity without compromising robustness. Additionally, the authors present the DDTE dataset, a high-resolution annotated benchmark aimed at addressing the lack of domain-specific data. Experimental results indicate that MSEOD-DDFusionNet achieves state-of-the-art accuracy and efficiency, demonstrating superior generalization across various domains.

Methods

All experiments ran on a server with an NVIDIA A30 GPU and an Intel Xeon Silver 4314 processor. The models were implemented in PyTorch 2.4.1 with CUDA 12.4 and cuDNN 8.2.4. Training ran for 200 epochs using stochastic gradient descent (SGD) with an initial learning rate of 0.01, a momentum of 0.937, and a weight decay of 0.0005.
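To show how the three optimizer hyperparameters interact, here is a minimal pure-Python sketch of one SGD update with momentum and L2 weight decay. It assumes the convention used by PyTorch's `torch.optim.SGD` with zero dampening, where the decay term is added to the gradient before the momentum buffer is updated. The summary does not say which variant or learning-rate schedule the authors used, and `sgd_step` is a hypothetical helper, not their code.

```python
EPOCHS = 200  # training length reported in the summary


def sgd_step(weights, grads, velocity,
             lr=0.01, momentum=0.937, weight_decay=0.0005):
    """One SGD update with momentum and L2 weight decay (assumed convention)."""
    new_weights, new_velocity = [], []
    for w, g, v in zip(weights, grads, velocity):
        g = g + weight_decay * w   # weight decay folded into the gradient
        v = momentum * v + g       # momentum buffer accumulates gradients
        new_velocity.append(v)
        new_weights.append(w - lr * v)  # parameter step scaled by lr
    return new_weights, new_velocity


if __name__ == "__main__":
    # Toy problem: minimise f(w) = w^2 / 2, whose gradient is simply w.
    w, v = [1.0], [0.0]
    for _ in range(3):
        w, v = sgd_step(w, grads=list(w), velocity=v)
        print(w[0])
```

With a momentum as high as 0.937, each step reuses most of the previous update direction, so the effective step size grows over the first few iterations even though the learning rate stays at 0.01.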

To make the model robust to the variations typical of industrial settings, the authors employed an extensive data augmentation strategy. Specifically, during the first 190 epochs they used mosaic augmentation, which merges four randomly scaled images into a single 640×640 input. This exposes the model to a more diverse range of input conditions and is intended to improve generalization. The section also describes a processing pipeline that combines noise suppression, signal amplification, and several convolutional operations for feature extraction and enhancement.
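The mosaic step can be illustrated with a deliberately simplified sketch. It assumes grayscale images stored as lists of rows and stretches each image to fill one quadrant around a random split point, which is what gives the four images their random scales. Production implementations usually crop rather than stretch and also remap the bounding-box labels; both are omitted here. The names `resize_nearest`, `mosaic`, and `use_mosaic` are hypothetical, and the zero-based epoch indexing is an assumption.

```python
import random


def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of a 2-D list-of-lists image."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def mosaic(images, size=640, rng=random):
    """Tile four images into one size x size canvas around a random split."""
    assert len(images) == 4
    # The random split point determines the scale of every quadrant.
    cx = rng.randint(size // 4, 3 * size // 4)
    cy = rng.randint(size // 4, 3 * size // 4)
    # (row offset, column offset, height, width) for each quadrant
    quadrants = [
        (0, 0, cy, cx), (0, cx, cy, size - cx),
        (cy, 0, size - cy, cx), (cy, cx, size - cy, size - cx),
    ]
    canvas = [[0] * size for _ in range(size)]
    for img, (r0, c0, h, w) in zip(images, quadrants):
        patch = resize_nearest(img, h, w)
        for r in range(h):
            canvas[r0 + r][c0:c0 + w] = patch[r]
    return canvas


def use_mosaic(epoch, mosaic_epochs=190):
    """True for the first 190 epochs (zero-based), False afterwards."""
    return epoch < mosaic_epochs


if __name__ == "__main__":
    imgs = [[[value] * 8 for _ in range(8)] for value in (1, 2, 3, 4)]
    out = mosaic(imgs, size=64, rng=random.Random(0))
    print(len(out), len(out[0]), out[0][0], out[-1][-1])
```

Switching the augmentation off for the last 10 of the 200 epochs lets the model finish training on unmodified images; the summary does not state the authors' reason for this schedule.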

Discussion

In the discussion section of the paper, the authors highlight the limitations of existing methods in multi-scale feature fusion, dynamic convolution, and noise robustness within the context of surface inspection. They argue that traditional approaches, such as those utilizing attention mechanisms and dynamic convolution, often lead to information loss and incomplete dimensional adaptation, which hinder the detection of micro-defects. Specifically, the compression-based attention paradigm is criticized for attenuating high-frequency activations that are crucial for identifying subtle defects, while existing noise robustness strategies tend to compromise sensitivity to low-contrast signals.

To address these challenges, the authors propose the MSEOD-DDFusionNet framework, which integrates several specialized modules designed to enhance feature extraction and defect detection. The MTECAAttention module ensures lossless multi-scale feature fusion, preserving critical channel information. The ODConv module introduces a four-dimensional dynamic weight co-modulation mechanism for adaptive convolution, while the LMDP module employs a dual-stream architecture for selective noise suppression and signal amplification. Finally, the DDFusion module enhances robustness through scene-aware noise modeling and progressive feature decoupling. The authors present ablation studies demonstrating the statistical significance of each module’s contribution to overall performance, showcasing improvements in detection accuracy and efficiency compared to baseline models. This comprehensive approach not only addresses the intertwined challenges of scale variation, geometric irregularity, and environmental noise but also establishes a scalable foundation for precision defect detection in industrial applications.
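For orientation, the four-dimensional modulation attributed to ODConv can be written as in the original Omni-Dimensional Dynamic Convolution formulation. This is a sketch under the assumption that the authors use the standard form; the summary does not say whether they modify it, and the symbols below are not taken from the paper.

```latex
y = \Bigl( \sum_{i=1}^{n} \alpha_{wi} \odot \alpha_{fi} \odot \alpha_{ci} \odot \alpha_{si} \odot W_i \Bigr) * x
```

Here $x$ is the input feature map, $W_i$ is the $i$-th of $n$ candidate kernels, and $*$ denotes convolution. The four input-dependent attention terms act on different dimensions of the kernel space: $\alpha_{si}$ on spatial positions within the kernel, $\alpha_{ci}$ on input channels, $\alpha_{fi}$ on output filters, and $\alpha_{wi}$ on the kernel as a whole.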