LSFConvformer: A lightweight method for mechanical fault diagnosis under small samples and variable speeds with time-frequency fusion

Journal: Mechanical Systems and Signal Processing, Volume: 236
DOI: https://doi.org/10.1016/j.ymssp.2025.113016
Publication Date: 2025-06-20
Author(s): Haidong Shao et al.
Primary Topic: Machine Fault Diagnosis Techniques

Overview

The research paper presents the LSFConvformer framework, a novel Transformer-based model designed to enhance intelligent fault diagnosis by addressing key limitations of existing methods. These limitations include the loss of feature information in lightweight Transformers and the inadequate utilization of time-domain signal features, particularly in scenarios with small sample sizes and variable speed data. The proposed framework incorporates a Lightweight Convformer module that optimizes complex feature extraction while minimizing computational load, and a Shuffle Time-Frequency Feature Fusion Module that enriches the multidimensional characteristics of fault features.

Experimental results demonstrate the effectiveness of the LSFConvformer, achieving high diagnostic accuracy even with limited training samples. For instance, in the D3 dataset of Case 1, the model attained an accuracy of 98% with only 20 samples per fault type. Additionally, it outperformed traditional models like ResNet18 and DenseNet in terms of computational efficiency and classification accuracy. The study outlines future directions, including extending the model to broader engineering applications, optimizing Transformer hyperparameters through automated methods, and enhancing robustness for cross-domain data transfer learning tasks.

Introduction

The introduction of the paper emphasizes the critical role of mechanical fault diagnosis in enhancing equipment reliability and operational efficiency. By enabling early detection of potential issues, this approach facilitates proactive maintenance, thereby reducing costs, minimizing unexpected downtime, and improving workplace safety. The integration of fault diagnosis within the framework of smart manufacturing and Industry 4.0 is highlighted, as it supports the automation of maintenance processes and contributes to the development of intelligent factories.

The paper discusses the advancements in deep learning models, particularly Convolutional Neural Networks (CNN), Deep Belief Networks (DBN), Generative Adversarial Networks (GAN), and Transformers, which are adept at processing complex, high-dimensional data for intelligent fault diagnosis. Notable contributions from various researchers are cited, showcasing innovations such as the integration of noise reduction techniques and data augmentation strategies. However, challenges remain for Transformer-based methods, particularly regarding information loss in lightweight models and limitations in handling variable speed operations and small datasets. To address these issues, the authors propose the LSFConvformer framework, which includes a lightweight Convformer module and a Shuffle time-frequency feature fusion module aimed at enhancing diagnostic performance under challenging conditions. The proposed method is validated through comparative experiments, demonstrating improved robustness and efficiency in fault diagnosis.

Methods

In the experimental analysis of Case 1, data were collected from a bearing fault setup covering eight health states: normal, cage fault, and inner and outer race faults of varying severity. The sampling frequency was 8192 Hz, and the rotational speed increased linearly from 0 to 1800 rpm. The first 512 data points of the frequency-domain signal were used to illustrate the frequency distribution, which shows that signal amplitude increases with speed. Three datasets (D1, D2, and D3) were created with 100, 40, and 20 training samples per health state, respectively, alongside a unified test set of 40 samples to evaluate the intelligent diagnostic model’s performance.
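The frequency-domain input described for Case 1 can be illustrated with a minimal sketch. It assumes a segment length of 1024 samples (the summary does not state one), uses a synthetic 80 Hz sine in place of a real vibration recording, and computes a plain DFT with the standard library. The name `amplitude_spectrum` and the constant `SEG_LEN` are hypothetical; only the 8192 Hz sampling frequency and the 512 retained points come from the text.

```python
import cmath
import math

FS = 8192        # sampling frequency in Hz, from the Case 1 description
SEG_LEN = 1024   # hypothetical segment length; the summary does not state it
N_BINS = 512     # number of leading frequency-domain points kept, from the text


def amplitude_spectrum(segment, n_bins=N_BINS):
    """Return |X_k| / N for the first n_bins DFT bins of a real-valued segment."""
    n = len(segment)
    # Precompute the N complex roots of unity used by the DFT.
    twiddle = [cmath.exp(-2j * math.pi * m / n) for m in range(n)]
    spectrum = []
    for k in range(n_bins):
        acc = sum(x * twiddle[(k * i) % n] for i, x in enumerate(segment))
        spectrum.append(abs(acc) / n)
    return spectrum


# Synthetic stand-in for a vibration segment: a unit-amplitude 80 Hz sine.
segment = [math.sin(2 * math.pi * 80 * i / FS) for i in range(SEG_LEN)]
spectrum = amplitude_spectrum(segment)
peak_bin = max(range(N_BINS), key=lambda k: spectrum[k])
print(peak_bin, peak_bin * FS / SEG_LEN)  # bin index and its frequency in Hz
```

Under these assumed settings the bin spacing is 8192 / 1024 = 8 Hz, so the 80 Hz tone lands in bin 10. With real variable-speed data, the fault-related peaks would instead drift across bins as the shaft accelerates.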

In Case 2, data were sourced from the University of Ottawa’s variable operating condition bearing dataset, recorded on the SpectraQuest Mechanical Fault Simulator with a sampling frequency of 200 kHz. This dataset included five health states: normal, inner race fault, outer race fault, ball fault, and combined fault. The experiments were conducted under accelerating (increasing-speed) conditions, with datasets D4, D5, and D6 constructed from 50, 25, and 15 training samples, respectively, along with 50 test samples per class. This design aimed to assess the diagnostic capabilities of the proposed model under small-sample scenarios.
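As a rough illustration of how small-sample datasets such as D4, D5, and D6 could be assembled, the sketch below holds out a fixed test set per class and then draws training subsets of decreasing size from the remainder. The summary does not say how the authors selected samples, so the random draw, the nested subsets, the per-class reading of the training sizes, the pool of 100 samples per class, and the function name `build_small_sample_sets` are all assumptions.

```python
import random


def build_small_sample_sets(samples_by_class, train_sizes, n_test, seed=0):
    """Split each class into one shared test set and several training subsets."""
    rng = random.Random(seed)
    test_set = {}
    train_sets = {n: {} for n in train_sizes}
    for label, samples in samples_by_class.items():
        pool = list(samples)
        rng.shuffle(pool)
        if len(pool) < n_test + max(train_sizes):
            raise ValueError(f"not enough samples for class {label!r}")
        # The test samples are removed first so no training subset can see them.
        test_set[label] = pool[:n_test]
        remaining = pool[n_test:]
        for n in train_sizes:
            train_sets[n][label] = remaining[:n]
    return train_sets, test_set


# Hypothetical pool: 100 sample identifiers for each of the five health states.
classes = ["normal", "inner_race", "outer_race", "ball", "combined"]
pool = {c: [f"{c}_{i}" for i in range(100)] for c in classes}

train_sets, test_set = build_small_sample_sets(pool, train_sizes=(50, 25, 15), n_test=50)
for n, subset in train_sets.items():
    print(n, {label: len(items) for label, items in subset.items()})
```

Keeping the test set identical across the three training sizes means any accuracy change can be attributed to the amount of training data rather than to a different evaluation split.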

Results

In this section, the effectiveness of the proposed diagnostic method is validated through a comparative analysis with various advanced models, including improved Transformer architectures (ViT, Linformer, CLformer) and several convolutional neural networks (CNNs) such as ResNet18, DenseNet, MobileNet, and ShuffleNet. The models were optimized using the Adamax algorithm with a learning rate of 0.002 and trained for 100 epochs with a batch size of 8. The results, summarized in Table 4, indicate that the proposed method achieves over 95% accuracy on a small-sample dataset, with a peak accuracy of 99%, outperforming the other models. The analysis also highlights that the proposed method maintains high diagnostic accuracy even with a reduced number of training samples, demonstrating its robustness in small-sample fault diagnosis tasks.
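To make the training settings concrete, here is a minimal sketch of the Adamax update rule applied to a toy quadratic loss, together with the number of parameter updates the stated schedule implies for D3 (8 health states with 20 training samples each). Only the learning rate, epoch count, and batch size come from the text. The decay rates `beta1 = 0.9` and `beta2 = 0.999`, the `eps` term, the toy loss, and the helper names are assumptions, and this is not the authors' training code.

```python
import math

TRAIN_CONFIG = {"lr": 0.002, "epochs": 100, "batch_size": 8}  # from the text


def total_updates(n_samples, epochs, batch_size):
    """Number of optimizer steps for a full run, counting a partial last batch."""
    return epochs * math.ceil(n_samples / batch_size)


def adamax_step(params, grads, m, u, t, lr=0.002, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adamax update in place on the state lists m and u."""
    updated = []
    for i, (p, g) in enumerate(zip(params, grads)):
        m[i] = beta1 * m[i] + (1 - beta1) * g   # first-moment estimate
        u[i] = max(beta2 * u[i], abs(g))        # infinity-norm based scale
        step = lr / (1 - beta1 ** t)            # bias correction for m
        updated.append(p - step * m[i] / (u[i] + eps))
    return updated


def loss(w):
    return sum(x * x for x in w)


# Toy problem: minimise the sum of squares starting from an arbitrary point.
w = [1.0, -2.0]
m, u = [0.0, 0.0], [0.0, 0.0]
initial_loss = loss(w)
first_step = None
for t in range(1, 101):
    grads = [2 * x for x in w]
    w = adamax_step(w, grads, m, u, t, lr=TRAIN_CONFIG["lr"])
    if t == 1:
        first_step = list(w)
final_loss = loss(w)

d3_updates = total_updates(8 * 20, TRAIN_CONFIG["epochs"], TRAIN_CONFIG["batch_size"])
print(d3_updates, initial_loss, final_loss)
```

Because Adamax divides by a running maximum of gradient magnitudes, each parameter moves by at most roughly the learning rate per step, which is why the toy loss decreases slowly but steadily.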

Furthermore, the computational complexity of the proposed method, measured in terms of parameter count and floating-point operations (FLOPs), is comparable to that of lightweight models like MobileNet and ShuffleNet, while its accuracy surpasses that of the more complex ResNet18. The results are visually represented through box plots and confusion matrices, which reveal that the proposed method exhibits superior classification performance. Ablation experiments further confirm the advantages of incorporating frequency-domain features, which enhance the model’s ability to capture invariant characteristics under variable speed conditions, thereby improving diagnostic performance in scenarios with limited training data. The J index is introduced as an additional metric for evaluating clustering characteristics, emphasizing the classification performance beyond mere accuracy.
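The two complexity measures can be illustrated on a single layer. The sketch below counts parameters and multiply-accumulate operations (MACs) for a standard 1D convolution and for a depthwise separable one, the building block that MobileNet relies on. FLOPs are often quoted as roughly twice the MAC count. The layer sizes are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def conv1d_cost(c_in, c_out, kernel, length_out, bias=True):
    """Parameters and MACs of a standard 1D convolution."""
    params = kernel * c_in * c_out + (c_out if bias else 0)
    macs = kernel * c_in * c_out * length_out
    return params, macs


def separable_conv1d_cost(c_in, c_out, kernel, length_out, bias=True):
    """Parameters and MACs of a depthwise conv followed by a 1x1 pointwise conv."""
    dw_params = kernel * c_in + (c_in if bias else 0)
    dw_macs = kernel * c_in * length_out
    pw_params = c_in * c_out + (c_out if bias else 0)
    pw_macs = c_in * c_out * length_out
    return dw_params + pw_params, dw_macs + pw_macs


# Hypothetical layer: 64 -> 128 channels, kernel size 3, output length 512.
std_params, std_macs = conv1d_cost(64, 128, 3, 512)
sep_params, sep_macs = separable_conv1d_cost(64, 128, 3, 512)

print("standard :", std_params, std_macs)
print("separable:", sep_params, sep_macs)
print("MAC ratio:", round(sep_macs / std_macs, 4))
```

For this layer the separable variant needs about a third of the MACs, matching the usual 1/c_out + 1/kernel estimate. Savings of this kind are what make lightweight CNNs a demanding baseline for a Transformer-based model.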

Discussion

The discussion section of the paper elaborates on the architecture and performance of the proposed LSFConvformer model, which integrates a Lightweight Convformer module for mechanical fault diagnosis under variable speed conditions. The model leverages a simplified multi-head attention mechanism termed Global Attention, which reduces computational redundancy while enhancing global feature extraction. The computational complexities of both the multi-head self-attention and feedforward neural network components are analyzed, demonstrating the efficiency of the proposed method compared to traditional Transformers.
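For reference, the cost of the two components of a conventional Transformer block can be estimated as follows. This is the textbook baseline only: the summary does not reproduce the complexity expression for the paper's Global Attention, so nothing here should be read as that formula. The token count, model width, and hidden width are hypothetical, and costs are counted in multiply-accumulate operations.

```python
def mhsa_macs(n_tokens, d_model):
    """MACs of standard multi-head self-attention (independent of head count)."""
    projections = 4 * n_tokens * d_model * d_model   # Q, K, V and output projections
    scores = n_tokens * n_tokens * d_model           # Q K^T across all heads
    weighted_sum = n_tokens * n_tokens * d_model     # attention weights times V
    return projections + scores + weighted_sum


def ffn_macs(n_tokens, d_model, d_hidden):
    """MACs of a two-layer position-wise feedforward network."""
    return 2 * n_tokens * d_model * d_hidden


# Hypothetical sizes: 64 tokens, width 128, hidden width 4 * 128.
n, d = 64, 128
attention_cost = mhsa_macs(n, d)
ffn_cost = ffn_macs(n, d, 4 * d)
print(attention_cost, ffn_cost)

# Doubling the sequence length quadruples only the quadratic attention terms.
print(mhsa_macs(2 * n, d) / attention_cost)
```

The quadratic terms in the token count are the part that lightweight attention variants typically try to shrink, while the feedforward cost grows only linearly with sequence length.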

The LSFConvformer architecture is designed to process time-domain and frequency-domain features separately, followed by a fusion process that enhances feature representation. Experimental results indicate that the proposed method outperforms existing models, achieving high accuracy even with small sample sizes. For instance, in a dataset with only 20 samples per fault type, the model attained an accuracy of 98%. The analysis also highlights the model’s robustness in distinguishing between fault classes, as evidenced by the clustering results from t-SNE visualizations. Overall, the LSFConvformer demonstrates a significant reduction in computational cost while maintaining superior classification performance, making it a promising approach for fault diagnosis in practical engineering applications. Future work will focus on extending the model’s applicability and optimizing its architecture through automated methods.
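One plausible reading of the "Shuffle" fusion step is a ShuffleNet-style channel shuffle applied after concatenating the two branches, so that time-domain and frequency-domain channels end up interleaved. The sketch below shows that operation on plain Python lists. The summary does not describe the internals of the paper's fusion module, so treating it as a channel shuffle, along with the function names and the toy feature values, is an assumption.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups (reshape to groups x n, transpose, flatten)."""
    n = len(channels)
    if n % groups != 0:
        raise ValueError("channel count must be divisible by the number of groups")
    per_group = n // groups
    return [channels[g * per_group + i] for i in range(per_group) for g in range(groups)]


def fuse_time_frequency(time_channels, freq_channels):
    """Concatenate both branches along the channel axis, then shuffle with 2 groups."""
    if len(time_channels) != len(freq_channels):
        raise ValueError("both branches must have the same number of channels")
    stacked = list(time_channels) + list(freq_channels)
    return channel_shuffle(stacked, groups=2)


# Toy features: three channels per branch, two values per channel.
time_channels = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
freq_channels = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

fused = fuse_time_frequency(time_channels, freq_channels)
for channel in fused:
    print(channel)
```

After the shuffle, any grouped operation that follows sees both time-domain and frequency-domain channels in every group, which is the usual motivation for shuffling before further processing.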