Synthetic Data for Predictive Maintenance: A Systematic Review and Framework for Industry 4.0 Applications

Journal: Journal of Intelligent Manufacturing
DOI: https://doi.org/10.1007/s10845-026-02795-6
Publication Date: 2026-01-28
Author(s): Walter Nieminen et al.
Primary Topic: Machine Fault Diagnosis Techniques

Overview

This paper presents a systematic literature review of 86 peer-reviewed articles published since 2020 on synthetic data generation techniques for Predictive Maintenance (PdM) of medium-to-heavy machinery and industrial processes. The review identifies four primary categories of data generation methods: (1) data augmentation, (2) generative models, (3) physics-based simulation and hybrid approaches, and (4) feature-based transformations. It analyzes the strengths and limitations of each category and finds that hybrid and physics-informed models are particularly effective in safety-critical environments, where adherence to physical laws and model transparency are paramount.

The study proposes the Synthetic Data-Enhanced PdM (SD-PdM) framework, a five-phase methodology for integrating synthetic data into maintenance strategies in support of scalable, explainable, and economically viable smart maintenance solutions. Synthetic data generation offers a viable way to overcome challenges such as data scarcity and high costs, but its effectiveness depends on careful selection of methods and on validation processes tailored to the specific application. The authors recommend that future research prioritize empirical validation of the SD-PdM framework, longitudinal case studies, and improvements in model adaptability and explainability, so that synthetic data continues to enhance the reliability and transparency of industrial maintenance strategies.

Introduction

In the introduction, the authors emphasize the significance of Predictive Maintenance (PdM) as a strategy for enhancing operational excellence in industrial systems. They highlight that effective PdM is difficult to implement because of the scarcity and quality of the data needed to train predictive models. To mitigate these issues, the paper discusses the potential of synthetic data generated through advanced computational techniques such as Generative Adversarial Networks (GANs), digital twins, and physics-based simulations. These methods can create high-fidelity datasets that fill gaps in PdM applications, particularly where real-world data is limited or costly to obtain.

The authors systematically analyze 86 peer-reviewed studies to explore the role of synthetic data in supporting maintenance services across industrial sectors, including aerospace and manufacturing. They group synthetic data generation methods into four families: straightforward data augmentation, deep-learning generators, physics-based or hybrid simulations, and feature-engineering transforms. Each family is evaluated for its strengths, weaknesses, and prevalence in the field. The paper introduces the five-phase Synthetic Data-Enhanced PdM (SD-PdM) framework, which illustrates how synthetic data can facilitate innovative service models such as maintenance-as-a-service and performance-based contracts, ultimately promoting more sustainable and cost-effective maintenance practices. The review also identifies existing research gaps and emerging trends that could reduce the industry's reliance on scarce real-world failure data.
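To make the first family (straightforward data augmentation) concrete, the following is a minimal sketch of two common time-series perturbations, jittering and amplitude scaling. It is an illustration only and is not taken from the reviewed studies: the function names, the noise level, the gain range, and the sample window are all assumptions.

```python
import random

def jitter(signal, sigma=0.05, rng=None):
    """Add zero-mean Gaussian noise to every sample (sigma is an assumed value)."""
    rng = rng or random.Random()
    return [x + rng.gauss(0.0, sigma) for x in signal]

def scale(signal, low=0.9, high=1.1, rng=None):
    """Multiply the whole window by a single random gain drawn from [low, high]."""
    rng = rng or random.Random()
    gain = rng.uniform(low, high)
    return [x * gain for x in signal]

def augment(signal, n_copies=3, seed=0):
    """Return n_copies perturbed variants of one sensor window."""
    rng = random.Random(seed)
    return [scale(jitter(signal, rng=rng), rng=rng) for _ in range(n_copies)]

if __name__ == "__main__":
    window = [0.10, 0.12, 0.11, 0.15, 0.14]  # hypothetical vibration amplitudes
    for variant in augment(window):
        print([round(v, 3) for v in variant])
```

Perturbations of this kind are cheap, but they only produce variations of behavior that was already observed, which may be one reason the review treats augmentation separately from generative and physics-based methods.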

Methods

The Methods section discusses how synthetic datasets are generated through physics-based and hybrid methods, emphasizing their importance where real data is difficult to acquire. Physics-based methods use computational models to replicate real-world behavior, which makes it possible to generate large datasets and to simulate rare events. Digital twins, in particular, serve as virtual models that mirror real systems, enhancing data quality and providing insight into the underlying processes. Studies such as Burger et al. (2022) and Harries et al. (2023) illustrate the use of digital twins to model operations and to generate run-to-failure data for predictive maintenance.
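As a rough illustration of physics-based run-to-failure generation, the sketch below integrates the Paris-Erdogan fatigue crack growth law, da/dN = C (ΔK)^m with ΔK = Δσ √(π a), until the crack reaches a critical length. This is not the model used by any of the cited studies. The geometry factor is assumed to be 1, and every parameter value (initial crack, critical crack, C, m, stress range, sensor noise) is hypothetical.

```python
import math
import random

def simulate_run_to_failure(a0=1e-3, a_crit=2e-2, C=1e-11, m=3.0,
                            delta_sigma=100.0, cycles_per_step=1000,
                            noise_sd=1e-4, seed=0, max_steps=100_000):
    """Generate one synthetic run-to-failure trajectory.

    Units (assumed): crack length in m, stress range in MPa,
    C in m/cycle per (MPa*sqrt(m))**m. Returns (cycle, observed_length) pairs.
    """
    rng = random.Random(seed)
    a = a0
    trajectory = []
    for step in range(max_steps):
        # The "sensor" reading is the true crack length plus Gaussian noise.
        observed = a + rng.gauss(0.0, noise_sd)
        trajectory.append((step * cycles_per_step, observed))
        if a >= a_crit:
            break  # failure threshold reached; the run ends here
        # Paris-Erdogan law, forward-Euler step over a block of load cycles.
        delta_k = delta_sigma * math.sqrt(math.pi * a)
        a += C * delta_k ** m * cycles_per_step
    return trajectory

if __name__ == "__main__":
    run = simulate_run_to_failure(seed=42)
    print("steps recorded:", len(run))
    print("final sample (cycle, crack length in m):", run[-1])
```

Varying the seed, the initial crack length, or the stress range yields a family of trajectories, including rare fast-failure cases that would be expensive to capture experimentally.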

Hybrid models combine physics-based simulation with data-driven approaches and address the limitations of each by incorporating domain expertise, yielding more accurate and interpretable results. This integration is particularly beneficial for complex systems where standalone physics-based models may fall short. However, developing these models requires substantial domain knowledge and incurs high computational costs. The section also highlights the need for validation methods to evolve as AI and machine learning are used more widely in predictive maintenance. Traditional validation methods, grounded in deterministic physics-based frameworks, must adapt to the probabilistic nature of AI models. Future validation approaches should focus on contextual realism and operational plausibility, moving beyond purely statistical metrics to ensure that synthetic data accurately reflects real-world operating scenarios.
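For reference, the kind of purely statistical check that the paragraph above describes as insufficient on its own can be sketched as a two-sample Kolmogorov-Smirnov statistic, the largest gap between the empirical distribution functions of the real and synthetic samples. The function name and the sample readings below are assumptions made for illustration.

```python
import bisect

def ks_statistic(real, synthetic):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the maximum absolute difference between the two empirical CDFs:
    0.0 means identical empirical distributions, 1.0 means no overlap.
    """
    real_sorted = sorted(real)
    synth_sorted = sorted(synthetic)
    max_gap = 0.0
    # The supremum of the gap is attained at one of the observed values.
    for x in real_sorted + synth_sorted:
        cdf_real = bisect.bisect_right(real_sorted, x) / len(real_sorted)
        cdf_synth = bisect.bisect_right(synth_sorted, x) / len(synth_sorted)
        max_gap = max(max_gap, abs(cdf_real - cdf_synth))
    return max_gap

if __name__ == "__main__":
    real_temps = [61.2, 63.5, 62.8, 64.1, 65.0, 63.9]       # hypothetical readings
    synthetic_temps = [60.9, 63.1, 63.0, 64.4, 66.2, 64.0]  # hypothetical generator output
    print("KS statistic:", ks_statistic(real_temps, synthetic_temps))
```

A small statistic indicates that the marginal distributions agree, but it says nothing about temporal ordering or physical feasibility, which is the gap that contextual and operational validation is meant to cover.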

Results

The Results section analyzes the synthetic data generation methods used in industrial applications. It assesses the current state of these methods, highlighting their effectiveness and relevance in different contexts, and compares the proposed framework with existing approaches to show its advantages and its potential to improve data-driven decision-making.

The assessment of improvements attributed to synthetic data points to gains in operational efficiency and accuracy. The findings underscore the value of integrating synthetic data generation into industrial practice and suggest that such methods can lead to more robust analytical outcomes and improved performance metrics.

Discussion

The Discussion section traces the evolution of synthetic data generation in predictive maintenance (PdM) and its relevance across domains, particularly in industrial settings. The scarcity of failure data in industrial environments has prompted the use of digital twins and advanced machine learning techniques, such as Generative Adversarial Networks (GANs) and Support Vector Machines (SVMs), to simulate realistic failure scenarios. However, a significant challenge remains: ensuring that synthetic data adheres to physical laws, since purely data-driven approaches may yield unrealistic operational scenarios. The paper emphasizes the need for a lifecycle-based framework that integrates synthetic data generation into the entire maintenance workflow, addressing the dynamic nature of industrial operations and overcoming the limitations of static implementations.
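One simple way to guard against physically unrealistic synthetic samples is to screen generator output with explicit plausibility rules. The sketch below is an assumption-laden illustration, not a method from the paper: it checks a temperature trace against a hypothetical operating range and a hypothetical maximum rate of change, standing in for whatever physical constraints apply to the real asset.

```python
def is_physically_plausible(temps_c, min_c=-40.0, max_c=150.0,
                            max_rate_c_per_s=5.0, dt_s=1.0):
    """Return True if a temperature trace respects the assumed physical limits."""
    # Range check: every reading must lie inside the assumed operating envelope.
    if any(t < min_c or t > max_c for t in temps_c):
        return False
    # Rate check: thermal inertia limits how fast temperature can change.
    for prev, cur in zip(temps_c, temps_c[1:]):
        if abs(cur - prev) / dt_s > max_rate_c_per_s:
            return False
    return True

def filter_plausible(samples, **limits):
    """Keep only the synthetic traces that pass the plausibility rules."""
    return [s for s in samples if is_physically_plausible(s, **limits)]

if __name__ == "__main__":
    generated = [
        [70.0, 71.5, 73.0, 74.2],   # smooth warm-up
        [70.0, 95.0, 71.0, 70.5],   # jump of 25 degrees in one second
        [70.0, 72.0, 180.0, 75.0],  # outside the assumed operating range
    ]
    print(filter_plausible(generated))
```

Rule-based screening of this kind is only a coarse filter. Hybrid and physics-informed generators aim to build such constraints into the generation step itself rather than rejecting samples afterwards.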

Through their systematic literature review, the authors identify six key data challenges in industrial PdM: scarcity of run-to-failure data, data imbalance, high-cost experiments, operational challenges, missing data, and noise. These challenges are interconnected and therefore call for integrated solutions. The proposed Synthetic Data-Enhanced PdM (SD-PdM) framework systematically embeds synthetic data generation across all maintenance phases, enabling adaptive feedback loops that refine the data as operating conditions evolve. The framework aims to provide a comprehensive solution for improving the reliability and scalability of PdM strategies in modern industrial environments, addressing critical gaps in existing standards and enhancing the overall effectiveness of predictive maintenance practices.
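Of the six challenges, data imbalance is perhaps the easiest to illustrate in code. The sketch below creates extra failure-class samples by linear interpolation between randomly chosen pairs of real failure samples. It is a simplified relative of SMOTE (which interpolates toward nearest neighbors rather than random partners) and is not the procedure of the SD-PdM framework. The feature vectors and function name are hypothetical.

```python
import random

def oversample_minority(minority, n_new, seed=0):
    """Create n_new synthetic samples on line segments between minority pairs.

    minority: list of equal-length feature vectors (at least two are required).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)  # two distinct real failure samples
        t = rng.random()                # interpolation weight in [0, 1)
        synthetic.append([x + t * (y - x) for x, y in zip(a, b)])
    return synthetic

if __name__ == "__main__":
    # Hypothetical failure-class features: [vibration RMS, bearing temperature in C]
    failures = [[0.82, 91.0], [0.95, 97.5], [0.88, 94.2]]
    for sample in oversample_minority(failures, n_new=4):
        print([round(v, 3) for v in sample])
```

Because every new point lies between two observed failures, this approach rebalances the classes without extrapolating beyond the failure modes already seen, so it does not by itself address the scarcity of run-to-failure data.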
