A hybrid stacked autoencoder and support vector machines-based expert system for heart failure detection

Journal: Scientific Reports, Volume: 16, Issue: 1
DOI: https://doi.org/10.1038/s41598-025-34430-4
PMID: https://pubmed.ncbi.nlm.nih.gov/41507422
Publication Date: 2026-01-08
Author(s): Mian Muhammad Kamal et al.
Primary Topic: Artificial Intelligence in Healthcare

Overview

The paper presents a hybrid three-stage expert system for detecting heart failure (HF). A stacked autoencoder (AE) first extracts features from HF risk factors; an L1-penalized support vector machine (SVM) then selects a subset of those features; and a non-linear SVM with a radial basis function (RBF) kernel classifies the refined feature set. On a benchmark HF dataset, the system reports an accuracy of 97.78%, a sensitivity of 97.56%, a specificity of 97.96%, and a Matthews correlation coefficient (MCC) of 0.955, which the authors describe as outperforming existing state-of-the-art methods.

The conclusion emphasizes the system's efficiency: the final classifier operates on a reduced subset of 11 features, which lowers computational complexity. The authors acknowledge one limitation, the lack of validation on larger, multi-center datasets. Future research should gather diverse data from multiple clinical settings to test the generalizability of the proposed models rigorously, with real-time clinical deployment as the eventual goal. The authors position the work as an advance in HF detection that offers decision support for healthcare professionals.

Introduction

The introduction of the paper addresses heart failure (HF), a condition marked by the heart’s inability to pump sufficient blood due to coronary artery issues. Symptoms include shortness of breath and swelling, with diagnostic challenges prevalent in underdeveloped regions due to limited medical resources. Traditional diagnostic methods, particularly angiography, are costly and require specialized skills, prompting a shift towards machine learning and data mining techniques for HF classification. Various expert systems have been developed, employing algorithms such as k-nearest neighbors (KNN), support vector machines (SVM), and neural networks, achieving detection accuracies ranging from 84.5% to 93.3%.

The paper highlights the persistent issue of overfitting in existing models, where accuracy on the training data and accuracy on the test data diverge. To address this, the authors propose a hybrid intelligent system that combines a stacked autoencoder for feature extraction with an L1-penalized linear SVM and a non-linear SVM for HF detection. The study aims to improve both training and testing accuracy, and it evaluates the system with a train-test split and with k-fold cross-validation. The remaining sections of the paper cover the dataset, the evaluation metrics, the experimental results, and a discussion of the findings, and they close with a summary of contributions.
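
To make the k-fold idea concrete, here is a minimal sketch of how k-fold cross-validation partitions a dataset so that every case is used for testing exactly once. The function name `k_fold_indices`, the choice of k = 10, and the seed are assumptions for illustration; the sample count of 297 is the dataset size given in the Discussion. The sketch does not stratify by class, which a real evaluation might also want.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

# k = 10 is an assumed value; the summary does not state the paper's k.
folds = list(k_fold_indices(n_samples=297, k=10))
fold_sizes = [len(test) for _, test in folds]
print(fold_sizes)
# prints: [30, 30, 30, 30, 30, 30, 30, 29, 29, 29]
```

Averaging a metric over the held-out folds gives a less split-dependent estimate than a single train-test partition, which matters on a dataset this small.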

Methods

The “Materials and Methods” section describes the experimental design. Because the study is computational, its materials are data rather than laboratory samples: the Cleveland Heart Disease dataset, with 297 complete cases and 13 clinically relevant features. The methodology is the three-stage pipeline itself: a stacked autoencoder for feature extraction, an L1-penalized SVM for feature selection, and an RBF-kernel SVM for classification.

The models are evaluated with a train-test split and with k-fold cross-validation, using accuracy, sensitivity, specificity, and MCC as metrics.

Results

In this section, the authors present results from three experimental setups for heart failure (HF) detection. The first establishes a baseline with a single support vector machine (SVM). The second stacks two SVMs to improve detection. The third is the proposed expert system, which combines stacked autoencoders with L1- and L2-regularized SVMs. Together the three setups form an ablation study that shows what each component contributes to the three-stage hybrid HF detection system.
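
For reference, the difference between the L1 and L2 penalties can be written out with the textbook linear SVM objectives. This is a sketch in standard notation, not the paper's own formulation: the symbols (feature vector x_i, label y_i in {-1, +1}, weight vector w, bias b, trade-off constant C) and the plain hinge loss are assumptions, since the summary does not state the exact loss used.

```latex
% L1-penalized linear SVM: the penalty drives some weights to exactly zero
\min_{\mathbf{w},\, b} \; \lVert \mathbf{w} \rVert_1
  + C \sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i(\mathbf{w}^{\top}\mathbf{x}_i + b)\bigr)

% L2-penalized linear SVM: the penalty shrinks weights but rarely zeroes them
\min_{\mathbf{w},\, b} \; \tfrac{1}{2}\lVert \mathbf{w} \rVert_2^2
  + C \sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i(\mathbf{w}^{\top}\mathbf{x}_i + b)\bigr)
```

Features whose weight is zero under the L1 objective can be dropped, which is why an L1-penalized SVM can act as a feature selector.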

The autoencoder used in the experiments has an input layer of 13 nodes, a hidden layer of 32 nodes, and a latent representation layer of 13 nodes. The decoder mirrors the encoder with layers of 32 and 13 nodes. All hidden layers use the ReLU activation function, and the model was trained for 50 epochs. The experiments ran on a machine with an Intel Core Ultra 7 155H processor and 16 GB of RAM under 64-bit Windows 11 Pro, using Python and its libraries for simulation and analysis.
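
The layer sizes above can be sketched as a plain feed-forward pass. The code below is illustrative only: it assumes fully connected layers with biases and a linear output layer, uses untrained random weights, and omits the 50-epoch training loop entirely. The helper names (`init_layers`, `dense`, `forward`) are hypothetical, not the authors' code.

```python
import random

LAYER_SIZES = [13, 32, 13, 32, 13]  # input, hidden, latent, hidden, output (from the text)

def init_layers(sizes, seed=0):
    """Create (weights, biases) per dense layer with small random weights."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers

def dense(x, weights, biases, relu):
    """One fully connected layer, optionally followed by ReLU."""
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [max(0.0, o) for o in out] if relu else out

def forward(x, layers):
    """Return (latent, reconstruction) for one 13-feature input vector."""
    latent = None
    for i, (weights, biases) in enumerate(layers):
        is_last = i == len(layers) - 1
        x = dense(x, weights, biases, relu=not is_last)  # linear output layer (assumption)
        if i == 1:
            latent = x  # output of the encoder's 13-node latent layer
    return latent, x

layers = init_layers(LAYER_SIZES)
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in layers)
latent, recon = forward([0.5] * 13, layers)
print(n_params, len(latent), len(recon))
# prints: 1754 13 13
```

Under these assumptions the network has 448 + 429 + 448 + 429 = 1754 trainable parameters. After training to reconstruct its input, the 13-value latent vector is what would be passed on to the feature-selection stage.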

Discussion

In this study, the authors developed a three-stage expert system for heart failure (HF) detection using the Cleveland Heart Disease dataset, which comprises 297 complete cases and 13 clinically relevant features. The proposed method uses a stacked autoencoder (AE) for feature extraction, then an L1-penalized support vector machine (SVM) for feature selection, and finally a non-linear SVM for classification. The stacked AE learns a new representation of the features, while the L1-SVM promotes sparsity in that representation and so identifies the variables most relevant to classification. The final non-linear SVM, which uses a radial basis function (RBF) kernel, achieved 97.78% accuracy, 97.56% sensitivity, and 97.96% specificity, results the authors report as better than those of existing methods.
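
The RBF kernel used by the final classifier has the standard form K(x, z) = exp(-gamma * ||x - z||^2). The sketch below evaluates it on toy vectors to show how similarity decays with distance. The vectors and the value of `gamma` are assumptions for illustration; the summary does not report the paper's hyperparameters.

```python
import math

def rbf_kernel(x, z, gamma):
    """K(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

# Toy 3-feature vectors (hypothetical); the real classifier sees the selected features.
a = [0.0, 1.0, 2.0]
b = [0.0, 1.0, 2.5]
c = [3.0, -1.0, 0.0]
gamma = 0.5  # assumed value; not taken from the paper

print(rbf_kernel(a, a, gamma))            # identical points: 1.0
print(round(rbf_kernel(a, b, gamma), 4))  # nearby points: 0.8825
print(rbf_kernel(a, c, gamma) < 1e-3)     # distant points fall toward 0: True
```

Because the kernel depends only on distances between points, the SVM can draw a curved decision boundary in the selected-feature space without constructing non-linear features explicitly.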

Despite the promising outcomes, the study acknowledges limitations, particularly the small dataset size, which may affect the generalizability of the model in real-world clinical settings. Future research should aim to collect larger, multi-center datasets to validate the model’s performance comprehensively. Additionally, implementing A-Test validation could provide a more robust assessment of the model’s generalization capabilities. Overall, the proposed expert system represents a significant advancement in HF detection, emphasizing the importance of integrating advanced machine learning techniques for improved clinical outcomes.