A dynamic weighted ensemble learning framework for cardiovascular risk prediction in type 2 diabetes: a comparative study with SHAP-based interpretability

Journal: Scientific Reports, Volume: 15, Issue: 1
DOI: https://doi.org/10.1038/s41598-025-28786-w
PMID: https://pubmed.ncbi.nlm.nih.gov/41274943
Publication Date: 2025-11-22
Author(s): ChunHong Yuan et al.
Primary Topic: Artificial Intelligence in Healthcare

Overview

The paper addresses the public health challenge posed by diabetes mellitus, particularly its cardiovascular complications, which are a leading cause of death among affected individuals. It presents a predictive model for assessing cardiovascular risk in patients with type 2 diabetes and acknowledges several limitations that must be addressed before the model can be applied reliably.

Key limitations include the reliance on internal validation from a single medical center, which raises concerns about the model’s generalizability to diverse patient populations. The absence of a calibration assessment further complicates interpretation: the area under the receiver operating characteristic curve (AUC-ROC) and the area under the precision-recall curve (AUC-PR) measure how well the model ranks patients, but they do not show whether predicted probabilities match actual risk levels. The authors recommend that future studies incorporate calibration curves and Brier scores for a more robust evaluation. Additionally, the cross-sectional design prevents causal inference, so prospective longitudinal studies are needed to validate the model’s predictions over time. Lastly, while the dynamic weighted ensemble approach improves performance, its computational complexity may hinder real-time use in resource-constrained environments.
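The distinction between discrimination and calibration can be illustrated with a minimal sketch. The labels and probabilities below are made-up values, not data from the study, and the function names are hypothetical. A monotone transformation of the predicted probabilities leaves the ranking (and therefore AUC-ROC) unchanged while the Brier score gets worse.

```python
def auc_roc(y_true, y_prob):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def calibration_bins(y_true, y_prob, n_bins=5):
    """Per non-empty equal-width bin: (mean predicted risk, observed event rate, count)."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls into the last bin
        bins[idx].append((y, p))
    return [
        (sum(p for _, p in b) / len(b), sum(y for y, _ in b) / len(b), len(b))
        for b in bins
        if b
    ]


# Illustrative values only (assumption, not study data).
y_true = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]
original = [0.1, 0.2, 0.2, 0.3, 0.4, 0.4, 0.6, 0.7, 0.5, 0.9]
inflated = [p ** 0.25 for p in original]  # same ranking, systematically higher risks

for name, probs in (("original", original), ("inflated", inflated)):
    print(name, "AUC:", auc_roc(y_true, probs), "Brier:", round(brier_score(y_true, probs), 3))
```

Plotting the pairs returned by `calibration_bins` against the diagonal gives a basic calibration curve.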

Introduction

The introduction highlights the growing public health concern surrounding type 2 diabetes and its cardiovascular complications, which are linked to complex pathological mechanisms such as chronic inflammation and metabolic disorders. Traditional risk assessment models, which rest primarily on linear assumptions, struggle to identify high-risk patients accurately because they cannot capture nonlinear interactions among patient indicators. To address this, the researchers conducted a cross-sectional study of 2,895 patients with type 2 diabetes at the Second Affiliated Hospital of Tianjin University of Traditional Chinese Medicine, collecting data through interviews, questionnaires, physical measurements, and laboratory tests.

The study used a dynamic weighted ensemble learning model that integrates multiple algorithms, including random forest, gradient boosting, and XGBoost, and combines diagnostic indexes from both Traditional Chinese Medicine (TCM) and Western medicine with modern biomarkers to improve cardiovascular risk prediction. Interpretability was provided by SHapley Additive exPlanations (SHAP) values, which quantify each variable’s contribution to the model’s output. The findings indicate that the ensemble model outperforms single-algorithm approaches, achieving a specificity of 0.9621 and a sensitivity of 0.9492 in risk prediction. Notably, chest tightness and erythrocyte sedimentation rate (ESR) emerged as significant predictive features. These features are often overlooked in traditional models, which underscores their relevance for early detection and intervention in high-risk patients.
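The summary does not specify how the ensemble weights are derived, so the following is only a sketch of one plausible scheme, not the authors' method. It assumes each base model's weight comes from a softmax over its validation score and could be recomputed whenever new validation results arrive. The function names, the `temperature` parameter, and all numbers are hypothetical.

```python
import math


def softmax_weights(val_scores, temperature=0.05):
    """Map per-model validation scores to weights that sum to 1 (assumed scheme)."""
    best = max(val_scores.values())  # subtract the max for numerical stability
    exps = {name: math.exp((s - best) / temperature) for name, s in val_scores.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


def ensemble_predict(model_probs, weights):
    """Weighted average of the base models' predicted probabilities per patient."""
    n_patients = len(next(iter(model_probs.values())))
    return [
        sum(weights[name] * probs[i] for name, probs in model_probs.items())
        for i in range(n_patients)
    ]


# Illustrative numbers only (assumption, not results from the study).
val_auc = {"random_forest": 0.90, "gradient_boosting": 0.92, "xgboost": 0.94}
model_probs = {
    "random_forest": [0.20, 0.70, 0.55],
    "gradient_boosting": [0.10, 0.80, 0.45],
    "xgboost": [0.15, 0.90, 0.60],
}

weights = softmax_weights(val_auc)
risk = ensemble_predict(model_probs, weights)
print(weights)
print(risk)
```

A lower `temperature` concentrates weight on the best-scoring model, while a higher one moves the ensemble toward a plain average.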

Methods

In this study, a comprehensive multi-dimensional evaluation system was developed to systematically assess model performance. The evaluation utilized conventional classification metrics, including classification accuracy, sensitivity, specificity, and F1 scores, which provide insights into the model’s predictive capabilities from various angles. Additionally, graphical tools such as the receiver operating characteristic (ROC) curve and precision-recall (PR) curve were employed to quantify overall performance, with the area under the curve (AUC) serving as a key metric.

The evaluation metrics were computed using the following equations:

– Accuracy: \( \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \)
– Sensitivity: \( \text{Sensitivity} = \frac{TP}{TP + FN} \)
– Specificity: \( \text{Specificity} = \frac{TN}{TN + FP} \)
– F1 Score: \( \text{F1 Score} = \frac{2 \cdot (\text{Precision} \cdot \text{Recall})}{\text{Precision} + \text{Recall}} \)

where \( TP \) denotes true positives, \( TN \) true negatives, \( FP \) false positives, and \( FN \) false negatives. In the F1 score, precision is \( \frac{TP}{TP + FP} \) and recall is identical to sensitivity. To make the evaluation more robust, 5-fold cross-validation was used so that the results did not depend on a single data partition. An independent test set was then used to check the model’s generalization ability, confirming its performance on unseen data through ROC and PR curves and the corresponding AUC values.
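The four formulas above can be computed directly from confusion-matrix counts. The sketch below uses made-up counts (not figures from the study) and a hypothetical function name, and it assumes every denominator is non-zero.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity and F1 from confusion-matrix counts.

    Assumes all denominators are non-zero.
    """
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=80, tn=90, fp=10, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```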

Results

The results section highlights the significant cardiovascular risk faced by diabetic patients compared to non-diabetic individuals, underscoring the necessity for effective risk prediction models for early intervention. Traditional models, such as the Framingham Risk Score and the UK Prospective Diabetes Study (UKPDS) Risk Engine, have been shown to overestimate cardiovascular risk in diabetic patients by approximately 1.7 to 2 times, with C-statistics ranging from 0.57 to 0.71. This indicates a need for adjustments to these classical methods to improve their generalization ability.

To address these limitations, the American College of Cardiology/American Heart Association (ACC/AHA) introduced the Pooled Cohort Equations, which aim to predict atherosclerotic cardiovascular disease (ASCVD) risk more accurately across diverse populations. However, calibration bias persists within diabetic subgroups. Recent studies have increasingly used machine learning to improve cardiovascular risk prediction for diabetic patients. Notably, the RECODe risk prediction equation has demonstrated better calibration than the UKPDS and ACC/AHA models. Additionally, machine learning approaches, including ensemble learning algorithms and deep learning models, have shown promise in outperforming traditional calculators, as evidenced by a novel ASCVD risk calculator that surpassed the ACC/AHA model in predicting cardiovascular risk within the Action to Control Cardiovascular Risk in Diabetes (ACCORD) cohort.

Discussion

The discussion section highlights the evolving landscape of cardiovascular risk prediction in diabetes, emphasizing the integration of Traditional Chinese Medicine (TCM) diagnostic indexes with modern machine learning techniques. Key findings from related studies, such as the ACCORD and ADVANCE trials, indicate that intensive glucose control may not benefit all diabetic patients, underscoring the need for personalized risk stratification. Machine learning models, particularly those using gradient forest algorithms, have shown promise in predicting mortality from factors such as age, body mass index (BMI), and hemoglobin glycation index (HGI). However, the generalizability of these models remains a concern, and multi-center validation is needed to ensure applicability across diverse populations.

Furthermore, the paper discusses the application of TCM diagnostic methods, such as tongue diagnosis, in assessing diabetes and its complications. Recent advancements in digital analysis of tongue features have revealed significant correlations with macrovascular complications, suggesting that TCM diagnostic indexes can provide valuable insights into disease progression. The integration of TCM with modern biomarkers through machine learning approaches has the potential to enhance predictive accuracy and facilitate personalized intervention strategies. Despite these advancements, challenges remain, including the need for effective data fusion from heterogeneous sources, improving model interpretability, and ensuring clinical applicability through rigorous validation. The study aims to address these challenges by developing a robust, integrated model that leverages both TCM and Western medical perspectives to enhance cardiovascular risk assessment in Type 2 diabetes patients.