Explainable artificial intelligence for early Alzheimer’s diagnosis using enhanced grey relational features and multimodal data

Journal: Scientific Reports, Volume: 16, Issue: 1
DOI: https://doi.org/10.1038/s41598-026-43707-1
PMID: https://pubmed.ncbi.nlm.nih.gov/41844810
Publication Date: 2026-03-17
Author(s): Wusat Ullah et al.
Primary Topic: Explainable Artificial Intelligence (XAI)

Overview

The paper addresses Alzheimer’s disease, a neurodegenerative disorder whose prevalence is rising and which remains difficult to diagnose early. The authors propose an interpretable machine learning architecture that draws on multimodal clinical and behavioral data: demographics, vascular risk factors, lifestyle choices, and cognitive assessments. To improve predictive performance while preserving interpretability, the study engineers composite features such as a blood pressure ratio, an MMSE-age ratio, a cholesterol ratio, and a cognitive decline score. The Synthetic Minority Oversampling Technique (SMOTE) is used to address class imbalance, and a novel Grey Relational Grade (GRG) index is introduced that raises the reported feature-diagnosis correlation from 0.725 to 0.891.
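The oversampling step mentioned above can be illustrated with a minimal, standard-library sketch of the core SMOTE idea: pick a minority-class sample, pick one of its nearest minority-class neighbours, and place a synthetic point on the segment between them. This is not the authors' implementation; the function name `smote_sketch`, its parameters, and the toy (MMSE score, age) values are all hypothetical and chosen only for illustration.

```python
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    Hypothetical helper for illustration; requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        # Choose one of the k nearest neighbours at random.
        neighbour = rng.choice(others[:k])
        # The synthetic point lies on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy minority class: invented (MMSE score, age) pairs, not data from the paper.
minority = [[22.0, 71.0], [20.0, 75.0], [24.0, 69.0], [19.0, 80.0]]
new_points = smote_sketch(minority, n_new=6, k=2)
```

Because every synthetic point is an interpolation between two real minority samples, it stays inside the range already covered by the minority class. In practice, oversampling of this kind is usually applied to the training split only, so that synthetic points do not leak into evaluation data.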

In a comparative analysis of seven mainstream classifiers, including Logistic Regression, Random Forest, and Deep Neural Networks, the authors find that the Deep Neural Network performs best, with an accuracy of 98.01% and an AUC of 99.43%. The CatBoost-based stacking ensemble follows closely, with an accuracy of 97.91% and an AUC of 98.10%. Shapley Additive Explanations (SHAP) make the models more interpretable by identifying key predictors such as family history, smoking, and early cognitive symptoms, some of which (smoking, for example) are modifiable. Overall, the study shows that combining the enhanced Grey Relational Grade features with robust machine learning techniques can yield accurate and interpretable models for the early diagnosis of Alzheimer’s disease.

Introduction

Alzheimer’s disease (AD) is the most common cause of dementia, a condition that affects more than 55 million people worldwide. AD is characterized by the accumulation of abnormal protein aggregates, namely amyloid plaques and neurofibrillary tangles, which disrupt neuronal function and lead to cell death. Mutations in genes such as APP, PS1, and PS2 account for only a minority of cases; the disease is multifactorial, involving biological, genetic, and lifestyle factors, which complicates early diagnosis. Current diagnostic methods, including cognitive assessments and brain imaging, are limited in their ability to detect AD in its initial stages. This study proposes a framework that integrates biological and behavioral markers to support early intervention and prevention through modifiable lifestyle factors.

Recent advances in Machine Learning (ML) and Deep Learning (DL) offer significant opportunities to improve AD diagnosis by analyzing complex, high-dimensional data. Traditional ML algorithms, though interpretable, struggle with the nonlinear relationships inherent in neurodegenerative diseases. To address this limitation, the study employs advanced ensemble methods such as XGBoost and LightGBM, alongside grey relational analysis to identify the key influencing factors. It evaluates seven models, including logistic regression and several ensemble methods, and handles class imbalance with SMOTE. The findings highlight memory impairment, behavioral problems, and cardiovascular health as critical predictive factors, and the framework uses SHAP values to quantify feature importance, which improves both accuracy and transparency in understanding AD risk.
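To make the grey relational analysis step more concrete, the following sketch computes the classical (Deng) grey relational grade between a reference sequence, here the diagnosis label, and each candidate feature. The summary does not give the formula for the paper's enhanced GRG index, so this is the standard textbook version only, with the conventional distinguishing coefficient rho = 0.5. The feature names and values are hypothetical.

```python
def min_max(values):
    """Scale a sequence to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def grey_relational_grades(reference, features, rho=0.5):
    """Classical grey relational grade of each feature against the reference."""
    ref = min_max(reference)
    # Absolute differences between the normalized reference and each feature.
    deltas = {
        name: [abs(r - x) for r, x in zip(ref, min_max(column))]
        for name, column in features.items()
    }
    all_deltas = [d for column in deltas.values() for d in column]
    d_min, d_max = min(all_deltas), max(all_deltas)
    if d_max == 0:
        # Every feature coincides with the reference after normalization.
        return {name: 1.0 for name in deltas}
    grades = {}
    for name, column in deltas.items():
        # Grey relational coefficient per sample, then averaged into a grade.
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in column]
        grades[name] = sum(coeffs) / len(coeffs)
    return grades


# Invented toy data: diagnosis labels and two hypothetical features.
diagnosis = [0, 0, 1, 1, 1, 0]
features = {
    "memory_complaints": [0, 0, 1, 1, 1, 0],
    "sleep_quality": [7.0, 6.0, 6.5, 4.0, 8.0, 5.0],
}
grades = grey_relational_grades(diagnosis, features)
```

A grade close to 1 means the feature's normalized sequence tracks the reference closely, so features can be ranked by their grade. In this toy example the feature that matches the diagnosis exactly receives the maximum grade, while the unrelated one scores lower.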

Methods

The study uses a structured multimodal analysis framework that combines traditional statistical learning, machine learning, and deep learning. By integrating these approaches, the authors aim to build an enhanced model that exploits the complementary strengths of each family of methods and achieves better predictive performance, as illustrated in Figure 4 of the paper.

Results

The comparison of five machine learning models, namely Random Forest Classifier (RFC), XGBoost (XGB), LightGBM (LGBM), CatBoost, and Logistic Regression, shows substantial variation in predictive performance on the classification task. Logistic Regression scored lowest, with accuracy and ROC-AUC reported at 80.2%, a result attributed to its linear decision boundary, which struggled with complex features. The tree-based models did better: CatBoost achieved the highest reported accuracy and ROC-AUC at 91.9%, followed closely by LGBM (91.2%) and XGB (91.0%). Random Forest also performed well at 89.2%, which illustrates the effectiveness of ensemble learning in capturing non-linear patterns. In the stacking experiments, the ensemble with a non-linear CatBoost meta-learner outperformed the one with a linear Logistic Regression meta-learner, an advantage that matters in high-stakes medical decision-making.
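Since the results above are quoted in terms of both accuracy and ROC-AUC, a short sketch of how the two metrics are computed may help in reading them. Accuracy depends on a decision threshold, whereas ROC-AUC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The labels and scores below are invented for illustration and are not results from the paper.

```python
def accuracy(y_true, y_score, threshold=0.5):
    """Fraction of samples whose thresholded score matches the true label."""
    predictions = [1 if s >= threshold else 0 for s in y_score]
    return sum(p == t for p, t in zip(predictions, y_true)) / len(y_true)


def roc_auc(y_true, y_score):
    """Pairwise (rank-based) ROC-AUC; tied scores count as half a win."""
    pos = [s for s, t in zip(y_score, y_true) if t == 1]
    neg = [s for s, t in zip(y_score, y_true) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Invented toy predictions for eight patients (1 = AD, 0 = no AD).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]

acc = accuracy(y_true, y_score)  # 6 of 8 correct at threshold 0.5 -> 0.75
auc = roc_auc(y_true, y_score)   # 15 of 16 positive/negative pairs ordered correctly -> 0.9375
```

The toy example shows that the two metrics need not coincide: the same scores give an accuracy of 0.75 and a ROC-AUC of 0.9375, so a single percentage quoted for both should be read with care.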

The study used five-fold stratified cross-validation. Classification accuracy across folds ranged from 84.62% to 86.51%, with an overall mean of 85.11% and a standard deviation of 0.71 percentage points, indicating stable performance across data splits. In addition, refinements to the deep neural network (DNN) markedly improved performance: the baseline DNN achieved an AUC of 0.9746, which rose to 0.9889 after advanced regularization and hyperparameter tuning. This progression underscores the role of appropriate regularization and class balancing in medical AI applications and places the DNN in the top tier of diagnostic systems, with a reported error rate of only 1.11% (a figure equal to 1 - 0.9889, that is, one minus the tuned AUC).
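The stratified splitting described above can be sketched with the standard library alone. The idea is to deal the sample indices of each class round-robin into the folds, so that every fold keeps roughly the same class ratio as the full dataset. The helper name `stratified_folds` and the toy label vector are hypothetical; the paper's actual tooling is not specified in this summary.

```python
import random


def stratified_folds(labels, n_splits=5, seed=0):
    """Split sample indices into n_splits folds that preserve class ratios."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_splits)]
    # Group sample indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets its share.
        for position, idx in enumerate(indices):
            folds[position % n_splits].append(idx)
    return folds


# Invented toy labels: 20 negative and 10 positive cases.
labels = [0] * 20 + [1] * 10
folds = stratified_folds(labels)

# Each fold serves once as the test set; the remaining folds form the training set.
splits = []
for k, test_idx in enumerate(folds):
    train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
    splits.append((train_idx, test_idx))
```

With this toy label vector each fold holds four negative and two positive cases, matching the 2:1 ratio of the full set. A model would then be trained and scored once per split, and the per-fold scores summarized by their mean and standard deviation.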

Discussion

The discussion highlights advances in machine learning (ML) and deep learning (DL) methods for diagnosing Alzheimer’s disease (AD), tracing the shift from traditional statistical models to algorithms that can process high-dimensional, heterogeneous datasets. ML techniques such as Support Vector Machines, logistic regression, and ensemble methods like XGBoost and CatBoost have shown promise in identifying AD because they capture complex patterns in clinical and neuroimaging data. The paper also stresses model interpretability, particularly through SHAP (Shapley Additive Explanations), which improves transparency in clinical applications by quantifying how much each feature contributes to a model’s predictions.
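As an illustration of what Shapley-based attribution computes, the sketch below evaluates exact Shapley values for a tiny hand-written risk function by enumerating every feature coalition. It does not use the SHAP library, which relies on efficient approximations and a background dataset rather than a single baseline, and the toy model, its coefficients, and the feature ordering are assumptions made purely for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values by enumerating all coalitions (small n only)."""
    n = len(instance)

    def value(coalition):
        # Features in the coalition take the instance value, the rest the baseline.
        x = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return model(x)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(x):
    """Hypothetical risk score with one interaction term; not the paper's model."""
    family_history, smoking, memory_complaints = x
    return (0.3 * family_history + 0.2 * smoking
            + 0.1 * family_history * smoking + 0.4 * memory_complaints)


instance = [1, 1, 1]  # patient with all three risk indicators present
baseline = [0, 0, 0]  # reference patient with none of them
phi = shapley_values(toy_risk, instance, baseline)
```

The attributions sum to the difference between the prediction for the instance and the prediction for the baseline, which is the additivity property that gives the method its name. The interaction term in the toy model is shared equally between the two features involved.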

The research also introduces a multi-level AD prediction framework that integrates traditional models, ensemble techniques, and DL approaches, with SHAP providing interpretability. The dataset comprises 2,149 patient records with 35 features and underwent rigorous preprocessing, including normalization and feature engineering, to improve model performance. The findings suggest that combining diverse modeling techniques with interpretability frameworks can improve diagnostic accuracy and clinical trust in AI-driven AD predictions, and ultimately support more effective patient management.