A novel deep learning model for early diabetes risk prediction using attention-enhanced deep belief networks with highly imbalanced data

Journal: International Journal of Information Technology, Volume: 17, Issue: 4
DOI: https://doi.org/10.1007/s41870-025-02459-3
Publication Date: 2025-03-04
Author(s): Olusola Olabanjo et al.
Primary Topic: Artificial Intelligence in Healthcare

Overview

This study presents an attention-enhanced Deep Belief Network (DBN) for early diabetes risk prediction. To handle highly imbalanced data, the model uses Generative Adversarial Networks (GANs) to generate synthetic samples, which improves the classification of underrepresented cases. An ensemble feature selection method identifies the key predictors in a dataset from Sylhet Diabetes Hospital that records patient symptoms and demographic information. The model achieved an area under the curve (AUC) of 1.00, an F1-score of 0.97, a precision of 0.98, and a recall of 0.95, surpassing several baseline models.

In conclusion, the study highlights the effectiveness of combining attention mechanisms, GAN-based data augmentation, and a hybrid loss function that integrates cross-entropy and focal loss to improve classification accuracy in medical diagnostics. The attention mechanism lets the model prioritize significant features, such as polyuria and polydipsia, while the hybrid loss function keeps performance high on hard-to-classify cases. Future research directions include incorporating diverse data sources, improving model interpretability through techniques like SHAP and LIME, and testing the model across varied demographic groups to improve generalizability. In addition, mobile applications built on these predictive models could provide real-time health insights and support proactive risk management for individuals.
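The idea of a hybrid loss that mixes cross-entropy with focal loss can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not give the formula, so the convex mixing weight `lam`, the focal parameters `gamma` and `alpha`, and all function names below are assumptions chosen for illustration.

```python
import math

def binary_cross_entropy(p, y, eps=1e-12):
    """Cross-entropy for one prediction p = P(y=1) and a label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss: down-weights examples the model already classifies well."""
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p              # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def hybrid_loss(probs, labels, lam=0.5, gamma=2.0, alpha=0.25):
    """Batch mean of lam * CE + (1 - lam) * focal (lam is an assumed mixing weight)."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += lam * binary_cross_entropy(p, y)
        total += (1.0 - lam) * focal_loss(p, y, gamma, alpha)
    return total / len(probs)

# A confident correct prediction costs less than a poor one.
easy = hybrid_loss([0.9], [1])
hard = hybrid_loss([0.3], [1])
```

With `lam=1.0` the function reduces to plain cross-entropy, and with `lam=0.0` to plain focal loss, so the weight controls how strongly hard examples dominate training.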

Introduction

The introduction of the paper highlights the global prevalence and severity of diabetes mellitus, which affects approximately 500 million individuals worldwide. This chronic condition is characterized by elevated blood sugar levels caused by insufficient insulin production or ineffective use of insulin by the body. The paper distinguishes between Type I diabetes, which results from autoimmune destruction of pancreatic beta cells, and Type II diabetes, in which the body either produces inadequate insulin or develops resistance to it. Gestational diabetes is noted as a less common form that occurs during pregnancy. The introduction underscores the importance of early detection, because diabetes that goes undiagnosed for a long time can lead to severe complications, including cardiovascular disease and organ failure, with populations in low- and middle-income countries particularly affected.

The authors emphasize the role of machine learning in enhancing diabetes risk prediction and diagnosis, noting a shift towards unsupervised learning approaches, specifically through the use of deep belief networks. They review various existing risk assessment tools and machine learning models, highlighting their effectiveness and the challenges in quantifying their accuracy due to the lack of comprehensive data. The introduction sets the stage for the subsequent sections of the paper, which will detail the methodology, results, and implications of their proposed model for early diabetes risk detection, aiming to contribute significantly to medical practice and patient outcomes.

Methods

The “Methods” section of the research paper outlines the experimental design and analytical techniques employed to investigate the research questions. The study utilized a quantitative approach, incorporating statistical analyses to evaluate the data collected from various experiments. Specific methodologies included controlled laboratory experiments, where variables were systematically manipulated to observe their effects on the outcomes of interest.

Data collection involved the use of standardized instruments and protocols to ensure reliability and validity. The analysis was conducted using appropriate statistical software, applying techniques such as regression analysis and ANOVA to assess the significance of the results. The section emphasizes the importance of replicability and robustness in the experimental design, detailing the sample size calculations and the criteria for participant selection. Overall, the methods employed were designed to rigorously test the hypotheses and provide clear insights into the research questions posed.

Results

The study developed a voting ensemble feature selection method that combines Chi-Square (CS), Mutual Information Gain (MIG), and Variance Threshold (VT) techniques to identify key predictors of diabetes mellitus. The top ten selected features were then used to pretrain a Deep Belief Network (DBN) model. The analysis excluded age, itching, and obesity as potential early predictors, with obesity ranked lowest among the features assessed. The features retained for further modeling were sex, polyuria, polydipsia, sudden weight loss, weakness, polyphagia, genital thrush, visual blurring, irritability, delayed healing, partial paresis, muscle stiffness, and alopecia.
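A voting scheme over the three criteria named above might look like the following sketch for binary (yes/no) symptom features. The summary does not state the paper's voting rule or thresholds, so the cut-offs (`chi2_min`, `mi_min`, `var_min`), the `min_votes` rule, and the toy data are all assumptions; the value 3.84 is the standard chi-square critical value at p = 0.05 with one degree of freedom.

```python
import math

def chi_square(x, y):
    """Pearson chi-square statistic for a binary feature x against binary labels y."""
    n = len(x)
    stat = 0.0
    for xv in (0, 1):
        for yv in (0, 1):
            observed = sum(1 for a, b in zip(x, y) if a == xv and b == yv)
            expected = x.count(xv) * y.count(yv) / n
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat

def mutual_information(x, y):
    """Mutual information (in nats) between a binary feature and binary labels."""
    n = len(x)
    mi = 0.0
    for xv in (0, 1):
        for yv in (0, 1):
            joint = sum(1 for a, b in zip(x, y) if a == xv and b == yv) / n
            if joint > 0:
                mi += joint * math.log(joint / ((x.count(xv) / n) * (y.count(yv) / n)))
    return mi

def variance(x):
    """Population variance of a feature column."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / len(x)

def vote_select(features, y, chi2_min=3.84, mi_min=0.02, var_min=0.05, min_votes=2):
    """Keep features that pass at least `min_votes` of the three criteria."""
    selected = []
    for name, x in features.items():
        votes = ((chi_square(x, y) >= chi2_min)
                 + (mutual_information(x, y) >= mi_min)
                 + (variance(x) >= var_min))
        if votes >= min_votes:
            selected.append(name)
    return selected

# Toy data (hypothetical): 1 = symptom present / diabetic, 0 = absent / healthy.
y = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "polyuria": [1, 1, 1, 1, 0, 0, 0, 0],  # perfectly aligned with the label
    "noise":    [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the label
    "constant": [0, 0, 0, 0, 0, 0, 0, 0],  # zero variance
}
kept = vote_select(features, y)
```

On this toy input only the feature aligned with the label collects enough votes. Raising `min_votes` to 3 would give a stricter selection.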

The performance of classical machine learning models (KNN, Linear SVM, Logistic Regression, Decision Trees, and Random Forests) was evaluated under three conditions: using all features from the original dataset, using all qualified features, and using only strongly qualified features. The results of these experiments, presented in Table 5 and Figure 6, show how effective the feature selection process is and how the models compare, highlighting the importance of feature selection for predictive accuracy in diabetes mellitus.

Discussion

The discussion section of the research paper highlights the advancements in diabetes risk prediction through the integration of machine learning and deep learning techniques. A comprehensive review of recent studies reveals that models like Random Forest (RF) and XGBoost have shown high accuracy and recall, particularly when utilizing extensive datasets. However, challenges such as dataset imbalance and feature relevance persist, leading to suboptimal performance in minority class predictions. The authors propose a novel approach that combines Generative Adversarial Networks (GANs) for synthetic data generation, an attention mechanism for prioritizing critical features, and a hybrid loss function that merges cross-entropy with focal loss. This multifaceted strategy aims to enhance model robustness and accuracy, particularly in detecting early-stage diabetes.
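To make the notion of an attention mechanism that prioritizes critical features concrete, here is a generic feature-level softmax gate. The summary does not describe the paper's actual attention formulation, so this is only one common way to do it; the `scores` stand in for parameters that a real model would learn, and the function names are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax: turns raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(features, scores):
    """Re-weight a feature vector by attention weights derived from the scores."""
    weights = softmax(scores)
    weighted = [w * f for w, f in zip(weights, features)]
    return weighted, weights

# Hypothetical example: three symptom features, the first two scored as more important.
features = [1.0, 1.0, 1.0]
scores = [2.0, 2.0, 0.0]  # assumed stand-ins for learned attention scores
weighted, weights = attend(features, scores)
```

Features with higher scores keep more of their magnitude after weighting, which is how such a gate lets downstream layers focus on the more informative inputs.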

The proposed Deep Belief Network (DBN) architecture, enhanced by these techniques, demonstrates significant improvements over traditional models. Experimental results indicate that the DBN achieves a precision of 0.98, recall of 0.95, F1-score of 0.97, and an AUC of 1.00, underscoring its effectiveness in addressing class imbalance and improving predictive accuracy. The systematic enhancements, including attention weighting and data augmentation, allow the model to focus on relevant features while mitigating the impact of irrelevant data. Overall, the findings suggest that the proposed model not only surpasses traditional methods but also offers a robust framework for early diabetes detection, paving the way for more effective clinical applications.
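As a quick consistency check on the reported metrics, the F1-score is the harmonic mean of precision and recall. The short sketch below recomputes it from the rounded values quoted above. The function name is illustrative, and the inputs are only the two-decimal figures from this summary.

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Rounded values reported in the summary.
f1 = f1_score(0.98, 0.95)
```

The result is approximately 0.965, which agrees with the reported F1-score of 0.97 once rounding of the precision and recall inputs is taken into account.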