A hybrid deep learning framework for early detection of diabetic retinopathy using retinal fundus images

Journal: Scientific Reports, Volume: 15, Issue: 1
DOI: https://doi.org/10.1038/s41598-025-99309-w
PMID: https://pubmed.ncbi.nlm.nih.gov/40307328
Publication Date: 2025-04-30
Author(s): Mishmala Sushith et al.
Primary Topic: Retinal Imaging and Analysis

Overview

The study presents a Temporal Aware Hybrid Deep Learning (TAHDL) model for the early detection and monitoring of diabetic retinopathy (DR) from retinal fundus images. The framework combines Convolutional Neural Networks (CNNs) for spatial feature extraction with Recurrent Neural Networks (RNNs) and an attention mechanism that captures temporal dependencies across multiple retinal scans. By exploiting the sequential nature of disease progression, the model is designed to pick up subtle changes indicative of early-stage DR and thereby improve detection accuracy.
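To make the role of the attention mechanism concrete, the following is a minimal sketch of attention pooling over a sequence of per-scan feature vectors. It is an illustration only, not the authors' implementation: the function names, the dot-product scoring rule, the `query` vector, and the toy feature values are all assumptions introduced here. In a real model the feature vectors would come from the CNN and RNN layers and the scoring parameters would be learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Combine a sequence of feature vectors into one context vector.

    hidden_states: list of equal-length feature vectors, one per scan.
    query: hypothetical scoring vector (an assumption, not from the paper).
    Returns (weights, context), where context is the weighted sum.
    """
    # Dot-product score of each time step against the query vector.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Weighted sum across time steps, computed per feature dimension.
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return weights, context


# Three visits for one patient, each summarised as a 2-D feature vector (toy values).
visits = [[0.1, 0.0], [0.2, 0.1], [0.9, 0.8]]
query = [1.0, 1.0]
weights, context = attention_pool(visits, query)
```

In this toy input the third visit has the largest score, so it receives the largest weight and dominates the pooled representation. This is the sense in which attention lets a temporal model emphasize the scans that matter most for the prediction.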

The TAHDL model was evaluated on benchmark datasets, including DRIVE and Kaggle, where it reached accuracies of 97.5% and 94.04%, respectively. It outperformed baseline deep learning architectures such as CNN, RNN, InceptionV3, VGG19, and LSTM, with higher sensitivity and specificity. The findings suggest that incorporating temporal information improves the analysis of retinal images and offers a framework for future research in automated DR detection and in medical image analysis more broadly. Future work may involve validating the model on more diverse datasets to assess how well it generalizes.

Results

The results evaluate the proposed Temporal Aware Hybrid Deep Learning (TAHDL) model through simulation analyses on benchmark datasets, specifically DRIVE and Kaggle Diabetic Retinopathy. DRIVE contains high-resolution retinal fundus images annotated for blood vessel segmentation, which makes it useful for identifying the microvascular changes indicative of diabetic retinopathy. The Kaggle dataset comprises over 88,000 retinal images across five severity grades and introduces the variability needed to develop robust models that can handle real-world artifacts such as image blur and exposure inconsistencies. The EyePACS dataset is also included in training, contributing high-quality images with expert annotations, with the aim of supporting demographic fairness in performance evaluation.

The experimental setup involved preprocessing steps: resizing images to $224 \times 224 \times 3$, normalizing pixel values, and applying Contrast Limited Adaptive Histogram Equalization (CLAHE) for contrast enhancement. Data augmentation increased the training sample size from 40 to 200. The TAHDL architecture featured convolutional layers with ReLU activation, multi-scale convolutional paths, and LSTM layers with an attention mechanism to capture temporal dependencies. Training ran for 50 epochs with a batch size of 32, using the Adam optimizer with an initial learning rate of 0.001 and regularization techniques to prevent overfitting. Accuracy, precision, recall, F1-score, and specificity were calculated to assess classification performance, and the model was compared against baseline models.
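For reference, the listed metrics can all be derived from confusion-matrix counts. The sketch below assumes a binary reduction of the task (DR versus no DR) and uses toy counts chosen for illustration; neither the function name nor the numbers come from the study, which reports results across severity grades.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard classification metrics from confusion-matrix counts.

    tp/fp/tn/fn: true positives, false positives, true negatives, false negatives.
    Ratios with a zero denominator are reported as 0.0.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # also called sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Toy counts for illustration only (not results from the paper).
m = binary_metrics(tp=45, fp=5, tn=40, fn=10)
```

With these toy counts, accuracy is 0.85 and precision is 0.90, while recall (about 0.82) is lower because 10 positive cases were missed. Reporting specificity and recall alongside accuracy matters for screening tasks, where missed disease and false alarms carry different costs.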

Discussion

The discussion reviews recent deep learning (DL) approaches to diabetic retinopathy (DR) detection. Hybrid models that combine architectures such as GoogleNet and ResNet, among others, have shown promising results, with detection accuracies ranging from 82% to 94.3% across different datasets. However, these models often face challenges in interpretability and computational efficiency, and their accuracy in multiclass classification of DR severity still needs improvement. Existing approaches focus primarily on spatial features derived from static retinal images and neglect the temporal dynamics that are crucial for understanding disease progression.

The proposed Temporal Aware Hybrid Deep Learning (TAHDL) model aims to address these limitations by integrating Convolutional Neural Networks (CNNs) for spatial feature extraction with Recurrent Neural Networks (RNNs) for capturing temporal dependencies. This combination is intended both to improve the detection of early-stage DR and to support monitoring of disease progression over time. The TAHDL model achieves accuracies of 97.5% and 94.04% on the benchmark datasets, which supports the value of combining spatial and temporal analyses. The model also incorporates advanced preprocessing techniques and adaptive learning strategies, which distinguish it from traditional methods and may improve its applicability in clinical settings.