A deep learning based model for diabetic retinopathy grading

Journal: Scientific Reports, Volume: 15, Issue: 1
DOI: https://doi.org/10.1038/s41598-025-87171-9
PMID: https://pubmed.ncbi.nlm.nih.gov/39885230
Publication Date: 2025-01-30
Author(s): Samia Akhtar et al.
Primary Topic: Retinal Imaging and Analysis

Overview

The study presents RSG-Net, a convolutional neural network for automated detection and grading of diabetic retinopathy (DR) into either four or two severity grades. It addresses the limitations of manual examination and of existing detection methods that rely on handcrafted features, which can limit adaptability and accuracy. Using the Messidor-1 dataset, the authors apply preprocessing, including histogram equalization, and data augmentation (flipping, rotation, zooming, and color adjustments) to improve image quality and counter class imbalance. The model reached a reported test accuracy of 99.36% for four-grade classification and 99.37% for binary classification, which the authors present as superior to existing methods.
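The role of geometric augmentation in countering class imbalance can be illustrated with a minimal sketch. It assumes images are plain nested lists of pixel values and that under-represented grades are topped up with flipped and rotated copies until every grade matches the largest one. The helper names (`hflip`, `vflip`, `rot90`, `oversample`) are hypothetical, and the source does not describe the authors' augmentation pipeline at this level of detail.

```python
from itertools import cycle


def hflip(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def vflip(img):
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]


def rot90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


# Hypothetical transform set; zoom and color changes are omitted for brevity.
TRANSFORMS = (hflip, vflip, rot90)


def oversample(images_by_grade):
    """Add augmented copies until every grade has as many images as the largest."""
    target = max(len(images) for images in images_by_grade.values())
    balanced = {}
    for grade, images in images_by_grade.items():
        if not images:
            raise ValueError(f"grade {grade} has no images to augment")
        out = list(images)
        sources = cycle(images)
        transforms = cycle(TRANSFORMS)
        while len(out) < target:
            out.append(next(transforms)(next(sources)))
        balanced[grade] = out
    return balanced


if __name__ == "__main__":
    tiny = [[1, 2], [3, 4]]
    counts = {g: len(v) for g, v in oversample({0: [tiny] * 3, 1: [tiny]}).items()}
    print(counts)
```

With very small classes this simple cycling can emit repeated augmented images, so a real pipeline would usually draw random transform parameters instead.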

The findings indicate that RSG-Net not only excels in classification accuracy but also effectively manages the complexities associated with multi-class classification of DR stages. Despite its high performance, the study acknowledges potential overfitting and the need for further regularization techniques. Future work will focus on validating RSG-Net across diverse datasets, incorporating pre-trained models and ensemble techniques, refining augmentation strategies, and enhancing analytical rigor through statistical hypothesis testing. These efforts aim to solidify the model’s reliability and applicability in clinical settings for timely diabetic retinopathy detection.

Methods

The methodology section of this study presents a framework for grading diabetic retinopathy (DR) using a convolutional neural network (CNN) called RSG-Net, which performs both multi-class (four classes) and binary (two classes) classification tasks. The framework allows for precise treatment planning through multi-class classification and large-scale screening via binary classification. Both tasks were executed independently within the same architecture, leveraging the strengths of RSG-Net while maintaining high accuracy. The model was trained and tested on a benchmark dataset, with the training process conducted using Python on Kaggle Notebooks, which provided substantial computational resources.

The experimental setup involved hyperparameter tuning, with categorical cross-entropy as the loss for the multi-class task and binary cross-entropy for the binary task. The model reached a training accuracy of 99.96% for four-class classification and 100% for two-class classification. On the test set, RSG-Net achieved accuracies of 99.36% and 99.37%, respectively, with high sensitivity and specificity. To mitigate overfitting, training used the EarlyStopping and ReduceLROnPlateau callbacks together with dropout and batch normalization layers. Some misclassifications remained, particularly between adjacent grades, which points to future refinement of classification thresholds and dataset balancing. Overall, the reported metrics indicate that RSG-Net grades diabetic retinopathy accurately on this dataset.
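For readers unfamiliar with the two loss functions, the following sketch computes each for a single sample using the standard textbook definitions. The function names and the `eps` clipping constant are illustrative choices, not details taken from the study.

```python
import math


def categorical_cross_entropy(one_hot, probs, eps=1e-12):
    """Loss for one sample: minus the log-probability of the true class."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))


def binary_cross_entropy(label, prob, eps=1e-12):
    """Loss for one sample with a single predicted probability of class 1."""
    p = min(max(prob, eps), 1 - eps)  # clip to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))


if __name__ == "__main__":
    # Four-grade example: the true grade is index 2, predicted with probability 0.7.
    print(categorical_cross_entropy([0, 0, 1, 0], [0.1, 0.1, 0.7, 0.1]))
    # Binary example: the true label is 1, predicted with probability 0.9.
    print(binary_cross_entropy(1, 0.9))
```

With two classes the categorical form reduces to the binary form, which is why the same architecture can serve both tasks with only the output layer and loss changed.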

Results

After preprocessing and augmentation, the dataset for the four-grade classification task contained 8304 images. It was divided into training, validation, and testing sets in a nominal 70:10:20 ratio, giving 5978 images for training, 665 for validation, and 1661 for testing. These counts correspond to roughly 72%, 8%, and 20% of the total rather than exactly 70%, 10%, and 20%. The distribution of images across grades within these sets is detailed in Table 5.

For the two-grade classification task, the dataset comprised 4800 images: 3456 for training, 384 for validation, and 960 for testing, again under the nominal 70:10:20 split (72%, 8%, and 20% by count). This distribution is presented in Table 6. The same allocation scheme across both classification tasks supports a consistent training and evaluation process.
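The reported counts can be checked with a few lines of arithmetic. The sketch below assumes a two-stage holdout, which the source does not confirm: 20% of the images are set aside for testing, then 10% of the remainder for validation, with each held-out set rounded up. Under that assumption the numbers match the reported counts for both tasks, and the resulting shares are about 72%, 8%, and 20%.

```python
def ceil_percent(n, percent):
    """Return ceil(n * percent / 100) using integer arithmetic only."""
    return -(-n * percent // 100)


def two_stage_split(total, test_percent=20, val_percent=10):
    """Hold out the test set first, then the validation set from what remains."""
    test = ceil_percent(total, test_percent)
    remainder = total - test
    val = ceil_percent(remainder, val_percent)
    train = remainder - val
    return train, val, test


if __name__ == "__main__":
    for total in (8304, 4800):
        parts = two_stage_split(total)
        shares = [round(100 * part / total, 1) for part in parts]
        print(total, parts, shares)
```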

Discussion

In the discussion, the authors review advances in diabetic retinopathy (DR) detection and highlight several studies that have contributed to the field. They summarize the use of convolutional neural networks (CNNs) and deep learning frameworks, noting improvements in classification accuracy and sensitivity across multiple studies. For instance, one study achieved an accuracy of 89%, a sensitivity of 89%, and a specificity of 97.3% by integrating CNN models for DR stage classification and lesion localization. Another study, which used transfer learning with DenseNet201, reached a test accuracy of 82.7% and an AUC of 94.1%. The authors also discuss ensemble learning and hybrid models, which have shown promising results, including accuracies exceeding 94%.
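Sensitivity and specificity in a multi-grade setting are usually computed one grade at a time, treating that grade as positive and all others as negative. The sketch below does this from a confusion matrix whose rows are true grades and whose columns are predicted grades. The matrix values are invented purely for illustration and do not come from this or any cited study.

```python
def per_class_metrics(cm):
    """Return (sensitivity, specificity) per class, one-vs-rest.

    cm[i][j] counts samples of true class i predicted as class j.
    Assumes every class has at least one positive and one negative sample.
    """
    total = sum(sum(row) for row in cm)
    metrics = []
    for k in range(len(cm)):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp
        fp = sum(row[k] for row in cm) - tp
        tn = total - tp - fn - fp
        metrics.append((tp / (tp + fn), tn / (tn + fp)))
    return metrics


if __name__ == "__main__":
    # Invented four-grade confusion matrix; errors fall between adjacent grades.
    cm = [
        [50, 0, 0, 0],
        [2, 45, 3, 0],
        [0, 4, 44, 2],
        [0, 0, 1, 49],
    ]
    for grade, (sens, spec) in enumerate(per_class_metrics(cm)):
        print(f"grade {grade}: sensitivity={sens:.3f} specificity={spec:.3f}")
```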

The section emphasizes the importance of preprocessing techniques in enhancing image quality for better model performance. Techniques such as image cropping, denoising using Gaussian blur, histogram equalization, and resizing are detailed as critical steps that improve the clarity and reliability of input data. Furthermore, the authors address the challenge of class imbalance in datasets, particularly in the Messidor-1 dataset, and advocate for data augmentation strategies to bolster the representation of minority classes. By employing geometric and photometric augmentation techniques, they successfully increased the dataset size, thereby enhancing model robustness and generalization capabilities. Overall, the discussion underscores the continuous evolution of DR detection methodologies and the potential for future research to further refine these approaches.
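As an illustration of the histogram equalization step, here is a minimal sketch of global equalization on an 8-bit grayscale image stored as a nested list. It is the textbook variant based on the cumulative histogram. Whether the authors applied it per channel, or used an adaptive variant, is not stated in this summary, so the function below is an assumption rather than their implementation.

```python
def equalize_histogram(img, levels=256):
    """Spread pixel intensities over the full range using the cumulative histogram.

    Assumes a non-empty image with integer values in [0, levels - 1].
    """
    hist = [0] * levels
    for row in img:
        for value in row:
            hist[value] += 1

    # Cumulative distribution of intensities.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    total = cdf[-1]
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        return [row[:] for row in img]  # constant image, nothing to stretch

    denom = total - cdf_min
    lut = [round(max(c - cdf_min, 0) * (levels - 1) / denom) for c in cdf]
    return [[lut[value] for value in row] for row in img]


if __name__ == "__main__":
    low_contrast = [[10, 20], [30, 40]]
    print(equalize_histogram(low_contrast))
```

In this toy input the four distinct intensities between 10 and 40 are mapped to evenly spaced values across 0 to 255, which is the contrast stretch the preprocessing step relies on.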