A brain tumor segmentation enhancement in MRI images using U-Net and transfer learning

Journal: BMC Medical Imaging, Volume: 25, Issue: 1
DOI: https://doi.org/10.1186/s12880-025-01837-4
PMID: https://pubmed.ncbi.nlm.nih.gov/40745592
Publication Date: 2025-07-31
Author(s): Amin Pourmahboubi et al.
Primary Topic: Brain Tumor Detection and Classification

Overview

This paper addresses automated brain tumor segmentation from MRI scans, which matters clinically for defining tumor boundaries, monitoring disease progression, and planning treatment. Manual segmentation by radiologists remains the gold standard, but it is time-consuming and subject to variability, which motivates reliable automated methods. Tumors vary in size, shape, and texture, which makes accurate segmentation difficult, particularly when distinguishing tumor from healthy tissue.

The proposed U-Net model, which uses a VGG-19 backbone and a customized Focal Tversky loss function, improves segmentation accuracy over traditional models and other U-Net variants. It achieved an AUC of 0.9957, an F1-score of 0.9679, a Dice coefficient of 0.9679, a precision of 0.9541, a recall of 0.9821, and an IoU of 0.9378. Future research aims to extend the model by incorporating multimodal MRI scans, expanding the dataset to cover more diverse tumor types and sizes, and addressing generalizability and robustness. These advances could lead to more precise and efficient diagnostic tools and, ultimately, better patient outcomes.
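The reported figures can be cross-checked against two standard identities for binary masks: the F1-score is the harmonic mean of precision and recall (and coincides with the Dice coefficient), and IoU = Dice / (2 - Dice). The sketch below applies them to the numbers quoted above. It assumes the metrics were computed on the same pooled pixel counts; if the paper averaged them per image, the identities would hold only approximately. The function names are illustrative, not taken from the paper.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


def iou_from_dice(dice: float) -> float:
    """For one pair of binary masks, IoU = Dice / (2 - Dice)."""
    return dice / (2.0 - dice)


# Values quoted in the overview.
precision, recall = 0.9541, 0.9821
dice_reported, iou_reported = 0.9679, 0.9378

f1 = f1_from_precision_recall(precision, recall)
iou = iou_from_dice(dice_reported)

print(f"F1 from precision/recall: {f1:.4f} (reported F1/Dice: {dice_reported})")
print(f"IoU from Dice:            {iou:.4f} (reported IoU: {iou_reported})")
```

Under that assumption, the quoted precision, recall, Dice, and IoU values agree with each other to four decimal places.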

Introduction

In the introduction, the authors describe brain tumors as critical neurological conditions associated with high morbidity and mortality. They stress that early and accurate diagnosis is essential for effective treatment and identify Magnetic Resonance Imaging (MRI) as the primary method for brain tumor detection. Traditional segmentation techniques, such as thresholding and edge detection, lack consistency and adaptability because they rely on handcrafted features and fixed parameters, which makes them sensitive to noise and to variations in image quality.

To overcome these challenges, the authors propose a brain tumor segmentation model that integrates U-Net with VGG-19 as the encoder backbone. The approach builds on the strengths of deep learning, particularly Convolutional Neural Networks (CNNs), which excel at automated feature extraction and pattern recognition. U-Net’s symmetrical encoder-decoder structure with skip connections is designed to capture both high-level semantic features and fine details, which makes it suitable for complex anatomical structures. By using VGG-19, known for its depth and its ability to extract detailed features, the authors aim to improve U-Net’s performance in segmenting brain tumors, particularly in regions with complex boundaries. The model is evaluated on the TCGA lower-grade glioma dataset using metrics that include the Dice Similarity Coefficient (DSC) and Intersection over Union (IoU), and the authors report superior accuracy in tumor boundary delineation and generalization across diverse MRI conditions.
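To make the encoder-decoder layout concrete, the following sketch traces feature-map shapes through a U-Net whose encoder follows the five convolutional stages of VGG-19 (64, 128, 256, 512, and 512 channels). It only does shape arithmetic and builds no network. The 256 x 256 input size, the decoder channel widths, and the choice of the fifth stage as the bottleneck are assumptions made for illustration; this summary does not state the paper's exact configuration.

```python
# (conv layers, output channels) for the five VGG-19 convolutional stages.
VGG19_STAGES = [(2, 64), (2, 128), (4, 256), (4, 512), (4, 512)]


def trace_unet_vgg19(input_size=256, decoder_channels=(512, 256, 128, 64)):
    """Return (encoder, decoder) lists of (name, spatial size, channels).

    input_size and decoder_channels are illustrative assumptions.
    """
    encoder = []
    size = input_size
    for index, (_, channels) in enumerate(VGG19_STAGES, start=1):
        if index > 1:
            size //= 2  # 2x2 max pooling halves the resolution between stages
        encoder.append((f"encoder stage {index}", size, channels))

    *skips, bottleneck = encoder  # deepest stage acts as the bottleneck
    decoder = []
    size = bottleneck[1]
    for (_, skip_size, skip_channels), out_channels in zip(
        reversed(skips), decoder_channels
    ):
        size *= 2  # upsample by a factor of 2
        assert size == skip_size, "skip connection and upsampled map must align"
        decoder.append((f"decoder + skip ({skip_channels} ch)", size, out_channels))
    decoder.append(("1x1 conv + sigmoid", size, 1))  # per-pixel tumor probability
    return encoder, decoder


encoder, decoder = trace_unet_vgg19()
for name, size, channels in encoder + decoder:
    print(f"{name:28s} {size:4d} x {size:<4d} x {channels}")
```

Each decoder step doubles the resolution and concatenates the encoder map of matching size. Those concatenations are the skip connections, and they let fine boundary detail bypass the low-resolution bottleneck.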

Methods

The methodology section describes how the deep learning model for brain tumor segmentation from MRI scans was developed and assessed. It details the data acquisition process, the preprocessing techniques, and the architecture of the proposed U-Net model, which uses VGG-19 as a pre-trained encoder. The section emphasizes the importance of a clear and reproducible framework to support progress in medical image segmentation.

For the experimental setup, the study used an NVIDIA RTX 3050 GPU with 4 GB of VRAM, an eight-core AMD Ryzen 7 4800H CPU, and 32 GB of RAM. Key hyperparameters and training protocols for the various models are summarized in Table 2, which documents the training procedures and evaluation metrics used in the research.
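As a rough sense of scale for the 4 GB VRAM budget, the sketch below counts the weights in the convolutional part of VGG-19 (sixteen 3 x 3 convolutions, no fully connected layers) and converts that count to float32 storage. It assumes a three-channel input and covers the encoder only. Decoder weights, activations, gradients, and optimizer state usually dominate training memory and are not estimated here, and the batch size and input resolution are not given in this summary.

```python
# Output channels of the sixteen 3x3 convolutions in VGG-19, in order.
VGG19_CONV_CHANNELS = [
    64, 64,
    128, 128,
    256, 256, 256, 256,
    512, 512, 512, 512,
    512, 512, 512, 512,
]


def conv_params(in_channels: int, out_channels: int, kernel: int = 3) -> int:
    """Weights plus biases of one square convolution layer."""
    return kernel * kernel * in_channels * out_channels + out_channels


def vgg19_encoder_params(in_channels: int = 3) -> int:
    """Total parameters of the VGG-19 convolutional stack (no dense layers)."""
    total = 0
    for out_channels in VGG19_CONV_CHANNELS:
        total += conv_params(in_channels, out_channels)
        in_channels = out_channels
    return total


params = vgg19_encoder_params()
print(f"encoder parameters: {params:,}")
print(f"float32 weights:    {params * 4 / 2**20:.1f} MiB")
```

The encoder weights alone are small next to 4 GB, so under these assumptions the practical limits on such a GPU are more likely to come from activation memory, which grows with batch size and image resolution.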

Results

In this section, the authors evaluate their brain MRI tumor segmentation model against seven other models that use various pre-trained convolutional neural networks (CNNs) as backbones. Table 3 summarizes the quantitative metrics for each model. Bar charts (Figure 5) and radar charts (Figure 6) compare the models across six evaluation metrics and show the robustness of the VGG-19 model.

The authors used six metrics to assess their model’s performance: Dice coefficient, Intersection over Union (IoU), precision, recall, F1-score, and Area Under the Curve (AUC). Figures 9 to 14 show how these metrics progress over training epochs, highlighting trends such as the rapid initial improvement in performance and fluctuations in validation precision, which may be attributable to the limited size of the validation dataset or to class imbalance. Overall, the results indicate that the proposed model has strong segmentation capabilities, which may improve diagnostic accuracy in clinical applications.
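The six metrics can be illustrated on a toy example. The sketch below computes them from a flat list of predicted tumor probabilities and a binary ground-truth mask. It uses the standard textbook definitions rather than the paper's evaluation code, and the 0.5 threshold and the sample values are assumptions. AUC is computed as the fraction of (tumor, background) pixel pairs in which the tumor pixel receives the higher score, a pairwise form that is quadratic in the number of pixels and so only suitable for small examples.

```python
def roc_auc(scores, labels):
    """Probability that a random positive outscores a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def segmentation_metrics(probs, truth, threshold=0.5):
    """Dice, IoU, precision, recall, F1 and AUC for one flattened mask.

    Assumes at least one predicted and one true tumor pixel.
    """
    pred = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "dice": 2 * tp / (2 * tp + fp + fn),
        "iou": tp / (tp + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "auc": roc_auc(probs, truth),
    }


# Six hypothetical pixels: predicted tumor probability and ground truth.
probs = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1]
truth = [1, 1, 0, 1, 0, 0]
metrics = segmentation_metrics(probs, truth)
for name, value in metrics.items():
    print(f"{name:9s} {value:.4f}")
```

With so few pixels, flipping a single prediction shifts precision substantially. The same effect on a small or imbalanced validation set is one plausible reading of the fluctuations noted above.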

Discussion

The discussion section emphasizes the need for better accuracy and reliability in automated brain tumor segmentation, particularly through advanced deep learning techniques. The authors highlight the limitations of existing U-Net architectures in capturing complex features of brain tumors, such as irregular shapes and varying textures. To address these challenges, they propose a segmentation model that integrates a VGG-19 encoder within the U-Net framework and uses transfer learning and the Focal Tversky loss function to improve segmentation performance. The approach aims to improve the model’s ability to recognize detailed tumor characteristics and thereby to outperform traditional U-Net models and other pre-trained variants.
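A minimal sketch of a Focal Tversky loss follows, for readers unfamiliar with it. The Tversky index generalizes Dice by weighting false negatives and false positives separately, and the focal exponent reshapes the loss so that poorly segmented examples contribute more. The values alpha = 0.7, beta = 0.3, and gamma = 0.75 are placeholders seen in common implementations, not the paper's customized settings, which this summary does not give. Exponent conventions also differ between implementations (some raise to 1/gamma).

```python
def tversky_index(probs, truth, alpha=0.7, beta=0.3, smooth=1e-6):
    """Soft Tversky index; alpha weights false negatives, beta false positives.

    alpha, beta and smooth are illustrative defaults, not the paper's values.
    """
    tp = sum(p * t for p, t in zip(probs, truth))
    fn = sum((1 - p) * t for p, t in zip(probs, truth))
    fp = sum(p * (1 - t) for p, t in zip(probs, truth))
    return (tp + smooth) / (tp + alpha * fn + beta * fp + smooth)


def focal_tversky_loss(probs, truth, alpha=0.7, beta=0.3, gamma=0.75, smooth=1e-6):
    """(1 - Tversky index) raised to gamma; gamma here is an assumed value."""
    ti = tversky_index(probs, truth, alpha, beta, smooth)
    return max(0.0, 1.0 - ti) ** gamma


# Hypothetical predictions for four pixels, two of which are tumor.
truth = [1, 1, 0, 0]
good = [0.9, 0.8, 0.1, 0.2]
poor = [0.6, 0.2, 0.4, 0.1]
print(f"loss (good prediction): {focal_tversky_loss(good, truth):.4f}")
print(f"loss (poor prediction): {focal_tversky_loss(poor, truth):.4f}")
```

With alpha equal to beta at 0.5, the Tversky index reduces to the Dice coefficient. Choosing alpha larger than beta penalizes missed tumor pixels more heavily than false alarms, which is a common motivation for this loss on class-imbalanced segmentation tasks.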

The authors assert that their methodology not only fills existing research gaps but also establishes a high standard for segmentation accuracy in the field. They provide a comprehensive evaluation of their model on the TCGA lower-grade glioma dataset, demonstrating significant improvements in segmentation metrics such as the Dice coefficient and AUC. The paper outlines the advantages of using pre-trained networks and discusses the implications for clinical applicability, suggesting that their model can support clinicians in diagnosis and treatment planning. Overall, the contributions of this research include the introduction of a VGG-19-based U-Net architecture, a thorough evaluation of its performance, and insights into future directions for medical image analysis.

Limitations

The proposed U-Net model with transfer learning shows promising results for brain tumor segmentation, but it has limitations. Training and testing were conducted on a dataset that may not cover the full spectrum of tumor types, sizes, and patient demographics encountered in clinical practice. The model also relied solely on two-dimensional MRI slices, which could compromise the spatial accuracy of the segmentation. Finally, the exclusive use of single-modality MRI images forgoes the improvements that multimodal imaging might provide.

To enhance the model’s applicability in real-world scenarios, future research should address these limitations. Specifically, the incorporation of the BraTS dataset, including BraTS 2020 and 2021, is planned to facilitate the development of a three-dimensional model, which may yield more robust segmentation results.