Brain tumor detection and classification in MRI using hybrid ViT and GRU model with explainable AI in Southern Bangladesh

Journal: Scientific Reports, Volume: 14, Issue: 1
DOI: https://doi.org/10.1038/s41598-024-71893-3
PMID: https://pubmed.ncbi.nlm.nih.gov/39354009
Publication Date: 2024-10-01
Author(s): Md. Mahfuz Ahmed et al.
Primary Topic: Brain Tumor Detection and Classification

Overview

This study presents a hybrid model that combines Vision Transformer (ViT) and Gated Recurrent Unit (GRU) architectures to improve the detection and classification of brain tumors in Magnetic Resonance Imaging (MRI) scans. Using primary MRI data from Bangabandhu Sheikh Mujib Medical College Hospital, the ViT-GRU model extracts critical features and identifies their interrelationships, addresses class imbalance, and outperforms existing diagnostic methods. With the AdamW optimizer, the model reached an accuracy of 98.97% under 10-fold cross-validation, surpassing other transfer learning models by 1.26%. It also reached 96.08% accuracy on a Kaggle dataset, which indicates that it is robust across different datasets.
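The 10-fold cross-validation mentioned above can be illustrated with a small standard-library sketch. It is not the authors' code: the function name `stratified_k_fold`, the label list, and the decision to stratify by class are assumptions made here for illustration. Stratification is one common way to keep class proportions stable across folds when classes are imbalanced.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs whose test folds keep class proportions roughly equal."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin so every fold gets a share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


# Hypothetical imbalanced label list (not data from the paper).
labels = ["tumor"] * 80 + ["no_tumor"] * 20
splits = list(stratified_k_fold(labels, k=10))
```

With this toy label list, every test fold holds 10 samples (8 `tumor`, 2 `no_tumor`), and each sample appears in exactly one test fold. The reported accuracy would then be the mean over the 10 folds.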

The study integrates Explainable Artificial Intelligence (XAI) techniques, such as attention maps, SHAP, and LIME, to make the model’s predictions more interpretable and to foster trust among clinicians. The findings underscore the potential of hybrid models to improve diagnostic accuracy by leveraging both spatial and temporal data features. Future research directions include expanding the dataset to improve generalizability, integrating the model into real-time diagnostic devices, and applying it to other medical imaging tasks to validate its versatility and robustness in clinical settings.

Methods

The study follows a systematic, step-by-step methodology, which is illustrated in Figure 1. The figure gives an overview of the methods used throughout the research process. The methodology adheres to established scientific standards, which supports the robustness and reliability of the findings.

Results

In this section, the authors present the results of three experiments conducted to evaluate the proposed model on the BrTMHD-2023 dataset and an additional Brain Tumor Kaggle dataset. In Experiment 1, which used 10-fold cross-validation, the model achieved an average accuracy of 98.97%, with a precision of 97% and an average recall of 97%. The F1-score also reached 97%, indicating a strong balance between precision and recall. A comparison with various pre-trained transfer learning models further confirmed the robustness of the proposed model, as illustrated by training accuracy and loss curves, AUC-ROC curves, and confusion matrices.
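As a reminder of how the reported metrics relate to a confusion matrix, here is a minimal sketch using only the standard library. The confusion matrix is made up for illustration and is not data from the paper, the helper names are hypothetical, and macro-averaging is an assumption, since the summary does not state which averaging the authors used.

```python
def per_class_metrics(confusion):
    """confusion[i][j] = number of samples with true class i predicted as class j."""
    n = len(confusion)
    results = []
    for c in range(n):
        tp = confusion[c][c]
        fp = sum(confusion[r][c] for r in range(n)) - tp  # predicted c, true class differs
        fn = sum(confusion[c]) - tp                       # true c, predicted as another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        results.append((precision, recall, f1))
    return results


def macro_average(results):
    """Unweighted mean of (precision, recall, f1) over classes."""
    n = len(results)
    return tuple(sum(r[i] for r in results) / n for i in range(3))


def accuracy(confusion):
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


# Made-up 2-class confusion matrix (NOT results from the paper).
confusion = [[48, 2],
             [3, 47]]
acc = accuracy(confusion)
macro_p, macro_r, macro_f1 = macro_average(per_class_metrics(confusion))
```

For this made-up matrix the accuracy is 0.95 and the macro-averaged recall is 0.95. Because F1 is the harmonic mean of precision and recall, a class with precision and recall both equal to 97% also has an F1-score of 97%, which is consistent with the figures quoted above.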

Experiment 2 focused on a holdout analysis of the ViT-GRU model, in which the AdamW optimizer yielded the highest accuracy, 98.97%. Compared against other models trained with different optimizers, the results showed consistently high performance across multiple folds, with precision, recall, and F1-scores holding at 97%. Visual representations of the model’s performance metrics, including accuracy and loss curves, confusion matrices, and AUC-ROC curves, further illustrated its effectiveness. In Experiment 3, the model was validated on the Brain Tumor Kaggle dataset, achieving an accuracy of 96.08%, with precision, recall, and F1-scores of 97%, 96%, and 96%, respectively, showing that it generalizes across different datasets.
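For readers unfamiliar with AdamW, the sketch below shows a single-parameter version of its update rule in plain Python. What distinguishes AdamW from Adam with L2 regularization is that weight decay is applied directly to the parameter rather than added to the gradient. The hyperparameter values and the toy objective are assumptions chosen for illustration; the summary does not report the authors' settings.

```python
import math


def adamw_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-2):
    """Apply one AdamW update to a scalar parameter; returns (param, m, v)."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction (t starts at 1)
    v_hat = v / (1 - beta2 ** t)
    # Decoupled weight decay: shrink the parameter directly instead of
    # adding weight_decay * param to the gradient.
    param = param - lr * weight_decay * param
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Toy objective (an assumption for illustration): f(w) = (w - 3)^2,
# whose gradient is 2 * (w - 3).
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 2001):
    grad = 2 * (w - 3)
    w, m, v = adamw_step(w, grad, m, v, t, lr=0.01)
```

After the loop, `w` ends up close to the minimizer at 3; the small remaining offset comes from the weight-decay term pulling the parameter toward zero.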

Discussion

The discussion highlights the challenges and advances in using Artificial Intelligence (AI) to detect brain tumors in MRI scans. Key challenges include the complexity of integrating multi-modal MRI data, the labor-intensive nature of data annotation, class imbalance due to the rarity of tumors, and ethical considerations surrounding patient privacy. Despite these hurdles, the paper emphasizes the potential of deep learning (DL) techniques, particularly the proposed hybrid Vision Transformer (ViT) and Gated Recurrent Unit (GRU) model, which achieved an average accuracy of 98.97% in classifying brain tumors. The model combines spatial feature extraction from ViT with temporal analysis from GRU, which improves overall classification performance.
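The data flow behind a ViT-plus-GRU combination can be sketched in a toy form. The code below is only an illustration under strong assumptions and is not the authors' architecture: the image is cut into non-overlapping patches as a ViT does, each patch is reduced to a single number as a crude stand-in for a learned transformer embedding, and a scalar GRU cell with arbitrary, untrained weights reads the resulting sequence. The gate equations follow one common GRU convention; libraries differ in which of `z` and `1 - z` multiplies the previous state.

```python
import math


def patchify(image, patch):
    """Split a 2-D list (H x W) into non-overlapping patch x patch blocks,
    returned in row-major order as flat lists."""
    h, w = len(image), len(image[0])
    if h % patch or w % patch:
        raise ValueError("image size must be divisible by patch size")
    tokens = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            tokens.append([image[top + i][left + j]
                           for i in range(patch) for j in range(patch)])
    return tokens


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h, p):
    """One scalar GRU update; p maps gate name -> (input weight, hidden weight, bias)."""
    wz, uz, bz = p["z"]
    wr, ur, br = p["r"]
    wh, uh, bh = p["h"]
    z = sigmoid(wz * x + uz * h + bz)               # update gate
    r = sigmoid(wr * x + ur * h + br)               # reset gate
    cand = math.tanh(wh * x + uh * (r * h) + bh)    # candidate state
    return (1 - z) * h + z * cand


# Hypothetical 4x4 "image" and arbitrary (untrained) weights.
image = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
params = {"z": (0.5, 0.1, 0.0), "r": (0.3, 0.2, 0.0), "h": (0.8, 0.4, 0.0)}

tokens = patchify(image, patch=2)
# Stand-in for a learned embedding: scale each patch's mean into [0, 1].
features = [sum(t) / len(t) / 15 for t in tokens]

h = 0.0
for x in features:
    h = gru_step(x, h, params)  # final h summarizes the whole patch sequence
```

In a real model the per-patch features would be high-dimensional vectors produced by transformer layers, the GRU state would be a vector as well, and a classification layer would map the final state to tumor classes.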

The study also compares the proposed model against various pre-trained transfer learning models and reports superior precision, recall, and F1-scores. Notably, the ViT-GRU model outperformed traditional models by leveraging advanced hyperparameter tuning and preprocessing techniques. Furthermore, the integration of Explainable Artificial Intelligence (XAI) methods, such as LIME and SHAP, enhances the model’s interpretability and provides insight into its decision-making process. This transparency is crucial for clinical applications, as it fosters trust and helps healthcare professionals make informed diagnostic decisions. Overall, the study represents a significant step forward in the automated detection and classification of brain tumors, addressing existing limitations and paving the way for future work in the field.
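To give a sense of what SHAP-style attributions measure, the sketch below computes exact Shapley values for a toy scoring function. SHAP builds on Shapley values from cooperative game theory, but practical tools approximate them because the exact computation grows factorially with the number of features. The `toy_model` function and its three "image regions" are hypothetical and unrelated to the authors' model.

```python
from itertools import permutations


def shapley_values(value_fn, n_features):
    """Exact Shapley values: average each feature's marginal contribution
    over all orderings in which features can be added."""
    phi = [0.0] * n_features
    orderings = list(permutations(range(n_features)))
    for order in orderings:
        present = set()
        for feature in order:
            before = value_fn(frozenset(present))
            present.add(feature)
            after = value_fn(frozenset(present))
            phi[feature] += after - before
    return [p / len(orderings) for p in phi]


def toy_model(visible):
    """Hypothetical score given the set of visible image regions."""
    score = 0.0
    if 0 in visible:
        score += 0.6   # region 0 contributes on its own
    if 1 in visible and 2 in visible:
        score += 0.3   # regions 1 and 2 only matter together
    return score


phi = shapley_values(toy_model, 3)
```

Up to floating-point rounding, the attributions are 0.6 for region 0 and 0.15 each for regions 1 and 2. They sum to 0.9, the difference between the score with all regions visible and the score with none, which is the additivity property that makes such attributions easy to read.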