Explainable hybrid vision transformers and convolutional network for multimodal glioma segmentation in brain MRI

Journal: Scientific Reports, Volume: 14, Issue: 1
DOI: https://doi.org/10.1038/s41598-024-54186-7
PMID: https://pubmed.ncbi.nlm.nih.gov/38355678
Publication Date: 2024-02-14
Author(s): Ramy A. Zeineldin et al.
Primary Topic: Brain Tumor Detection and Classification

Overview

The paper presents TransXAI, a hybrid model that combines vision Transformers and convolutional neural networks (CNNs) to segment gliomas in multimodal magnetic resonance imaging (MRI) scans. Accurate localization of gliomas is critical for neurosurgical interventions, yet existing deep learning models often function as “black boxes,” which limits their clinical applicability. TransXAI addresses this by generating surgeon-understandable heatmaps with a post-hoc explanation technique, so that tumor localization can be interpreted visually without changing the model’s architecture or compromising its accuracy.

Experimental results indicate that TransXAI achieves competitive mean Dice scores of 0.88, 0.78, and 0.75 for the whole tumor (WT), tumor core (TC), and enhancing tumor (ET) sub-regions, respectively. This performance underscores its potential for clinical use alongside state-of-the-art methods. The model’s explainability enhances trust among clinicians by aligning its predictions with domain knowledge, thereby facilitating better decision-making in surgical contexts. Future work aims to extend TransXAI’s architecture to 3D models, which may improve segmentation accuracy by capturing spatial relationships in medical images. The authors also propose integrating concept activation maps to provide further guidance to the neural network, enhancing its utility in surgical assistance.

Methods

The proposed method for explainable brain lesion segmentation pairs a hybrid architecture, which combines convolutional neural networks (CNNs) and Transformers, with gradient-based explanations. In this two-step approach, a deep learning model first segments brain tumor boundaries from multimodal MRI data, and a justification generator then produces 2D visual feature explanations. The authors justify the use of 2D axial slices by their prevalence in clinical practice, which aids in accurately delineating glioma boundaries while maintaining computational efficiency.

The experimental design emphasizes the importance of selecting appropriate layers for generating saliency maps. Unlike traditional encoder-decoder networks, which may produce low-resolution maps, the proposed method uses higher-resolution feature maps from the decoder’s output layer, enhancing the detail captured in the segmentation process. The explainability generator operates post hoc: it provides insights after the model’s predictions are made and therefore requires no integration into the network architecture. The study includes four main experiments: a quantitative evaluation against state-of-the-art methods, an analysis of the contribution of different MRI modalities, an interpretation of CNN layers through explainable AI (XAI), and clinical feedback on the method. Additionally, the model’s failure modes are analyzed to understand the underlying reasons for any inaccuracies.
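The saliency-map step described above can be illustrated with a minimal Grad-CAM-style computation. This is a sketch under stated assumptions, not the authors’ implementation: it assumes the feature maps of the chosen layer and the gradients of the target class score with respect to them have already been extracted from the network, and the function name `grad_cam` and the nested-list layout are hypothetical. Plain Python lists keep the example self-contained; a real pipeline would use a tensor library.

```python
def grad_cam(activations, gradients):
    """Grad-CAM-style heatmap from one layer's feature maps.

    activations, gradients: lists of K channel maps, each H x W (nested lists).
    gradients[k][i][j] is d(score) / d(activations[k][i][j]) for the target
    class; how the score is defined for segmentation is left to the caller.
    """
    height, width = len(activations[0]), len(activations[0][0])

    # Channel weight = spatial average of that channel's gradient.
    weights = [
        sum(sum(row) for row in grad_map) / (height * width)
        for grad_map in gradients
    ]

    # Weighted sum of the feature maps, followed by ReLU.
    cam = [
        [
            max(0.0, sum(w * act[i][j] for w, act in zip(weights, activations)))
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Scale to [0, 1] so the map can be overlaid on the MRI slice.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam
```

Because the heatmap has the same spatial size as the chosen feature maps, taking them from a high-resolution decoder layer avoids upsampling a coarse map back to slice resolution.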

Results

The proposed TransXAI model segments both high-grade gliomas (HGG) and low-grade gliomas (LGG) effectively on MRI data from the BraTS 2019 training set. Visual results, presented in Figure 1, illustrate the model’s ability to delineate glioma boundaries and sub-regions accurately in both axial and coronal views. Quantitative performance metrics, summarized in Table 1, indicate that TransXAI achieved an average Dice Similarity Coefficient (DSC) of 0.803 and a 95th-percentile Hausdorff distance (HD95) of 6.19 mm under five-fold cross-validation. Specifically, the model recorded DSC values of 0.745 for the enhancing tumor (ET), 0.782 for the tumor core (TC), and 0.882 for the whole tumor (WT).
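To make the reported scores concrete, the following minimal sketch computes the Dice coefficient for the three nested sub-regions. It assumes the usual BraTS label convention (1 = necrotic/non-enhancing core, 2 = edema, 4 = enhancing tumor), which this summary does not state; `REGIONS`, `dice`, and `region_scores` are illustrative names rather than the authors’ code, and flat Python lists stand in for the label volumes.

```python
# Assumed BraTS label convention (not stated in the summary above):
# 1 = necrotic / non-enhancing core, 2 = edema, 4 = enhancing tumor.
REGIONS = {"WT": {1, 2, 4}, "TC": {1, 4}, "ET": {4}}


def dice(pred_labels, true_labels, region):
    """Dice similarity coefficient for one sub-region.

    pred_labels, true_labels: flat sequences of integer voxel labels.
    """
    pred = [label in region for label in pred_labels]
    true = [label in region for label in true_labels]
    overlap = sum(p and t for p, t in zip(pred, true))
    total = sum(pred) + sum(true)
    # Convention chosen here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * overlap / total


def region_scores(pred_labels, true_labels):
    """Dice score per sub-region, keyed by WT, TC, and ET."""
    return {
        name: dice(pred_labels, true_labels, region)
        for name, region in REGIONS.items()
    }
```

For a toy prediction `[0, 2, 1, 1, 4, 0]` against the ground truth `[0, 2, 2, 1, 4, 4]`, this gives 8/9 for WT and 2/3 for both TC and ET. Averaging the three reported scores, (0.745 + 0.782 + 0.882) / 3, reproduces the reported mean DSC of 0.803.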

Further analysis confirms the robustness of the TransXAI model, as it maintained consistent performance across different folds of cross-validation, an essential characteristic for clinical applicability. The model’s performance metrics were competitive when compared to state-of-the-art (SOTA) methods, particularly excelling in TC DSC and HD95 across all sub-regions, while aligning closely with other leading methods for ET and WT DSC. These findings underscore the model’s potential for reliable glioma segmentation, driven by methodological choices such as data augmentation and a comprehensive validation strategy.

Discussion

The discussion highlights the external multi-site validation of the TransXAI method for glioma segmentation on the FeTS2022 Challenge dataset, which comprises 1251 multimodal brain MRI scans. This dataset’s diversity in imaging protocols and patient demographics allows for a robust assessment of TransXAI’s generalizability in a clinical context. The validation process was carefully designed to minimize overlap with the training set, ensuring the independence of the datasets. Evaluation metrics such as the Dice Similarity Coefficient (DSC) and the 95th-percentile Hausdorff distance (HD95) demonstrated high accuracy and robustness in tumor boundary identification, underscoring the model’s potential for clinical integration.
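As a rough illustration of the boundary metric, the sketch below computes an HD95 value from two sets of surface points. It is one common formulation, not necessarily the one used in the paper: it pools the nearest-neighbor distances in both directions and takes a linearly interpolated 95th percentile, whereas other implementations take the larger of the two directed percentiles. The helper names are hypothetical, and point coordinates are assumed to be in millimeters already (for example, voxel indices on a 1 mm isotropic grid).

```python
import math


def percentile(values, q):
    """Linearly interpolated percentile of a non-empty list, q in [0, 100]."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower, upper = math.floor(position), math.ceil(position)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def directed_distances(source, target):
    """Distance from each source point to its nearest target point."""
    return [
        min(math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q))) for q in target)
        for p in source
    ]


def hd95(pred_points, true_points):
    """95th-percentile symmetric surface distance between two point sets."""
    if not pred_points or not true_points:
        raise ValueError("both surfaces must contain at least one point")
    distances = (directed_distances(pred_points, true_points)
                 + directed_distances(true_points, pred_points))
    return percentile(distances, 95)
```

Using the 95th percentile instead of the maximum makes the metric less sensitive to a few outlier voxels. The brute-force nearest-neighbor search is quadratic in the number of surface points, so it is suitable only for small examples.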

Additionally, the paper discusses the interpretability of the TransXAI model through Grad-CAM visualizations, which elucidate the contributions of different MRI modalities to tumor localization. The findings indicate that T1Gd and T2 sequences are crucial for detecting high-grade lesions, while FLAIR is significant for identifying edema. The model’s ability to learn both implicit and explicit features aligns with clinical practices, enhancing trust and facilitating collaboration between AI systems and medical professionals. Despite the promising results, the authors acknowledge limitations related to dataset variability and the need for careful imaging protocol selection. Overall, the TransXAI architecture, with its explainability features, shows promise for improving clinical decision-making in glioma segmentation.