MDL-CA: a multimodal deep learning approach with a cross attention mechanism for accurate brain cancer diagnosis

Journal: Frontiers in Public Health, Volume: 13
DOI: https://doi.org/10.3389/fpubh.2025.1687335
PMID: https://pubmed.ncbi.nlm.nih.gov/41561860
Publication Date: 2026-01-05
Author(s): Sumaira Sarwar et al.
Primary Topic: Brain Tumor Detection and Classification

Overview

The research paper introduces MDL-CA, a multimodal deep learning framework that integrates genomic data and MRI to enhance brain cancer diagnosis. Traditional diagnostic methods, such as invasive biopsies and single-modality imaging, often lack sensitivity and assess tumor heterogeneity poorly, leading to delayed and inaccurate diagnoses. MDL-CA addresses these limitations with a cross-attention mechanism that fuses genomic graph embeddings from a Graph Attention Network (GAT) with MRI feature maps from a 3D DenseNet. This fusion lets the model relate biological and spatial interactions within tumors, which the authors report improves diagnostic accuracy.

Experimental results on four benchmark datasets show accuracies of 96.22%, 97.14%, 98.46%, and 98.21%, with F1-scores ranging from 95.95% to 98.40%. The authors present these results as evidence of the framework's robustness and scalability and report that MDL-CA significantly outperforms existing state-of-the-art models. The study emphasizes the potential of MDL-CA to support early diagnosis and personalized treatment planning by modeling the interplay between molecular and anatomical features. Future work aims to extend the framework to multi-class classification of brain tumor subtypes and to integrate it into clinical workflows, increasing its practical utility in precision oncology.

Introduction

The introduction addresses the complexities of brain cancer, which is characterized by abnormal cell growth leading to various tumor types, notably glioblastomas and other gliomas. These tumors can disrupt normal brain function, causing neurological symptoms, and their heterogeneous nature makes diagnosis and treatment difficult. With approximately 300,000 cases worldwide each year, the low survival rates for certain types, particularly glioblastoma, highlight the urgent need for improved diagnostic techniques. Integrating machine learning (ML) with genomic data is a promising way to improve diagnostic accuracy, since genomic insights can shed light on tumor behavior and treatment response.

The paper emphasizes the potential of multimodal deep learning approaches that combine genomic data with medical imaging, particularly MRI. Advanced techniques like Convolutional Neural Networks (CNNs) can extract complex patterns from imaging data, while genomic information provides critical molecular insights. However, existing models face challenges in effectively integrating these heterogeneous data types, often leading to suboptimal performance. To address these limitations, the study proposes MDL-CA, a novel multimodal deep learning framework that utilizes a cross-attention fusion mechanism to enhance tumor characterization by integrating genomic embeddings with MRI features. This approach aims to improve classification accuracy and facilitate personalized treatment strategies for brain cancer patients.

Methods

The methodology covers the full diagnostic pipeline of the proposed MDL-CA model: data collection, preprocessing, feature extraction and fusion, optimization, and classification. The model was evaluated on four brain cancer datasets: TCGA-GBM, TCGA-GBM (TCIA), TCGA-LGG, and TCGA-LGG (TCIA). These include multimodal data from MRI scans and genomic profiles. The model achieved accuracies of 96.22%, 97.14%, 98.46%, and 98.21% across the datasets, with precision from 95.80% to 98.30%, recall from 96.10% to 98.50%, and F1-scores from 95.95% to 98.40%.

The model’s effectiveness is attributed to its integration of graph embeddings from genomic data via a Graph Attention Network (GAT) with 3D DenseNet-extracted MRI features, which enhances its ability to capture complex spatial-biological relationships. The use of the Entmax sigmoid activation function promotes sparse probability distributions, improving interpretability. The model’s performance is further validated through metrics such as Log Loss, with consistently low values indicating reliable probabilistic predictions. Comparative analyses reveal that MDL-CA outperforms existing baseline models, achieving significant improvements in accuracy and F1-scores, thereby confirming its robustness and generalization capabilities in multimodal brain cancer classification. The training and validation curves indicate stable convergence with minimal overfitting, reinforcing the model’s clinical applicability.
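
The fusion step described above can be illustrated with a minimal, dependency-free sketch of scaled dot-product cross-attention. It is not the authors' implementation. It assumes that MRI feature tokens act as queries and genomic node embeddings act as keys and values, and it omits the learned projection matrices a real layer would have. The names `cross_attention`, `mri_tokens`, and `gene_nodes` are hypothetical, and the toy inputs are made up for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention without learned projections.

    queries: list of n_q vectors (assumed: flattened MRI feature tokens)
    keys, values: lists of n_k vectors (assumed: genomic node embeddings)
    Returns (fused, weights), where fused has n_q rows and each row of
    weights is a probability distribution over the n_k genomic nodes.
    """
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)
    weights = []
    for q in queries:
        # Similarity of this MRI token to every genomic node.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights.append(softmax(scores))
    width = len(values[0])
    # Each fused row is an attention-weighted average of the value vectors.
    fused = [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(width)]
        for row in weights
    ]
    return fused, weights


# Toy inputs (hypothetical): 2 MRI tokens attend over 3 genomic nodes.
mri_tokens = [[1.0, 0.0], [0.0, 1.0]]
gene_nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused, weights = cross_attention(mri_tokens, gene_nodes, gene_nodes)
```

Each row of `weights` shows how strongly one MRI token attends to each genomic node, which is the kind of quantity one would inspect when asking which molecular features influence a given image region.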

Results

In the results section, the proposed alignment-free analysis model, referred to as the MDL-CA framework, is evaluated using standard metrics for binary and multi-class classification tasks. The dataset was split into 85% for training and 15% for testing. The model was trained for up to 150 epochs with the Adam optimizer, an initial learning rate of $1 \times 10^{-4}$, and a weight decay of $1 \times 10^{-5}$. A cosine annealing scheduler gradually decreased the learning rate. A mini-batch size of 8 was used for both MRI volumes and genomic features, with gradient accumulation to support multimodal processing. Training stopped early if the validation loss did not improve for 15 consecutive epochs.
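
As a rough illustration of the schedule described above, the sketch below computes a cosine-annealed learning rate and applies early stopping with a patience of 15 epochs. The epoch count, base learning rate, and patience come from the text. The minimum learning rate of 0, the choice to anneal over the full 150 epochs, and the simulated validation losses are assumptions made for this example, and the helper names are hypothetical. Gradient accumulation and the optimizer itself are not shown.

```python
import math

# Hyperparameters reported in the text.
EPOCHS = 150
BASE_LR = 1e-4
PATIENCE = 15
# Assumption: the schedule's floor is not stated in the text; 0.0 is used here.
MIN_LR = 0.0


def cosine_annealed_lr(epoch, total_epochs=EPOCHS, base_lr=BASE_LR, min_lr=MIN_LR):
    """Learning rate at `epoch` under a standard cosine annealing schedule."""
    cosine = 0.5 * (1.0 + math.cos(math.pi * epoch / total_epochs))
    return min_lr + (base_lr - min_lr) * cosine


class EarlyStopping:
    """Signals a stop after `patience` consecutive epochs without improvement."""

    def __init__(self, patience=PATIENCE):
        self.patience = patience
        self.best_loss = float("inf")
        self.stale_epochs = 0

    def step(self, val_loss):
        """Record one validation loss; return True when training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience


# Simulated run: the loss improves for a few epochs, then plateaus.
stopper = EarlyStopping()
stopped_at = None
for epoch in range(EPOCHS):
    lr = cosine_annealed_lr(epoch)
    val_loss = max(1.0 - 0.1 * epoch, 0.5)  # hypothetical loss curve
    if stopper.step(val_loss):
        stopped_at = epoch
        break
```

With a plateau like the simulated one, the loop ends well before epoch 150, which is why the 150-epoch figure is best read as an upper bound on training length.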

The computational environment for the experiments included an NVIDIA RTX 3090 GPU with 24 GB VRAM, running CUDA 11.7 and cuDNN 8.5, with the model implemented in PyTorch 1.12 on a workstation featuring an Intel Core i9-12900K CPU and 64 GB RAM. The evaluation metrics used to assess the MDL-CA model’s performance included Accuracy, Precision, Recall (Sensitivity), F1-score, and the Area Under the Receiver Operating Characteristic Curve (AUC-ROC), providing a comprehensive overview of the model’s classification capabilities.
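
For reference, the threshold-based metrics listed above can be computed from confusion-matrix counts as in the following sketch, which also includes the binary Log Loss mentioned in the Methods section. The labels and probabilities are made-up toy values, not data from the paper, and the function names are hypothetical. AUC-ROC is left out to keep the example short.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall (sensitivity), and F1 for binary predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(y_true)


# Hypothetical toy labels and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.4, 0.2, 0.6, 0.1, 0.7, 0.3]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]
metrics = classification_metrics(y_true, y_pred)
loss = log_loss(y_true, y_prob)
```

Precision and recall weigh false positives and false negatives differently, so reporting both alongside F1 gives a fuller picture than accuracy alone, especially when the classes are imbalanced.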

Discussion

The research paper discusses the development of a novel multimodal diagnostic framework, MDL-CA, aimed at improving brain cancer diagnosis by integrating MRI and genomic data. The study identifies significant limitations in existing diagnostic systems that rely solely on either imaging or genomic data, advocating for a comprehensive approach that captures both anatomical and molecular characteristics. The MDL-CA framework employs a cross-modal attention fusion mechanism to embed genomic graph representations directly into MRI feature maps, facilitating the capture of complex spatial-biological relationships. Utilizing 3D DenseNet and Graph Attention Networks (GAT) for modality-specific feature extraction, the framework achieves state-of-the-art performance across four benchmark datasets, with accuracies ranging from 96.22% to 98.46% and F1-scores between 95.95% and 98.40%.

The literature review highlights recent advancements in machine learning and deep learning techniques for brain cancer diagnosis, emphasizing the integration of genomic and MRI data. It critiques existing studies for their limitations in analyzing multimodal datasets and outlines emerging models that aim to enhance diagnostic accuracy. The review also underscores the importance of addressing the challenges associated with data quality and availability in clinical settings. The paper concludes by detailing the methodology, experimental results, and practical implications of the MDL-CA framework, setting the stage for future research directions in multimodal cancer diagnostics.