Geometric multi-instance learning for weakly supervised gastric cancer segmentation

Journal: npj Digital Medicine, Volume: 9, Issue: 1
DOI: https://doi.org/10.1038/s41746-025-02287-6
PMID: https://pubmed.ncbi.nlm.nih.gov/41530453
Publication Date: 2026-01-13
Author(s): Chenshen Huang et al.
Primary Topic: AI in cancer detection

Overview

This paper addresses weakly supervised segmentation of cancerous regions in whole-slide images (WSIs) in computational pathology, a setting motivated by the high cost of pixel-level annotations. Existing Multiple Instance Learning (MIL) frameworks struggle to generate precise segmentation masks because they treat WSIs as unordered ‘bags of patches’, neglecting the tissue topology and architectural patterns that indicate malignancy. To overcome this limitation, the authors introduce Geometric Multi-Instance Learning (Geo-MIL), a graph-based framework designed to improve segmentation accuracy by incorporating geometric relationships among tissue structures.

Introduction

Gastric cancer (GC) is a major contributor to cancer-related mortality globally, and its diagnosis relies primarily on histopathological analysis of tissue biopsies. This process, in which pathologists examine whole-slide images (WSIs) for malignant regions, is labor-intensive because it requires precise pixel-level annotations. Recent advances in deep learning, particularly Weakly Supervised Learning (WSL) and Multiple Instance Learning (MIL), have shown promise in automating parts of this workflow. However, traditional MIL approaches are limited because they treat image patches independently, neglecting the spatial context needed for accurate tumor segmentation.

To address these challenges, this study introduces a novel Geometric Multi-Instance Learning (Geo-MIL) framework that models WSIs as graphs, where patches are nodes connected by edges representing spatial relationships. This innovative approach incorporates a learnable topological gating mechanism, allowing the model to focus on relevant architectural patterns rather than solely on individual patch features. The Geo-MIL framework bridges the gap between weak supervision and dense prediction, demonstrating superior performance in generating accurate segmentation masks using only slide-level labels. Extensive experimental validation on multiple gastric cancer datasets confirms that Geo-MIL significantly outperforms existing state-of-the-art MIL-based methods, marking a substantial advancement in the application of deep learning to histopathology.

Methods

The authors use self-supervised learning to obtain robust patch representations for whole-slide images (WSIs), with a Vision Transformer (ViT-S/16) backbone pre-trained on the TCGA pan-cancer cohort. They extract 384-dimensional feature vectors from 256×256 patches and keep the feature extractor’s weights frozen during training to maintain stable feature distributions. To improve robustness and mitigate overfitting, they apply on-the-fly data augmentation, including random flips, rotations, and color jittering tailored to H&E-stained images.

The proposed Geo-MIL framework uses these extracted features to construct a WSI graph in which each patch is connected to the k = 8 nearest neighbors of its centroid. The Topological Attention Graph Neural Network (TopoGNN) comprises three graph layers with a hidden feature dimension of 256. The model is trained end-to-end using the AdamW optimizer with a learning rate of $1 \times 10^{-4}$ and a weight decay of $1 \times 10^{-5}$, alongside a cosine annealing learning rate scheduler. Because WSI graphs are large, the authors use a batch size of 1 and accumulate gradients over 16 steps, which gives an effective batch size of 16 slides. Training is capped at 100 epochs, with early stopping based on the validation Dice score. At inference, patch-level tumor probabilities are generated and reassembled into a 2D probability heatmap, from which a binary segmentation mask is derived using a threshold of 0.5. All experiments were conducted on a server with 4 NVIDIA A100 80GB GPUs, using the PyTorch and PyG libraries.
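
The graph construction and inference steps described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation (which uses PyTorch and PyG): the function names `knn_edges` and `probabilities_to_mask`, the 4 × 4 toy grid, and the probability values are all hypothetical, and treating a probability equal to 0.5 as tumor is an assumption, since the summary only gives the threshold value.

```python
import math

def knn_edges(centroids, k=8):
    """Return directed edges (i, j) linking each patch to its k nearest patches."""
    edges = []
    for i, (xi, yi) in enumerate(centroids):
        # Brute-force distances to every other patch; fine for a toy example,
        # a real WSI graph would need a spatial index or a library routine.
        dists = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(centroids)
            if j != i
        )
        edges.extend((i, j) for _, j in dists[:k])
    return edges

def probabilities_to_mask(grid_coords, probs, n_rows, n_cols, threshold=0.5):
    """Place patch-level probabilities on a 2D grid, then binarise them."""
    # Grid cells without a patch (e.g. background) default to 0.0 (assumption).
    heatmap = [[0.0] * n_cols for _ in range(n_rows)]
    for (row, col), p in zip(grid_coords, probs):
        heatmap[row][col] = p
    # Assumption: p >= threshold counts as tumor.
    mask = [[1 if p >= threshold else 0 for p in row] for row in heatmap]
    return heatmap, mask

# Toy slide: a 4 x 4 grid of 256-pixel patches (hypothetical data).
PATCH = 256
grid_coords = [(r, c) for r in range(4) for c in range(4)]
centroids = [(c * PATCH + PATCH / 2, r * PATCH + PATCH / 2) for r, c in grid_coords]
edges = knn_edges(centroids, k=8)

# Made-up probabilities: the top-left 2 x 2 block is "tumor".
probs = [0.9 if r < 2 and c < 2 else 0.1 for r, c in grid_coords]
heatmap, mask = probabilities_to_mask(grid_coords, probs, n_rows=4, n_cols=4)
```

With 16 patches and k = 8, the toy graph has 128 directed edges, and the mask marks only the four top-left patches. In the setting described above, the node features would be the 384-dimensional ViT embeddings and the probabilities would come from TopoGNN rather than being set by hand.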

Results

In this section, the authors present the results of their experiments validating the Geo-MIL framework for weakly supervised slide-level classification and lesion segmentation. They address three primary questions: the comparative performance of Geo-MIL against state-of-the-art methods, the necessity of its graph representation and topological attention mechanism, and the clinical relevance of the segmentation maps produced. The qualitative analysis reveals that Geo-MIL significantly outperforms existing methods, particularly in delineating complex tumor structures, as illustrated through visual comparisons with baseline models on challenging gastric adenocarcinoma cases.

The results demonstrate that while competing methods, such as Patch-WI and AB-MIL, fail to accurately capture the full extent of tumors, Geo-MIL effectively distinguishes between separate tumor nests and adheres closely to intricate anatomical boundaries. Figures presented in the study highlight the superior spatial coherence and anatomical precision of Geo-MIL’s segmentation masks, which align closely with pathologist-annotated ground truth. This high level of accuracy is essential for reliable quantitative analyses in clinical settings, such as tumor area measurement and invasion front assessment, underscoring the practical applicability of the Geo-MIL framework in digital pathology workflows.

Discussion

In this section, the authors discuss the datasets and preprocessing methods used to evaluate their proposed Geo-MIL framework for gastric cancer histopathology. They utilized three publicly available datasets: TCGA-STAD, GasHisSDB, and ACDC-GastricDB, comprising a total of 1,045 whole-slide images (WSIs) from various patients. The datasets were meticulously curated, with tumor regions annotated by expert pathologists to ensure high-quality ground truth for evaluation. The authors implemented a systematic preprocessing pipeline, including segmentation of tissue regions and tiling into patches, followed by a patient-level split for training, validation, and testing to prevent data leakage.
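
A patient-level split of the kind mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function name `patient_level_split`, the 70/15/15 proportions, the seed, and the slide and patient identifiers are all hypothetical, since the summary does not report them.

```python
import random

def patient_level_split(slide_to_patient, fractions=(0.7, 0.15, 0.15), seed=0):
    """Assign whole patients (not individual slides) to train/val/test."""
    patients = sorted(set(slide_to_patient.values()))
    random.Random(seed).shuffle(patients)  # reproducible shuffle
    n_train = int(len(patients) * fractions[0])
    n_val = int(len(patients) * fractions[1])
    groups = {
        "train": set(patients[:n_train]),
        "val": set(patients[n_train:n_train + n_val]),
        "test": set(patients[n_train + n_val:]),  # remainder goes to test
    }
    # Every slide follows its patient, so no patient spans two splits.
    return {
        name: sorted(s for s, p in slide_to_patient.items() if p in members)
        for name, members in groups.items()
    }

# Hypothetical cohort: 10 patients with two slides each.
slide_to_patient = {
    f"slide_{p}_{i}": f"patient_{p}" for p in range(10) for i in range(2)
}
splits = patient_level_split(slide_to_patient)
```

Splitting on patients rather than slides is what prevents leakage: two slides from the same patient often share tissue and staining characteristics, so placing one in training and the other in testing would inflate the test score. Libraries such as scikit-learn offer grouped splitters (for example `GroupShuffleSplit`) that serve the same purpose.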

The authors compared Geo-MIL against several state-of-the-art baselines across two main tasks: WSI classification and weakly supervised segmentation. Their results, presented in a comprehensive table, demonstrate that Geo-MIL outperforms all baseline methods, achieving state-of-the-art performance in both tasks, particularly excelling in segmentation with a Dice score of 0.789 on the TCGA-STAD dataset. They attribute this success to Geo-MIL’s unique design, which incorporates a topological attention mechanism that effectively captures the spatial architecture of tumor tissues, leading to more accurate and coherent segmentation masks. The ablation studies further validate the importance of each component of the Geo-MIL framework, confirming that the graph representation and topological gate are critical for achieving high performance. Overall, the findings underscore Geo-MIL’s robustness and generalizability, making it a promising tool for clinical applications in digital pathology.
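
For readers unfamiliar with the Dice score reported above, the following minimal sketch computes it for two binary masks using the standard definition, Dice = 2|P ∩ T| / (|P| + |T|). The function name `dice_score` and the toy masks are hypothetical, and returning 1.0 for two empty masks is a common convention assumed here, not something stated in the summary.

```python
def dice_score(pred_mask, true_mask):
    """Dice = 2|P ∩ T| / (|P| + |T|) for two binary masks of equal shape."""
    intersection = pred_sum = true_sum = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            intersection += p & t  # 1 only where both masks mark tumor
            pred_sum += p
            true_sum += t
    if pred_sum + true_sum == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2 * intersection / (pred_sum + true_sum)

# Toy masks (hypothetical): 3 predicted pixels, 3 true pixels, 2 shared.
pred = [[1, 1, 0],
        [0, 1, 0]]
true = [[1, 0, 0],
        [0, 1, 1]]
score = dice_score(pred, true)  # 2 * 2 / (3 + 3)
```

A score of 1 means the predicted and annotated masks coincide exactly, and 0 means they share no tumor pixels.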