A whole-slide foundation model for digital pathology from real-world data

Journal: Nature, Volume: 630, Issue: 8015
DOI: https://doi.org/10.1038/s41586-024-07441-w
PMID: https://pubmed.ncbi.nlm.nih.gov/38778098
Publication Date: 2024-05-22
Author(s): Hanwen Xu et al.
Primary Topic: AI in cancer detection

Overview

This section gives an overview of Prov-GigaPath, a whole-slide pathology foundation model designed to address the computational challenges of digital pathology. Earlier models often subsample image tiles from gigapixel slides, which can discard critical slide-level context. In contrast, Prov-GigaPath is pretrained on 1.3 billion 256 × 256 pathology image tiles derived from 171,189 whole slides spanning 31 tissue types, sourced from more than 30,000 patients in the Providence health network. The model uses GigaPath, a new vision transformer architecture that adapts the LongNet method to slide-level learning.
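To make the scale of tiling concrete, the sketch below counts how many 256 × 256 tiles cover a slide and computes the average tiles per slide implied by the figures above. The function names and the example slide dimensions are illustrative assumptions, not taken from the paper, and the sketch does not filter out background tiles as a real preprocessing pipeline typically would.

```python
import math

TILE = 256  # tile edge in pixels, as stated in the text


def tile_grid(width_px, height_px, tile=TILE):
    """Return (columns, rows) of tiles needed to cover the slide.

    Tiles on the right and bottom edges may be only partially filled.
    """
    return math.ceil(width_px / tile), math.ceil(height_px / tile)


def tile_origins(width_px, height_px, tile=TILE):
    """Yield the top-left (x, y) pixel coordinate of each tile, row by row."""
    cols, rows = tile_grid(width_px, height_px, tile)
    for r in range(rows):
        for c in range(cols):
            yield c * tile, r * tile


# Average tiles per slide implied by the dataset figures quoted above.
avg_tiles_per_slide = 1.3e9 / 171_189

# Hypothetical slide dimensions, chosen only for illustration.
cols, rows = tile_grid(100_000, 80_000)
print(f"average tiles per slide: {avg_tiles_per_slide:.0f}")
print(f"example slide: {cols} x {rows} = {cols * rows} tiles")
```

An average of several thousand tiles per slide is why the slide-level encoder has to handle long sequences.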

Prov-GigaPath achieves state-of-the-art performance on 25 of 26 evaluated tasks and significantly outperforms existing methods on 18 of them. The model also supports vision-language pretraining by incorporating pathology reports, which highlights its versatility. The research underscores the potential of computational pathology to improve cancer diagnostics through applications such as cancer subtyping and prognostic prediction. Challenges remain, however: publicly available pathology data are scarce and variable, model architectures must capture both local and global patterns, and pretrained models are not always accessible. The findings emphasize the importance of real-world data and whole-slide modeling for advancing digital pathology.

Methods

In this section, the authors detail how they evaluate their model, Prov-GigaPath, against four competing approaches, of which three are named here: HIPT, CtransPath, and REMEDIS. HIPT, pretrained on 10,678 gigapixel whole-slide images (WSIs) from TCGA, uses a hierarchical image pyramid transformer architecture and trains its tile encoder with the DINO self-supervised learning approach. The main difference between HIPT and Prov-GigaPath lies in how they aggregate tiles: Prov-GigaPath uses long-sequence representation learning with a slide encoder, whereas HIPT uses a second-stage Vision Transformer (ViT).

CtransPath combines a convolutional neural network (CNN) with a multi-scale SwinTransformer and is pretrained with a contrastive-learning objective. REMEDIS, by contrast, uses a ResNet backbone pretrained with the SimCLR approach on a large dataset of pathology images. The authors fine-tuned Prov-GigaPath and the baseline models on various downstream tasks. For Prov-GigaPath, the tile encoder was frozen and only the LongNet slide-level encoder was fine-tuned. The slide encoder produces contextualized tile embeddings, which a shallow attention-based multiple-instance learning (ABMIL) layer aggregates into slide embeddings for downstream classification. The fine-tuning strategies for the baseline models are also specified; for CtransPath and REMEDIS, the focus is on optimizing the ABMIL layer and the classifier.
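The aggregation step can be illustrated with a minimal sketch of attention-based pooling in its common non-gated form: each tile embedding gets a scalar score, the scores are normalized with a softmax, and the slide embedding is the weighted sum of the tile embeddings. The parameter names `V` and `w`, the dimensions, and the random values are assumptions for illustration; the layer used in the paper may differ in detail.

```python
import math
import random


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    top = max(xs)
    exps = [math.exp(x - top) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def abmil_pool(tiles, V, w):
    """Pool N tile embeddings into one slide embedding.

    tiles: list of N embeddings, each of length d
    V:     hidden x d projection matrix (hypothetical learned parameter)
    w:     scoring vector of length hidden (hypothetical learned parameter)
    """
    scores = []
    for h in tiles:
        hidden = [math.tanh(x) for x in matvec(V, h)]
        scores.append(sum(wi * hi for wi, hi in zip(w, hidden)))
    weights = softmax(scores)
    d = len(tiles[0])
    slide = [sum(a * h[j] for a, h in zip(weights, tiles)) for j in range(d)]
    return slide, weights


# Toy data standing in for contextualized tile embeddings.
rng = random.Random(0)
d, hidden_dim, n_tiles = 8, 4, 5
tiles = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n_tiles)]
V = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(hidden_dim)]
w = [rng.gauss(0, 1) for _ in range(hidden_dim)]

slide_embedding, attention = abmil_pool(tiles, V, w)
```

The attention weights sum to one, so the slide embedding is a convex combination of the tile embeddings and stays the same length no matter how many tiles the slide has.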

Discussion

In this section, the authors present Prov-GigaPath as a pathology foundation model intended to improve digital pathology applications. Prov-GigaPath is pretrained on the Prov-Path dataset, which comprises over 1.3 billion image tiles from more than 171,000 pathology slides and is substantially larger than existing datasets such as TCGA. The model uses a new vision transformer architecture, GigaPath, whose dilated self-attention processes the very large number of image tiles efficiently and captures both local and global patterns within gigapixel whole-slide images (WSIs). The authors report that Prov-GigaPath achieves state-of-the-art performance on 25 of 26 evaluated tasks, including significant improvements in mutation prediction and cancer subtyping, which demonstrates its potential for clinical diagnostics and decision support.
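The efficiency argument behind dilated self-attention can be sketched by counting query-key pairs. In the LongNet formulation, the sequence is split into segments and only every r-th position within each segment is kept, so attention is dense only inside each sparsified segment. The sequence length, segment lengths, and dilation rates below are illustrative assumptions, not values from the paper, and the sketch covers only the index pattern rather than the full method.

```python
def dilated_segments(n, segment_len, dilation):
    """Split positions 0..n-1 into segments, keeping every `dilation`-th one."""
    segments = []
    for start in range(0, n, segment_len):
        end = min(start + segment_len, n)
        segments.append(list(range(start, end, dilation)))
    return segments


def attended_pairs(n, segment_len, dilation):
    """Count query-key pairs when attention is dense within each segment."""
    return sum(len(seg) ** 2 for seg in dilated_segments(n, segment_len, dilation))


n = 8192  # hypothetical number of tiles in one slide

dense = n * n
# Short segments with no dilation capture local context.
local = attended_pairs(n, segment_len=1024, dilation=1)
# One long segment with heavy dilation captures coarse global context.
coarse = attended_pairs(n, segment_len=8192, dilation=8)

print(f"dense attention pairs:  {dense}")
print(f"local + coarse pairs:   {local + coarse}")
```

Combining a few such patterns covers both nearby and distant tiles at a fraction of the dense cost, because each pattern costs on the order of n · w / r² pairs for segment length w and dilation r instead of n².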

The results indicate that Prov-GigaPath is particularly strong at mutation prediction, with notable gains in AUROC and AUPRC over competing models, including those pretrained on TCGA data. The model also outperforms existing approaches in cancer subtyping across nine major cancer types, which suggests it can distinguish subtle pathological features. The authors further explore vision-language pretraining, using the associated pathology reports to improve performance on multimodal tasks such as zero-shot subtyping and mutation prediction. Overall, Prov-GigaPath represents a significant advance in applying machine learning to pathology, with implications for broader biomedical domains.
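For readers unfamiliar with the two metrics, the sketch below computes AUROC as the fraction of positive-negative pairs ranked correctly, and average precision, a common estimator of AUPRC. The labels and scores are made-up toy values, not results from the paper, and the average-precision function does not treat tied scores specially.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at each positive's rank."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive example")
    return total / hits


# Toy example: 1 = mutation present, 0 = absent.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]

print(f"AUROC: {auroc(labels, scores):.3f}")
print(f"AP:    {average_precision(labels, scores):.3f}")
```

AUPRC is often reported alongside AUROC when positives are rare, because it is more sensitive to how the positive class is ranked.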
