A foundation model for generalizable cancer diagnosis and survival prediction from histopathological images

Journal: Nature Communications, Volume: 16, Issue: 1
DOI: https://doi.org/10.1038/s41467-025-57587-y
PMID: https://pubmed.ncbi.nlm.nih.gov/40064883
Publication Date: 2025-03-10
Author(s): Zhaochang Yang et al.
Primary Topic: AI in cancer detection

Overview

The paper presents BEPH (BEiT-based model Pre-training on Histopathological images), a foundation model for computational pathology that uses self-supervised learning to analyze whole slide images (WSIs) for cancer diagnosis. BEPH is pre-trained on 11 million unlabeled histopathological images, which lets it learn representations that can be adapted to several downstream tasks: patch-level cancer diagnosis, WSI-level classification, and survival prediction across multiple cancer subtypes. By pre-training with masked image modeling (MIM), BEPH aims to improve performance while reducing reliance on expert annotations, which in turn supports the integration of artificial intelligence into clinical settings.
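Masked image modeling hides a subset of image patches and trains the model to predict their content from the visible ones. The masking step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 14x14 patch grid and the 40% mask ratio are assumptions that do not come from the summary, and uniform random sampling is used for simplicity, whereas BEiT itself uses blockwise masking.

```python
import random


def random_patch_mask(grid_size=14, mask_ratio=0.4, seed=0):
    """Return a grid_size x grid_size grid of booleans; True marks a masked patch.

    grid_size and mask_ratio are illustrative assumptions, not values
    reported in the summary.
    """
    rng = random.Random(seed)
    num_patches = grid_size * grid_size
    num_masked = int(num_patches * mask_ratio)
    # Sample patch indices without replacement so exactly num_masked are hidden.
    masked = set(rng.sample(range(num_patches), num_masked))
    return [
        [(row * grid_size + col) in masked for col in range(grid_size)]
        for row in range(grid_size)
    ]


mask = random_patch_mask()
num_masked = sum(cell for row in mask for cell in row)
print(f"{num_masked} of {14 * 14} patches masked")
```

During pre-training, only the patches marked False would be shown to the encoder, and the loss would be computed on the patches marked True.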

The authors highlight the challenges of traditional manual pathological diagnosis, which is labor-intensive and error-prone because it requires meticulous examination of morphological features. Deep learning has shown potential to improve diagnostic accuracy and efficiency, but existing methods often rely on supervised pre-training with limited data, which may not capture the distinctive characteristics of pathological images. BEPH instead leverages large-scale unlabeled data, addressing this limitation and paving the way for more robust applications of AI in pathology.

Methods

In the Methods section, the authors describe how BEPH is compared against competing methods on downstream tasks, focusing on patch-level classification and weakly supervised classification of whole slide images (WSIs). The comparison includes convolutional neural networks (CNNs) such as Deep, SW, GLPB, RPDB, shallow-CNN, AlexNet, ResNet, VGG19, and EfficientNet-B0, which differ in architecture, data preprocessing, and sampling technique. The authors also evaluate the self-supervised models MPCS-RP and DARC-ConvNet, which use dedicated loss functions for label-free pre-training: MPCS-RP uses paired sampling to create contrastive views, and DARC-ConvNet uses iterative clustering to optimize pseudo-labels.

For the WSI classification and survival prediction tasks, the authors assess several pre-trained and weakly supervised models. These include HIPT, which uses a hierarchical image pyramid transformer, and several configurations of the CLAM framework that differ in the feature extractor: CLAM(ResNet), CLAM(DINO), CLAM(CTransPath), CLAM(UNI), CLAM(GigaPath), and CLAM(CHIEF). These models rely on different pre-training strategies and architectures for feature extraction and classification. Other methods include GCN-MIL, which combines multiple-instance learning with graph convolutional networks for feature aggregation, and DS-MIL, which uses self-supervised contrastive learning to improve instance-level and bag-level classification. DeepAttnMISL clusters patches from WSIs and uses a Siamese multi-instance fully convolutional network to capture patient heterogeneity.
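Weakly supervised frameworks such as CLAM treat a slide as a bag of patch embeddings and aggregate them with attention weights into one slide-level feature. A minimal sketch of that pooling step is shown below. The function name, the 2-D embeddings, and the raw scores are hypothetical; in a real model the scores would come from a learned attention network rather than being supplied by hand.

```python
import math


def attention_pool(patch_features, attention_scores):
    """Softmax the raw scores and return (weights, attention-weighted mean feature)."""
    # Subtract the maximum score for numerical stability before exponentiating.
    max_score = max(attention_scores)
    exps = [math.exp(score - max_score) for score in attention_scores]
    total = sum(exps)
    weights = [value / total for value in exps]

    dim = len(patch_features[0])
    slide_feature = [
        sum(w * feat[d] for w, feat in zip(weights, patch_features))
        for d in range(dim)
    ]
    return weights, slide_feature


# Three hypothetical patch embeddings (2-D for readability) and raw scores.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [2.0, 0.0, 0.0]
weights, slide_feature = attention_pool(features, scores)
print(weights, slide_feature)
```

The same weights can be mapped back onto patch coordinates to produce an attention heatmap of the slide.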

Results

The Results section evaluates BEPH on three families of downstream tasks: patch-level classification on BreakHis and LC25000; weakly supervised WSI-level subtype classification for renal cell carcinoma, non-small cell lung cancer, and breast cancer; and survival prediction across six cancer types. It also examines label efficiency by reducing the training set to as little as 25% of its original size, and compares attention scores qualitatively against pathologist annotations. The quantitative figures for these experiments are summarized in the Discussion.

Discussion

In this section, the authors discuss the performance of BEPH on patch-level and whole-slide image (WSI) classification tasks for cancer detection. On the BreakHis dataset, BEPH achieved an average accuracy of 94.05% for patient-level classification and 93.65% for image-level classification, outperforming several state-of-the-art convolutional neural networks (CNNs) and weakly supervised models by 5-10%. When fine-tuned on the LC25000 dataset for lung cancer subtypes, BEPH reached 99.99% accuracy, which the authors take as evidence of robustness across cancer types and magnification levels. Performance was further validated on WSI-level classification, where BEPH distinguished subtypes of renal cell carcinoma, non-small cell lung cancer, and breast cancer with area under the receiver operating characteristic curve (AUC) values of 0.994, 0.970, and 0.946, respectively.
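Patient-level and image-level accuracy can differ because patients contribute different numbers of images. The sketch below assumes the convention commonly used with BreakHis, where a per-patient score (correct images divided by total images for that patient) is averaged over patients; the summary does not state the exact definition used in the paper, and the records are made-up examples.

```python
from collections import defaultdict


def image_level_accuracy(records):
    """Fraction of all images that are classified correctly."""
    correct = sum(1 for _, y_true, y_pred in records if y_true == y_pred)
    return correct / len(records)


def patient_level_accuracy(records):
    """Mean over patients of each patient's own fraction of correct images."""
    per_patient = defaultdict(lambda: [0, 0])  # patient -> [correct, total]
    for patient_id, y_true, y_pred in records:
        per_patient[patient_id][0] += int(y_true == y_pred)
        per_patient[patient_id][1] += 1
    scores = [correct / total for correct, total in per_patient.values()]
    return sum(scores) / len(scores)


# Hypothetical (patient_id, true_label, predicted_label) records.
records = [
    ("p1", "malignant", "malignant"),
    ("p1", "malignant", "malignant"),
    ("p1", "malignant", "benign"),
    ("p1", "malignant", "malignant"),
    ("p2", "benign", "benign"),
    ("p2", "benign", "malignant"),
]
image_acc = image_level_accuracy(records)
patient_acc = patient_level_accuracy(records)
print(f"image-level: {image_acc:.4f}, patient-level: {patient_acc:.4f}")
```

In this toy example the image-level figure weights patient p1 more heavily because p1 has more images, while the patient-level figure gives both patients equal weight.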

The authors also highlight BEPH's label efficiency: even when the training set was reduced to as little as 25% of its original size, the model outperformed the other weakly supervised models. This suggests that masked image modeling (MIM) pre-training mitigates the challenges of data scarcity. In survival prediction across six cancer types, BEPH consistently ranked first, with C-index values indicating significant improvements over other models. Qualitative analysis of attention scores showed that BEPH accurately identifies tumor regions, aligning closely with pathologist annotations and supporting its practical applicability in clinical settings. Overall, the findings underscore BEPH's strong generalization, stability, and potential for real-world cancer research applications.
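The C-index used for survival prediction measures how often a model ranks two patients in the correct order of risk. A minimal sketch of Harrell's concordance index follows, assuming right-censored data with an event flag of 1 for an observed event and 0 for censoring; the times, flags, and risk scores are made-up illustrations, not data from the paper.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: share of comparable pairs ranked in the right order.

    A pair (i, j) is comparable when patient i has an observed event
    (events[i] == 1) and a shorter time than patient j. The pair is
    concordant when patient i also has the higher predicted risk;
    tied risks count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up times, event flags, and predicted risk scores.
times = [5, 10, 12, 20]
events = [1, 0, 1, 1]
risks = [0.9, 0.4, 0.1, 0.6]
c_index = concordance_index(times, events, risks)
print(f"C-index: {c_index:.2f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ordering; here three of the four comparable pairs are ordered correctly.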