A multimodal whole-slide foundation model for pathology

Journal: Nature Medicine, Volume: 31, Issue: 11
DOI: https://doi.org/10.1038/s41591-025-03982-3
PMID: https://pubmed.ncbi.nlm.nih.gov/41193692
Publication Date: 2025-11-01
Author(s): Tong Ding et al.
Primary Topic: AI in cancer detection

Overview

This section describes a case of metastatic squamous cell carcinoma in which the tumor measures 0.8 cm in its greatest dimension. The image data are presented at a resolution of 512 px by 512 px, which indicates the level of detail available to the analysis. The case underscores the importance of precise tumor-size measurement, which is central to cancer staging and to treatment decisions.

Methods

The “Methods” section outlines the experimental and analytical procedures used in the study. It covers the selection criteria for participants, the experimental design, and the statistical techniques used for data analysis. The methodology includes specific data-collection protocols intended to support the reliability and validity of the results.

The section also describes the tools and technologies used, such as statistical analysis software and any equipment relevant to the experimental tasks. The methods are designed to address the research questions and to allow the study’s hypotheses to be examined thoroughly. Overall, the rigor of the methods strengthens the credibility of the findings presented in later sections.

Results

The “Results” section presents the key findings of the study and the outcomes of the experiments and analyses conducted. It reports the quantitative and qualitative data obtained and uses them to illustrate the effectiveness of the proposed methods. Statistical significance is assessed, with metrics such as p-values or confidence intervals provided to support the conclusions.

The section may also include visual representations of the data, such as graphs or tables, that make the observed trends and patterns easier to follow. The findings are discussed in relation to the original hypotheses, indicating whether the results support or refute them. Overall, this section substantiates the research claims with empirical evidence and lays the groundwork for the discussion and interpretation that follow.

Discussion

The paper presents the development and evaluation of TITAN, a multimodal whole-slide vision-language model designed for histopathology. TITAN uses a Vision Transformer (ViT) architecture and is pretrained on a diverse dataset, Mass-340K, which includes 335,645 whole-slide images (WSIs) and 182,862 medical reports across 20 organ types. Pretraining proceeds in three stages: vision-only training on region-of-interest (ROI) crops, cross-modal alignment with synthetic captions, and cross-modal alignment with clinical reports. This approach allows TITAN to generate robust slide representations that capture both histomorphological semantics and contextual information from pathology reports.
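To make "cross-modal alignment" more concrete, the sketch below shows a generic CLIP-style symmetric contrastive loss between slide and text embeddings. This summary does not state which objective TITAN actually uses, so the loss form, the function names, and the temperature value of 0.07 are all assumptions chosen for illustration only.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cross_entropy_diag(logits):
    """Mean cross-entropy where the correct column for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(slide_embs, text_embs, temperature=0.07):
    """Hypothetical alignment loss: matched slide/text pairs share an index.

    The temperature is an assumed value, not one taken from the paper.
    """
    s = [l2_normalize(v) for v in slide_embs]
    t = [l2_normalize(v) for v in text_embs]
    # Similarity of every slide to every text, scaled by the temperature.
    logits = [
        [sum(a * b for a, b in zip(si, tj)) / temperature for tj in t]
        for si in s
    ]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-slide direction
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits_t))


# Toy embeddings: matched pairs point the same way, swapped pairs do not.
slides = [[1.0, 0.0], [0.0, 1.0]]
aligned = symmetric_contrastive_loss(slides, [[1.0, 0.0], [0.0, 1.0]])
swapped = symmetric_contrastive_loss(slides, [[0.0, 1.0], [1.0, 0.0]])
```

In this toy case the loss is close to zero when each slide embedding matches its paired text embedding and large when the pairs are swapped, which is the behavior an alignment objective is meant to reward.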

In evaluation, TITAN outperforms existing slide encoders such as PRISM and GigaPath on a range of diagnostic tasks, including morphological subtyping, molecular classification, and survival prediction. Notably, it achieves significant accuracy improvements across multiple cohorts and tasks, showing that it generalizes well from its extensive pretraining data. The model’s design addresses the challenges of long input sequences and spatial context by using attention with linear biases (ALiBi) for long-context extrapolation. Overall, TITAN represents a significant advance in pathology AI and offers a scalable and efficient solution for clinical applications.
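As a rough illustration of ALiBi, the sketch below adds a distance-proportional penalty to attention logits instead of using learned positional embeddings. The per-head slopes follow the geometric sequence from the original ALiBi formulation. Using Euclidean distance on a 2D patch grid is an assumption made here to suit slide-like inputs, and the function names are hypothetical rather than taken from the TITAN code.

```python
import math


def alibi_slopes(num_heads):
    """Per-head slopes for a power-of-two head count: 2^(-8/n), 2^(-16/n), ..."""
    if num_heads <= 0 or num_heads & (num_heads - 1):
        raise ValueError("this sketch assumes a power-of-two number of heads")
    start = 2.0 ** (-8.0 / num_heads)
    return [start ** (i + 1) for i in range(num_heads)]


def alibi_bias_2d(coords, slope):
    """Bias added to attention logits before softmax: -slope * distance.

    Euclidean distance between grid positions is an assumption of this sketch.
    """
    return [[-slope * math.dist(q, k) for k in coords] for q in coords]


def softmax(row):
    """Numerically stable softmax over one row of logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


# A 2 x 3 grid of patch positions (row, col), flattened row by row.
grid = [(r, c) for r in range(2) for c in range(3)]
slopes = alibi_slopes(8)
bias = alibi_bias_2d(grid, slopes[0])

# With all-zero content logits, attention depends only on the bias,
# so nearer patches receive more weight than farther ones.
weights = softmax(bias[0])
```

Because the bias is a fixed function of distance and not a learned table of positions, the same rule applies unchanged to grids larger than those seen in training, which is the intuition behind long-context extrapolation.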

Keywords: Foundation (evidence), Neoplasms, Clinical imaging, Deep learning, Clinical sciences, Humans, Machine learning, Titan (rocket family), Pathology, Digital pathology, Generative grammar, Field (mathematics), Computer-assisted image processing, Feature (linguistics)