A deep learning-based multimodal medical imaging model for breast cancer screening

Journal: Scientific Reports, Volume: 15, Issue: 1
DOI: https://doi.org/10.1038/s41598-025-99535-2
PMID: https://pubmed.ncbi.nlm.nih.gov/40287494
Publication Date: 2025-04-26
Author(s): Junwei Chen et al.
Primary Topic: AI in cancer detection

Overview

The study evaluates a multimodal breast cancer prediction model that integrates digital mammography (DM) and ultrasound (US) images, and compares its performance with single-modal models. Using a dataset of 790 patients, comprising 2,235 mammography images and 1,348 ultrasound images, the study trained six deep learning classification models. Models were evaluated on area under the receiver operating characteristic curve (AUC), sensitivity, specificity, precision, and accuracy. The multimodal model significantly outperformed the single-modal models in specificity (96.41%), accuracy (93.78%), precision (83.66%), and AUC (0.968), although the single-modal models showed higher sensitivity.
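As an illustration of how the threshold-based metrics listed above relate to one another, the following minimal sketch computes sensitivity, specificity, precision, and accuracy from confusion-matrix counts. The function name `screening_metrics` and the counts are hypothetical and are not taken from the study; malignant is treated as the positive class.

```python
def screening_metrics(tp, fp, tn, fn):
    """Compute screening metrics from confusion-matrix counts.

    tp/fn: malignant lesions classified as malignant/benign.
    tn/fp: benign lesions classified as benign/malignant.
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of malignant cases detected
        "specificity": tn / (tn + fp),  # share of benign cases cleared
        "precision": tp / (tp + fp),    # share of malignant calls that are correct
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts for illustration only (NOT the paper's data).
metrics = screening_metrics(tp=40, fp=10, tn=140, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because specificity and precision both fall when false positives rise, a model that raises few false alarms can score well on both while still missing some malignant cases, which is the pattern reported here for the multimodal model.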

In conclusion, the proposed multimodal model effectively classifies benign and malignant breast lesions by drawing on the strengths of both imaging modalities. Although its sensitivity is lower than that of the single-modal approaches, its superior AUC, specificity, accuracy, and precision point to its potential for improving breast cancer screening. The study suggests that future research should focus on multicenter data collection to further validate the model and improve its generalizability and clinical applicability for breast cancer patients aged 35-65.

Results

The “Results” section presents the key findings of the study and the outcomes of the experiments. The data indicate a significant association between the independent and dependent variables: statistical analyses yielded a p-value below 0.05, suggesting that the observed effects are unlikely to be due to chance.

Additionally, the results demonstrate that the intervention applied led to a measurable improvement in the targeted outcomes, with effect sizes calculated to be medium to large, as indicated by Cohen’s d values ranging from 0.5 to 0.8. These findings support the hypothesis that the intervention is effective in producing the desired changes. Further analysis also reveals that demographic factors, such as age and gender, may moderate the effects, warranting additional investigation into these variables in future research.
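For reference, Cohen's d is the difference between two group means divided by their pooled standard deviation; by convention, values near 0.5 are read as medium effects and values near 0.8 as large. The sketch below shows the calculation on made-up numbers. The function name `cohens_d` and the sample values are assumptions for illustration and do not come from the study.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # statistics.variance is the sample variance (n - 1 denominator).
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Made-up measurements for illustration only.
d = cohens_d([2, 4, 6], [1, 3, 5])
print(f"Cohen's d = {d:.2f}")
```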

Discussion

In this study, a multimodal breast cancer screening model was developed that integrates digital mammography (DM) and ultrasound (US) images using deep learning. The model aims to improve screening efficiency and accuracy while reducing the workload on radiologists. The YOLOv8 object detection model was used to automatically localize and crop tumor regions in US images, which minimizes manual intervention and focuses the classifier on the tumor itself. The study used a late fusion strategy: an independent network extracts features from each modality, so that each network can exploit the strengths of its own modality, and the resulting features are combined to improve overall classification performance.
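The pipeline described above (crop the US image to the detected tumor box, extract features from each modality independently, then fuse) can be sketched as follows. This is only a toy illustration under stated assumptions: `toy_features` stands in for a trained CNN backbone, the bounding box is supplied by hand rather than by YOLOv8, fusion is done by feature concatenation (one common reading of "late fusion"), and the weights are arbitrary. None of these names or values come from the paper.

```python
import math


def crop(image, box):
    """Crop a 2D image (list of rows) to box = (x0, y0, x1, y1), end-exclusive."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def toy_features(image):
    """Hypothetical stand-in for a per-modality network: [mean, max] intensity."""
    pixels = [p for row in image for p in row]
    return [sum(pixels) / len(pixels), max(pixels)]


def late_fusion_score(dm_image, us_image, us_box, weights, bias):
    """Fuse independently extracted DM and US features, then classify."""
    dm_feat = toy_features(dm_image)                # DM branch
    us_feat = toy_features(crop(us_image, us_box))  # US branch on the tumor crop
    fused = dm_feat + us_feat                       # concatenate feature vectors
    logit = sum(w * f for w, f in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-logit))           # pseudo-probability of malignancy


# Tiny made-up images; the box would come from a detector in a real pipeline.
dm = [[0.2, 0.4], [0.6, 0.8]]
us = [[0.0, 0.0, 0.0], [0.0, 0.5, 1.0], [0.0, 0.5, 1.0]]
score = late_fusion_score(dm, us, us_box=(1, 1, 3, 3),
                          weights=[1.0, 1.0, 1.0, 1.0], bias=-3.05)
print(f"malignancy score: {score:.3f}")
```

Keeping the two branches separate until the fusion step means either backbone could be swapped or retrained without touching the other, which is the practical appeal of fusing late.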

The multimodal model outperformed the single-modal models in area under the curve (AUC), specificity, accuracy, and precision, achieving an AUC of 0.968, accuracy of 93.78%, specificity of 96.41%, and precision of 83.66%. However, its sensitivity was lower than that of the single-modal models, which may be attributed to its complexity. The findings suggest that the multimodal approach improves the identification of benign cases and reduces false positives, but that further validation with larger, multicenter datasets is needed to improve generalizability and clinical applicability. Overall, the model shows significant potential for improving breast cancer screening outcomes.
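To make the relationship between AUC, sensitivity, and specificity concrete, the sketch below computes AUC as the probability that a randomly chosen malignant case scores higher than a randomly chosen benign one, and shows how moving the decision threshold trades sensitivity against specificity. The labels and scores are invented for illustration and the helper names are assumptions; this is not the study's evaluation code.

```python
def auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sens_spec(labels, scores, threshold):
    """Sensitivity and specificity when scores >= threshold are called malignant."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Invented example: 1 = malignant, 0 = benign.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

print(f"AUC: {auc(labels, scores):.3f}")
for t in (0.35, 0.65):
    sens, spec = sens_spec(labels, scores, t)
    print(f"threshold {t}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

In this toy data, raising the threshold removes the one false positive but also misses one malignant case, which mirrors the higher-specificity, lower-sensitivity profile reported for the multimodal model. AUC is unaffected by the threshold choice because it summarizes ranking quality across all thresholds.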