Explainable AI for lung cancer detection via a custom CNN on CT images

Journal: Scientific Reports, Volume: 15, Issue: 1
DOI: https://doi.org/10.1038/s41598-025-97645-5
PMID: https://pubmed.ncbi.nlm.nih.gov/40223153
Publication Date: 2025-04-13
Author(s): Mohamed Hammad et al.
Primary Topic: Radiomics and Machine Learning in Medical Imaging

Overview

The study addresses early detection of lung cancer, a leading cause of cancer-related mortality that accounts for approximately 1.8 million deaths annually. Traditional analysis of CT images is time-consuming, error-prone, and dependent on subjective assessment. To improve diagnostic accuracy and interpretability, the authors propose a custom convolutional neural network (CNN) integrated with explainable AI (XAI) techniques, specifically gradient-weighted class activation mapping (Grad-CAM). The approach achieves a classification accuracy of 93.06% across lung cancer subtypes (squamous cell carcinoma, large cell carcinoma, and adenocarcinoma) and provides interpretable visualizations that support clinical decision-making.

The study highlights the importance of early detection in improving patient outcomes and reducing mortality rates. By addressing the limitations of existing diagnostic systems, the proposed model combines high accuracy with transparency, thereby fostering trust among clinicians. Grad-CAM visualizes the regions of a CT image that most influence each prediction, which makes the model’s output easier to verify in clinical use. Overall, the research demonstrates the potential of deep learning combined with explainability techniques to improve lung cancer detection and, through earlier diagnosis, survival rates.

Methods

The Methods section reviews binary and multiclass classification techniques for lung cancer detection based on deep learning (DL) and convolutional neural networks (CNNs). In binary classification, a deep CNN with saliency maps reported an accuracy of 99.89% on the LIDC-IDRI dataset. Other notable approaches include a multimodal fusion deep neural network (MFDNN) that integrates genetic and clinical data and reaches 92.5% accuracy, and a fuzzy particle swarm optimization-based CNN that improves feature selection and reduces computational complexity. However, dataset imbalance and the lack of interpretability in these models hinder their clinical adoption.

Multiclass methods aim to identify specific lung cancer subtypes, which is crucial for personalized treatment. For instance, a modified DenseNet201 model achieved an average accuracy of 95% on the Kaggle chest CT-scan dataset, while a DCNN-GRU model reached accuracies of 99.30% and 98.97% on COVID-19 and lung cancer datasets, respectively. EfficientNetB3 was highlighted for its superior performance, with 97.78% accuracy and excellent precision and recall. Despite these advances, multiclass models face high intraclass variability and are harder to interpret, both of which complicate their clinical application. The section concludes with a comprehensive framework for automated lung cancer detection, detailing the data processing pipeline from acquisition to model evaluation.

Results

This section presents the results of the proposed lung cancer detection model, covering its performance metrics and areas for improvement. The model was developed in MATLAB R2019b on a machine with a 2.40 GHz Intel Core i5 CPU, 16 GB of RAM, and an NVIDIA GeForce GTX 1650 GPU. It was trained for 100 epochs with a mini-batch size of 16 and an initial learning rate of $1 \times 10^{-4}$. The model achieved an overall accuracy of 93.06%, with perfect classification for adenocarcinoma and normal cases (100% sensitivity and specificity). However, it misclassified some large cell carcinoma and squamous cell carcinoma cases, with sensitivities of 85.7% and 86.7%, respectively. Analysis of the misclassifications using Grad-CAM visualizations indicated that the model struggled to differentiate between these two cancer types because of overlapping morphological features and variability in CT image quality.
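The per-class sensitivity and specificity figures above are one-vs-rest quantities derived from a confusion matrix. The following is a minimal Python sketch of that calculation. The class names, the `CONFUSION` counts, and the function names are illustrative assumptions, not the paper's data or code (the authors worked in MATLAB).

```python
# Illustrative only: these counts are made up and are NOT the paper's confusion matrix.
CLASSES = ["adenocarcinoma", "large_cell", "normal", "squamous"]

# Rows are true classes, columns are predicted classes (hypothetical counts).
CONFUSION = [
    [20, 0, 0, 0],
    [0, 8, 0, 2],
    [0, 0, 10, 0],
    [0, 1, 0, 9],
]


def per_class_rates(matrix):
    """Return {class_index: (sensitivity, specificity)} from one-vs-rest counts."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    rates = {}
    for k in range(n):
        tp = matrix[k][k]                               # class k predicted as k
        fn = sum(matrix[k]) - tp                        # class k predicted as something else
        fp = sum(matrix[r][k] for r in range(n)) - tp   # other classes predicted as k
        tn = total - tp - fn - fp
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        rates[k] = (sensitivity, specificity)
    return rates


def overall_accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / total


if __name__ == "__main__":
    for k, (sens, spec) in per_class_rates(CONFUSION).items():
        print(f"{CLASSES[k]}: sensitivity={sens:.3f}, specificity={spec:.3f}")
    print(f"accuracy={overall_accuracy(CONFUSION):.4f}")
```

In this toy matrix every missed large cell case is predicted as squamous and vice versa, so each miss lowers the sensitivity of one class and the specificity of the other.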

The performance metrics, summarized in Table 3, show strong precision and recall across the classes, particularly for adenocarcinoma and normal cases. The model’s precision for large cell and squamous cell carcinoma was also perfect, but their recall rates indicate some missed cases. The ROC curves showed high AUC values (0.96 for adenocarcinoma and large cell carcinoma, and 0.97 for squamous cell carcinoma), with a perfect AUC of 1.00 for normal cases, underscoring the model’s effectiveness in distinguishing cancerous from non-cancerous cases. The explainability results, particularly through Grad-CAM, provided insight into the model’s decision-making process by revealing the regions of CT images that influenced predictions. They also point to areas for refinement, such as improving feature differentiation and expanding the training dataset. The authors suggest future work to optimize the balance between sensitivity and specificity and to validate the model across diverse clinical settings.
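Per-class AUC values like those above are typically computed one-vs-rest: the AUC for a class is the probability that a randomly chosen sample of that class receives a higher score for it than a randomly chosen sample of any other class. The sketch below implements that pairwise definition in plain Python. The function name and the example scores are hypothetical and are not taken from the paper.

```python
def auc_one_vs_rest(scores, labels, positive):
    """AUC for `positive` against all other classes via pairwise comparison.

    scores: predicted probability of the positive class for each sample.
    labels: true class name for each sample.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical scores for the "normal" class on five samples.
    labels = ["normal", "normal", "squamous", "adenocarcinoma", "large_cell"]
    scores = [0.95, 0.80, 0.10, 0.05, 0.85]
    print(f"AUC(normal vs rest) = {auc_one_vs_rest(scores, labels, 'normal'):.3f}")
```

The pairwise loop is quadratic in the number of samples, which is acceptable for small test sets. A rank-based formulation gives the same value more efficiently on large ones.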

Discussion

The Discussion highlights the effectiveness of the custom convolutional neural network (CNN) for lung cancer detection, emphasizing its strong performance metrics and its integration of explainability techniques. The model achieved an overall accuracy of 93.06%, with perfect classification for adenocarcinoma and normal cases, although it had difficulty distinguishing between large cell carcinoma and squamous cell carcinoma. Training and validation accuracy both improved substantially over the course of training, and the loss curves indicate effective minimization of prediction error. The model’s high precision (95.53%) and recall (93.09%) further support its reliability in identifying true positives, while the area under the receiver operating characteristic curve (AUC-ROC) values indicate robust discrimination across the different cancer types.

Grad-CAM also makes the model’s predictions more interpretable by letting clinicians visualize the areas of a CT scan that influence each prediction. This transparency is crucial for clinical acceptance because it helps confirm that the model focuses on relevant anatomical structures. The paper also discusses the potential for real-world application. While the model shows promise for early detection and for reducing diagnostic errors, misclassification between certain cancer types and variability in imaging conditions still need to be addressed. Overall, the findings underscore the model’s potential utility in clinical settings, particularly in supporting radiologists and improving patient outcomes through more accurate diagnosis.
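For readers unfamiliar with how Grad-CAM produces these heatmaps, the core computation is short: average the gradients of the target class score over each feature map to get one weight per channel, take the weighted sum of the feature maps, and apply a ReLU. The plain-Python sketch below shows only that step. It assumes the activations and gradients of a convolutional layer have already been extracted by a deep learning framework; the function name and the toy inputs are hypothetical and do not reproduce the authors' MATLAB implementation.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap for one convolutional layer.

    activations: K feature maps, each an H x W list of lists.
    gradients:   gradient of the target class score with respect to each
                 feature map, same shape as `activations`.
    Returns an H x W heatmap scaled to the range [0, 1].
    """
    height, width = len(activations[0]), len(activations[0][0])
    # Channel weights: global average of the gradients over spatial positions.
    weights = [sum(sum(row) for row in g) / (height * width) for g in gradients]
    # Weighted sum of the feature maps, followed by ReLU.
    cam = [
        [
            max(0.0, sum(w * a[i][j] for w, a in zip(weights, activations)))
            for j in range(width)
        ]
        for i in range(height)
    ]
    # Normalize so the strongest location has value 1 (skip if the map is all zero).
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


if __name__ == "__main__":
    # Toy example: two 2x2 feature maps with opposite-sign gradients.
    acts = [[[1, 0], [0, 0]], [[0, 0], [0, 2]]]
    grads = [[[1, 1], [1, 1]], [[-1, -1], [-1, -1]]]
    print(grad_cam(acts, grads))
```

In practice the resulting low-resolution map is upsampled to the input image size and overlaid on the CT slice. The ReLU keeps only the regions that push the score of the target class up.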

Limitations

The proposed method for lung cancer detection presents significant advancements over existing classification approaches, particularly through the development of a custom convolutional neural network (CNN) architecture and the integration of explainable artificial intelligence (XAI) techniques like Grad-CAM. These innovations aim to enhance classification accuracy and model generalizability while addressing data imbalance through advanced augmentation techniques. However, the method is not without limitations. Key challenges include the reliance on the quality and diversity of training datasets, which may hinder the model’s ability to generalize across varied patient populations and imaging conditions. Additionally, while Grad-CAM provides valuable insights into model predictions, it does not fully elucidate the complexities of the decision-making process.

Future work should focus on expanding the dataset to encompass a broader range of lung cancer presentations and imaging variations, thereby improving generalizability. Moreover, enhancing interpretability through more advanced XAI techniques is essential for a deeper understanding of model behavior. The computational demands of the custom CNN architecture also pose a barrier to practical deployment in resource-limited settings. Strategies such as model pruning, quantization, and the development of lightweight versions of the model could enhance usability. Finally, to facilitate successful clinical implementation, adaptation strategies must be employed to account for variations in imaging protocols and patient demographics, ensuring reliable performance across diverse healthcare environments.
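To make the pruning and quantization suggestions concrete, here is a minimal sketch of two generic techniques: magnitude pruning and symmetric 8-bit quantization, applied to a flat list of weights. These are textbook versions chosen for illustration. The paper does not specify which compression methods would be used, and all function names and values below are assumptions.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_int8(weights):
    """Symmetric uniform quantization to integers in [-127, 127].

    Returns (codes, scale) such that code * scale approximates each weight.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point weights."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    # Hypothetical weights, not taken from any trained model.
    weights = [0.5, -0.05, 0.2, -1.0, 0.01, 0.3]
    pruned = magnitude_prune(weights, sparsity=0.5)
    codes, scale = quantize_int8(pruned)
    print(pruned)
    print(codes, scale)
    print(dequantize(codes, scale))
```

Pruning reduces the number of nonzero parameters, and quantization reduces the storage per parameter. Both usually cost some accuracy, so a compressed model would need to be re-evaluated on the same test set.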