Automating cancer diagnosis using advanced deep learning techniques for multi-cancer image classification

Journal: Scientific Reports, Volume: 14, Issue: 1
DOI: https://doi.org/10.1038/s41598-024-75876-2
PMID: https://pubmed.ncbi.nlm.nih.gov/39443621
Publication Date: 2024-10-23
Author(s): Yogesh Kumar et al.
Primary Topic: Brain Tumor Detection and Classification

Overview

The paper addresses cancer detection, a critical challenge given that cancer remains a leading cause of death worldwide. Traditional diagnostic methods often involve invasive procedures and lengthy analyses, which motivates faster and more accurate alternatives. The study automates detection with deep learning, evaluating several Convolutional Neural Network (CNN) architectures, including DenseNet121 and DenseNet201, on image datasets covering seven cancer types: brain, oral, breast, kidney, Acute Lymphocytic Leukemia, lung and colon (a single combined dataset), and cervical cancer. The methodology includes image segmentation and contour feature extraction. DenseNet121 achieved the highest validation accuracy, 99.94%, with a loss of 0.0017 and low Root Mean Square Error (RMSE) values.

The findings underscore the potential of deep learning in enhancing cancer detection and classification using histopathological images. While DenseNet121, ResNet152V2, InceptionV3, and MobileNetV2 demonstrated strong performance, the study’s limitation lies in its focus on only seven cancer types, which may restrict the model’s generalizability. Additionally, the cost-effectiveness of the model was not evaluated, posing a challenge for clinical adoption. Future research aims to broaden the dataset to include more cancer types and images, explore real-world deployment, and incorporate advanced models like Vision Transformers (ViT) to refine the approach. Developing resource-efficient models is crucial for making AI-driven cancer detection practical and accessible in medical settings.

Introduction

The introduction surveys the artificial intelligence (AI) techniques used to diagnose different cancer types. The literature review highlights significant advances in classifying cancers such as Acute Myelogenous Leukemia (AML), brain tumors, breast cancer, cervical cancer, and kidney tumors with deep learning (DL) and machine learning (ML) methods. For instance, Anilkumar et al. (2022) used convolutional neural networks (CNNs) to classify leukemic cells, while Saeedi et al. (2023) improved early tumor detection in brain imaging with a 2D CNN architecture. Mahmud et al. (2023) and Abunasser et al. (2023) focused on brain tumor identification and breast cancer staging, respectively, emphasizing the importance of early detection.

The section also discusses various innovative approaches, such as the hybridization of deep neural networks with traditional ML techniques, as seen in the work of Mohsen et al. (2018) for brain MRI classification, and the integration of class balancing techniques by Glučina et al. (2023) to address data imbalances. Furthermore, advancements in model architectures, such as EfficientNetv2 and the Multi-Axis Vision Transformer (MaxViT), have demonstrated significant improvements in classification accuracy for complex datasets, achieving remarkable test accuracies of 99.76% and 99.48%, respectively. The introduction concludes by underscoring the need for expanded datasets and further research to enhance diagnostic accuracy and patient outcomes across various cancer types.

Methods

The methodology section outlines the backend development process for creating a model aimed at accurately identifying and classifying various forms of cancer using transfer learning architectures and histopathological images. The model’s design emphasizes achieving reliable classification results, as illustrated in Figure 1.
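The summary does not include the authors' training code. As a minimal sketch of what a transfer learning setup of this kind can look like, the example below assumes TensorFlow/Keras, an ImageNet-pretrained DenseNet121 backbone, and a 224x224 RGB input. The function name, dropout rate, and learning rate are hypothetical choices made for illustration; none of them are stated in the source.

```python
def build_transfer_model(num_classes, input_shape=(224, 224, 3), weights="imagenet"):
    """Pretrained DenseNet121 backbone with a new softmax classification head."""
    # Imported inside the function so this file can be loaded
    # even on a machine where TensorFlow is not installed.
    import tensorflow as tf

    backbone = tf.keras.applications.DenseNet121(
        include_top=False,       # drop the original 1000-class ImageNet head
        weights=weights,         # "imagenet" or None for random initialization
        input_shape=input_shape,
        pooling="avg",           # global average pooling: one feature vector per image
    )
    backbone.trainable = False   # freeze pretrained features for the first training stage

    model = tf.keras.Sequential([
        backbone,
        tf.keras.layers.Dropout(0.3),                              # assumed rate
        tf.keras.layers.Dense(num_classes, activation="softmax"),  # new task-specific head
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),    # assumed learning rate
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Calling `build_transfer_model(num_classes=5)` would, under these assumptions, give a model suited to a five-class dataset such as the lung and colon images. Swapping `DenseNet121` for another class in `tf.keras.applications` (for example `MobileNetV2`) changes only the backbone line.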

To optimize performance, the study specifies system requirements, including an Intel i7 processor to facilitate smooth operation during intensive computational tasks. A minimum of 32GB RAM is necessary to manage the large datasets and high demands of deep learning and image processing. Additionally, a GPU with at least 4GB of memory is essential for accelerating model training and inference, particularly when utilizing complex neural networks. The system is designed to be compatible with Windows versions 8, 10, or 11, ensuring stable and efficient operation for AI and deep learning applications.

Results

The results section evaluates several convolutional neural network (CNN) models on the multi-cancer dataset using multiple performance metrics, on both the training and validation sets. DenseNet121 and NASNetMobile achieved the highest training accuracy, 99.95% each, with training losses of 0.019 and 0.0059, respectively. Other models, such as DenseNet201 and Inception V3, also reached accuracies above 99.5%. On the validation set, DenseNet121 outperformed the others with an accuracy of 99.94% and a loss of 0.0017, while VGG19 and Inception V3 reached slightly lower validation accuracies of 99.46% and 99.51%, respectively.
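The summary does not say how accuracy and loss were computed. As a rough illustration, assuming the loss is the categorical cross-entropy commonly used for multi-class classifiers (an assumption, not something the source states), the two metrics can be sketched in plain Python. The function names and toy values are hypothetical.

```python
import math

def accuracy(y_true, y_prob):
    """Fraction of samples whose highest-probability class matches the label."""
    correct = 0
    for label, probs in zip(y_true, y_prob):
        predicted = max(range(len(probs)), key=lambda k: probs[k])
        correct += int(predicted == label)
    return correct / len(y_true)

def cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for label, probs in zip(y_true, y_prob):
        total += -math.log(max(probs[label], eps))  # eps guards against log(0)
    return total / len(y_true)

# Toy example: 3 samples, 3 classes (values are made up for illustration).
labels = [0, 1, 2]
probs = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.30, 0.40, 0.30],  # misclassified: true class 2, predicted class 1
]
print(accuracy(labels, probs))
print(cross_entropy(labels, probs))
```

A loss as small as 0.0017 under this definition would mean the model assigns, on average, a probability very close to 1 to the correct class.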

Prediction error, measured by root mean square error (RMSE), was lowest for DenseNet201, with training and validation RMSE values of 0.036056 and 0.045826. VGG19 and DenseNet121 exhibited higher RMSE values, suggesting less accurate fits. The learning curves were generally well behaved, indicating effective learning and adaptation to the data. Precision, recall, and F1 scores were also evaluated: DenseNet121 achieved high precision across all classes and performed particularly well on breast and lung cancer detection. Training times varied significantly: DenseNet121 was the fastest at approximately 3 hours and 20 minutes, while NASNetLarge was the slowest at over 6 hours and 40 minutes. Overall, the results underscore the effectiveness of these deep learning models for cancer classification and the importance of selecting a model based on specific cancer characteristics.
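How RMSE was obtained for a classifier is not defined in this summary. One plausible reading, assumed here, is the RMSE between one-hot labels and predicted class probabilities. The sketch below shows that definition alongside one-vs-rest precision, recall, and F1 for a single class. All names and toy values are hypothetical.

```python
import math

def rmse(y_true, y_prob, num_classes):
    """RMSE between one-hot labels and predicted probabilities (assumed definition)."""
    squared = 0.0
    for label, probs in zip(y_true, y_prob):
        for k in range(num_classes):
            target = 1.0 if k == label else 0.0
            squared += (probs[k] - target) ** 2
    return math.sqrt(squared / (len(y_true) * num_classes))

def precision_recall_f1(y_true, y_pred, cls):
    """One-vs-rest precision, recall, and F1 for class `cls`."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Toy labels and predictions (made up for illustration).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(precision_recall_f1(y_true, y_pred, cls=1))
print(rmse([0], [[0.5, 0.5]], num_classes=2))
```

Reporting these per class matters for a multi-cancer dataset because a high overall accuracy can hide weak recall on a small class.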

Discussion

The dataset used in this study combines seven distinct cancer databases covering a variety of cancer types and imaging modalities. The Acute Lymphocytic Leukemia dataset includes 3,256 peripheral blood smear images from 89 patients, categorized into three subtypes and two classes (benign and malignant). The brain tumor dataset consists of 3,064 T1-weighted contrast-enhanced images from 233 patients, and the lung and colon cancer dataset contains 25,000 histopathological images across five classes. The remaining datasets include 12,446 DICOM images of kidney tumors, 4,049 cervical cancer images, and 7,909 breast cancer histopathology images. Each dataset is split into training, validation, and testing sets, typically 70% for training and 15% each for validation and testing.
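The 70/15/15 allocation described above can be sketched as a seeded shuffle followed by slicing. This is only an illustration: the function name, the seed, and the file names are hypothetical, and the summary does not say whether the authors split by image or by patient.

```python
import random

def split_dataset(items, train_frac=0.70, val_frac=0.15, seed=42):
    """Shuffle a copy of `items` and split it into train/validation/test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # remainder, so no item is dropped
    return train, val, test

# Hypothetical file names standing in for the 3,064 brain tumor images.
images = [f"brain_{i:04d}.png" for i in range(3064)]
train, val, test = split_dataset(images)
print(len(train), len(val), len(test))
```

When several images come from the same patient, as with the 233 patients in the brain tumor data, an image-level split like this one can place the same patient in both training and test sets. Splitting on patient identifiers instead avoids that leakage.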

Data preprocessing is critical for ensuring the accuracy of multi-cancer detection and classification. The preprocessing pipeline includes converting images to grayscale, applying Otsu Binarization for optimal thresholding, and employing noise removal techniques such as Gaussian filtering. Distance transformation and watershed segmentation are then utilized to delineate cancerous regions effectively. Feature extraction focuses on contour characteristics, with perimeter measurements serving as quantitative descriptors for different cancer types. Various deep learning classifiers, including DenseNet, InceptionResNetV2, and MobileNetV2, are employed to analyze the datasets, with hyperparameters optimized to address class imbalances and prevent overfitting. The architectures leverage advanced techniques such as residual connections and depthwise separable convolutions to enhance feature extraction and classification accuracy, ultimately contributing to improved diagnostic capabilities in medical imaging applications.
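In practice a pipeline like this is usually run with an imaging library. To keep the example dependency-free, the sketch below implements only the Otsu thresholding step in plain Python on an already grayscale toy image. It picks the threshold that maximizes the between-class variance of the two pixel groups. The function names and pixel values are hypothetical, and the Gaussian filtering, distance transform, watershed, and contour steps are not reproduced here.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form the background class, the rest the foreground.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of background pixels so far
    sum_bg = 0  # sum of background intensities so far
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue           # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break              # every pixel is already background
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between_var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t

def binarize(image, t):
    """Map each pixel to 1 if it is brighter than t, else 0."""
    return [[1 if p > t else 0 for p in row] for row in image]

# Toy 4x4 grayscale image: dark background with a bright 2x2 region.
image = [
    [10, 12, 11, 10],
    [11, 200, 210, 12],
    [10, 205, 198, 11],
    [12, 10, 11, 10],
]
flat = [p for row in image for p in row]
t = otsu_threshold(flat)
mask = binarize(image, t)
for row in mask:
    print(row)
```

The resulting binary mask is the kind of input that distance transformation and watershed segmentation would then refine into separate regions whose contours can be measured.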

Limitations

The research highlights the promising capabilities of AI-based deep learning techniques for enhancing cancer detection and classification across seven specific cancer types. However, several limitations were identified that could impact the generalizability and effectiveness of the findings. Firstly, the study’s focus on a limited number of cancer types restricts the applicability of the models to a broader spectrum of cancers. Additionally, the reliance on publicly available datasets raises concerns regarding their size and variability, which may not adequately reflect real-world scenarios. The dependency on segmentation techniques for preprocessing, while beneficial for accuracy, introduces complexity and computational overhead, leaving the necessity of segmentation untested.

Moreover, the study’s exploration of deep learning models was constrained by the exclusion of more recent architectures, such as Vision Transformers (ViTs), which may offer enhanced performance. The computational demands of the evaluated models, including DenseNet and ResNet variants, posed challenges in terms of training time and resource management, necessitating the use of advanced hardware and optimization techniques to mitigate these issues. Future research should aim to include a wider variety of cancer types, utilize larger and more diverse datasets, and explore contemporary model architectures to further advance the field of cancer detection and classification.