Mpox-XDE: نموذج جماعي يستخدم الشبكات العصبية التلافيفية العميقة والذكاء الاصطناعي القابل للتفسير لاكتشاف وتصنيف جدري القرود
Mpox-XDE: an ensemble model utilizing deep CNN and explainable AI for monkeypox detection and classification

المجلة: BMC Infectious Diseases، المجلد: 25، العدد: 1
DOI: https://doi.org/10.1186/s12879-025-10811-y
PMID: https://pubmed.ncbi.nlm.nih.gov/40133816
تاريخ النشر: 2025-03-25
المؤلف: Dip Kumar Saha وآخرون
الموضوع الرئيسي: أبحاث فيروس الجدري وتفشيه

نظرة عامة

يتناول البحث الحاجة الملحة إلى تحسين أدوات تشخيص جدري القرود البشري (Mpox)، خاصة في ضوء التفشي العالمي الأخير. يقترح المؤلفون نموذج تعلم عميق جماعي، يسمى Mpox-XDE، يدمج ثلاثة نماذج محسّنة: Xception وDenseNet201 وEfficientNetB7. باستخدام مجموعة بيانات صور جلد Mpox (MSID) التي تتكون من 770 صورة، يحقق النموذج الجماعي دقة اختبار (accuracy) تبلغ 98.70%، وضبطًا (precision) يبلغ 98.90%، واسترجاعًا (recall) يبلغ 98.80%، ودرجة F1 تبلغ 98.80%. يستخدم النموذج طبقة Softmax، وطبقة كثيفة، وطبقة تسطيح (Flatten)، وآلية إسقاط (Dropout) لتصنيف الصور إلى أربع فئات: جدري الماء، والحصبة، والجلد الطبيعي، وMpox. بالإضافة إلى ذلك، تُستخدم تقنية الذكاء الاصطناعي القابل للتفسير Gradient-weighted Class Activation Mapping (Grad-CAM) لتعزيز قابلية التفسير من خلال إظهار المناطق المصابة على الجلد.

على الرغم من النتائج الواعدة، يقرّ البحث بقيود مثل صغر حجم مجموعة البيانات نسبيًا، مما قد يؤدي إلى الإفراط في التكيف (overfitting) وإضعاف القدرة على التعميم. كما أن تعقيد بنية Mpox-XDE يزيد من وقت التدريب وقد يحجب إسهام كل طبقة على حدة في التنبؤات. ستركز الأعمال المستقبلية على توسيع مجموعة البيانات بعينات أكثر تنوعًا لتعزيز متانة النموذج وموثوقيته، وعلى تبسيط بنية النموذج لتحسين الكفاءة الحسابية. كما يهدف المؤلفون إلى تحسين مكونات الذكاء الاصطناعي القابل للتفسير لتعزيز الشفافية والثقة لدى العاملين في المجال الصحي، مع تطبيقات محتملة تمتد إلى تشخيص أمراض أخرى.

مقدمة

تناقش مقدمة ورقة البحث التفشي العالمي الأخير لجدري القرود (Mpox)، وهو عدوى فيروسية كانت تاريخيًا متوطنة في إفريقيا، والتي زادت بشكل كبير في عام 2022، مما أثر على 116 دولة مع أكثر من 92,783 حالة تم الإبلاغ عنها و171 حالة وفاة حتى نوفمبر 2023. يتميز Mpox بأعراض مثل الحمى، وتضخم الغدد الليمفاوية، وطفح جلدي مميز يتطور من حبوب مسطحة إلى بثور مملوءة بالسوائل. عادةً ما يتضمن التشخيص اختبار تفاعل البوليميراز المتسلسل (PCR)، والذي يكون مكلفًا ويستغرق وقتًا طويلاً، مما يبرز الحاجة الملحة لطرق الكشف الفعالة.

لمعالجة ذلك، تقترح الورقة نموذج تعلم عميق جماعي، يسمى Mpox-XDE، يدمج ثلاثة نماذج متقدمة (Xception وDenseNet201 وEfficientNetB7)، مع تقييم نموذج المحوِّل SwinViT إلى جانبه لأغراض المقارنة. يهدف هذا النموذج إلى تحسين تصنيف Mpox باستخدام مجموعة بيانات صور جلد Mpox (MSID)، ويوظف تقنيات الذكاء الاصطناعي القابل للتفسير (XAI)، وتحديدًا Grad-CAM، لإظهار المناطق المصابة على الجلد. صُممت المنهجية المقترحة للتخفيف من مشكلات مثل الإفراط في التكيف ولتحسين دقة التصنيف من خلال بنية تشمل طبقات كثيفة وطبقات إسقاط، مما يسهّل في النهاية كشف آفات Mpox وفهمها على نحو أفضل. توضح الورقة مساهماتها وتمهد للمنهجية التفصيلية والنتائج التجريبية في الأقسام التالية.

طرق

في هذا القسم، يقدم المؤلفون منهجيتهم للكشف وتصنيف الفئات الأربع ضمن مجموعة بيانات MSID، كما هو موضح في الشكل 1. تتضمن الطريقة مجموعة من نماذج التعلم العميق (DL) المصممة خصيصًا لتصنيف صور Mpox، والتي يتم تقييمها مقارنةً بهياكل الشبكات العصبية التلافيفية (CNN) المعروفة. تستخدم الدراسة SwinViT كنموذج المحول الرئيسي وتقارن فعاليته مع نماذج DL الأخرى، بما في ذلك Xception وDenseNet201 وEfficientNetB7.

قبل التدريب، خضعت مجموعة البيانات لعمليات المعالجة المسبقة وتنفيذ طبقات نموذج مخصصة لتعزيز الأداء. تعتبر هذه الخطوات التحضيرية حاسمة لضمان تصنيف موثوق ودقيق لمجموعة بيانات MSID، مما يساهم في إنشاء إطار عمل قوي للتقييم التجريبي الذي يلي.

نتائج

تظهر نتائج هذه الدراسة الأداء المتفوق لنموذج Mpox-XDE في الكشف عن Mpox مقارنةً بنماذج التعلم العميق الأخرى التي تم تقييمها، بما في ذلك Xception وEfficientNetB7 وDenseNet201. أظهر نموذج Mpox-XDE قدرة كبيرة على التكيف مع البيانات الجديدة وتقليل الإفراط في التكيف من خلال استخدام أوزان المجموعة، مما أدى إلى تحسين الدقة. كانت الطريقة الجماعية المستخدمة في هذا النموذج فعالة في التخفيف من أخطاء النماذج الفردية، مما أسفر عن تحسين مقاييس الأداء العامة.

كشف التحليل الكمي أن Mpox-XDE حقق قيمة AUC بالمتوسط الجزئي (micro-average) تبلغ 1.00، مما يدل على قدرة تصنيف عالية جدًا. ومن حيث الضبط (precision) والاسترجاع (recall) ودرجة F1، سجل Mpox-XDE قيمًا تبلغ 98.90% و98.80% و98.80% على التوالي، متفوقًا على جميع النماذج الأخرى المختبرة. وجاء نموذج Xception تاليًا بضبط (precision) يبلغ 96.50%، مما يبرز فعالية المنهجية المقترحة. تُعرض تقييمات الأداء الشاملة، بما في ذلك منحنيات ROC والمقاييس المفصلة، في الأشكال والجداول المرفقة، مما يؤكد متانة نموذج Mpox-XDE في مهام الكشف عن Mpox.

مناقشة

في قسم المناقشة، تستعرض الورقة دراسات مختلفة تركزت على الكشف وتصنيف Mpox باستخدام تقنيات التعلم العميق (DL)، وخاصة الشبكات العصبية التلافيفية (CNNs) وطرق الذكاء الاصطناعي القابلة للتفسير (XAI). تشمل التقدمات الملحوظة تقييم Sitaula et al. لـ 13 نموذج DL مدرب مسبقًا، محققًا دقة قصوى تبلغ 87.13%، ونموذج MobileNetV2 القائم على الانتباه من Raha et al.، الذي حسن الكشف عن Mpox للأجهزة الطرفية ودمج مجموعة أوسع من الأمراض الجلدية لتحسين التشخيص المبكر. تشمل المساهمات الأخرى المهمة تطوير MonkeyNet بواسطة Bala et al.، الذي حقق دقة تشخيص تبلغ 93.19% و98.91% على مجموعات البيانات الأصلية والموسعة، على التوالي، وتقديم إطار عمل للتعلم الفيدرالي بواسطة Kundu et al.، الذي حقق معدل دقة يبلغ 97.90%.

تؤكد الورقة على إمكانيات نماذج DL، وخاصة الأساليب الجماعية مثل Mpox-XDE المقترح، الذي يجمع التنبؤات من عدة هياكل (Xception وDenseNet201 وEfficientNetB7) لتعزيز دقة التصنيف وقابلية التفسير. يهدف النموذج الجماعي إلى معالجة القيود التي لوحظت في الدراسات السابقة، مثل نقص الطرق الجماعية الفعالة والتطبيق المحدود لتقنيات القابلية للتفسير مثل Grad-CAM وLIME. من خلال الاستفادة من مجموعة بيانات شاملة وتقنيات معالجة مسبقة متقدمة، يسعى النموذج المقترح إلى تحسين القدرات التشخيصية لجدري القرود، خاصة في البيئات ذات الموارد المحدودة، مما يساهم في اتخاذ قرارات سريرية أكثر فعالية واستراتيجيات الكشف المبكر.

Journal: BMC Infectious Diseases, Volume: 25, Issue: 1
DOI: https://doi.org/10.1186/s12879-025-10811-y
PMID: https://pubmed.ncbi.nlm.nih.gov/40133816
Publication Date: 2025-03-25
Author(s): Dip Kumar Saha et al.
Primary Topic: Poxvirus research and outbreaks

Overview

The study addresses the urgent need for better diagnostic tools for human monkeypox (Mpox), particularly in light of the recent global outbreak. The authors propose an ensemble deep learning model, termed Mpox-XDE, which integrates three enhanced models: Xception, DenseNet201, and EfficientNetB7. On the Mpox Skin Images Dataset (MSID), which comprises 770 images, the ensemble achieves a testing accuracy of 98.70%, precision of 98.90%, recall of 98.80%, and an F1-score of 98.80%. The model employs a Softmax layer, a dense layer, a flatten layer, and a dropout mechanism to classify images into four categories: chickenpox, measles, normal, and Mpox. Additionally, the explainable AI technique Gradient-weighted Class Activation Mapping (Grad-CAM) is used to enhance interpretability by visualizing the affected skin areas.
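
To make the final classification step concrete, the following is a minimal pure-Python sketch of what a Softmax output over the four classes does. The logits, the index order of the classes, and the `classify` helper are illustrative assumptions, not values or code taken from the paper.

```python
import math

# Class names come from the summary; the index order is an assumption.
CLASSES = ["chickenpox", "measles", "normal", "mpox"]

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs[best]

# Hypothetical logits for a single image.
label, confidence = classify([0.2, -1.0, 0.5, 3.1])
```

In a real network these logits would come from the dense layer that follows the flatten and dropout steps; here they are hard-coded so the example stays self-contained.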

Despite the promising results, the study acknowledges limitations such as the relatively small dataset, which may lead to overfitting and reduced generalization capabilities. The complexity of the Mpox-XDE architecture also increases training time and may obscure the contributions of individual layers to predictions. Future work will focus on expanding the dataset with more diverse samples to enhance model robustness and reliability, as well as simplifying the model architecture to improve computational efficiency. The authors also aim to refine the explainable AI components to foster greater transparency and trust among healthcare professionals, with potential applications extending to the diagnosis of other diseases.

Introduction

The introduction of the research paper discusses the recent global outbreak of Mpox, a viral infection historically endemic to Africa, which surged in 2022, affecting 116 countries with over 92,783 reported cases and 171 deaths as of November 2023. Mpox is characterized by symptoms such as fever, lymphadenopathy, and a distinctive rash that evolves from flat papules to fluid-filled blisters. The diagnosis typically involves polymerase chain reaction (PCR) testing, which is costly and time-consuming, highlighting the urgent need for efficient detection methods.

To address this, the paper proposes an ensemble deep learning model, termed Mpox-XDE, which integrates three advanced models (Xception, DenseNet201, and EfficientNetB7) and evaluates the transformer model SwinViT alongside it for comparison. This model aims to enhance the classification of Mpox using the Mpox Skin Images Dataset (MSID) and incorporates Explainable Artificial Intelligence (XAI) techniques, specifically Grad-CAM, to visualize affected areas on the skin. The proposed methodology is structured to mitigate issues like overfitting and improve classification accuracy through an architecture that includes dense and dropout layers, ultimately facilitating better detection and understanding of Mpox lesions. The paper outlines its contributions and sets the stage for the detailed methodology and experimental results in subsequent sections.
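
The Grad-CAM step mentioned above can be sketched in a few lines of pure Python. This follows the standard Grad-CAM recipe (average the gradients of the class score over each feature map to get channel weights, take the weighted sum of the feature maps, then apply ReLU). The function name `grad_cam`, the nested-list layout, and the toy numbers are assumptions for illustration; the paper's actual implementation is not described in this summary.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap from one convolutional layer.

    activations[k][i][j]: feature map k at spatial position (i, j).
    gradients[k][i][j]:   d(class score) / d(activations[k][i][j]).
    """
    height = len(activations[0])
    width = len(activations[0][0])
    # Channel weights: global-average-pool the gradients of each feature map.
    weights = [
        sum(sum(row) for row in grad_map) / (height * width)
        for grad_map in gradients
    ]
    # Weighted sum of feature maps, then ReLU to keep positive evidence only.
    heatmap = [
        [
            max(0.0, sum(w * fmap[i][j] for w, fmap in zip(weights, activations)))
            for j in range(width)
        ]
        for i in range(height)
    ]
    # Scale to [0, 1] for display; leave an all-zero map unchanged.
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[v / peak for v in row] for row in heatmap]
    return heatmap

# Toy example: 2 channels, 2x2 spatial grid, made-up numbers.
acts = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [1.0, 0.0]]]
grads = [[[0.5, 0.5], [0.5, 0.5]], [[-1.0, -1.0], [-1.0, -1.0]]]
cam = grad_cam(acts, grads)
```

In practice the activations and gradients would be obtained from a deep learning framework's automatic differentiation, and the heatmap would be upsampled and overlaid on the input image.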

Methods

In this section, the authors present their methodology for detecting and classifying the four classes within the MSID dataset, as depicted in Figure 1. The approach involves an ensemble of deep learning (DL) models specifically tailored for Mpox image classification, which is benchmarked against established convolutional neural network (CNN) architectures. The study employs SwinViT as the primary transformer model and compares its efficacy with other DL models, including Xception, DenseNet201, and EfficientNetB7.

Prior to training, the dataset underwent preprocessing and the implementation of customized model layers to enhance performance. These preparatory steps are crucial for ensuring a reliable and accurate classification of the MSID dataset, thereby establishing a robust framework for the experimental evaluation that follows.

Results

The results demonstrate that Mpox-XDE outperforms the other evaluated deep learning models, including Xception, EfficientNetB7, and DenseNet201, in detecting Mpox. The Mpox-XDE model exhibited significant adaptability to new data and reduced overfitting through the use of group weights, leading to enhanced accuracy. The ensemble approach effectively mitigated individual model errors, resulting in improved overall performance metrics.
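
One common way an ensemble can offset the error of a single model is soft voting, that is, a weighted average of each model's class probabilities. The sketch below illustrates that idea only; this summary does not state how Mpox-XDE actually merges its three models, so the averaging rule, the `ensemble_average` helper, and all probabilities are assumptions.

```python
def ensemble_average(prob_lists, weights=None):
    """Weighted average of per-model class probabilities (soft voting)."""
    n_models = len(prob_lists)
    if weights is None:
        weights = [1.0] * n_models  # equal trust in every model
    total = sum(weights)
    n_classes = len(prob_lists[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, prob_lists)) / total
        for c in range(n_classes)
    ]

# Made-up softmax outputs for one image.
# Assumed class order: [chickenpox, measles, normal, mpox]
xception = [0.10, 0.05, 0.05, 0.80]
densenet201 = [0.45, 0.05, 0.10, 0.40]  # on its own, this model would err
efficientnetb7 = [0.05, 0.05, 0.10, 0.80]

combined = ensemble_average([xception, densenet201, efficientnetb7])
predicted = max(range(len(combined)), key=combined.__getitem__)
```

Here the second model alone would pick chickenpox, but the averaged probabilities still favor the Mpox class. Unequal `weights` could be passed to trust stronger models more.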

Quantitative analysis showed that Mpox-XDE achieved a micro-average AUC of 1.00, indicating outstanding classification capability. In terms of precision, recall, and F1-score, Mpox-XDE recorded values of 98.90%, 98.80%, and 98.80%, respectively, outperforming all other models tested. The Xception model followed with a precision of 96.50%, highlighting the efficacy of the proposed methodology. Comprehensive performance evaluations, including ROC curves and detailed metrics, are presented in the accompanying figures and tables, underscoring the robustness of the Mpox-XDE model in Mpox detection tasks.
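
For readers less familiar with these metrics, the sketch below shows how accuracy, precision, recall, and F1-score can be derived from a multi-class confusion matrix. The matrix is hypothetical (it is not the paper's data), and macro averaging is an assumption, since this summary does not say which averaging scheme the authors used.

```python
def per_class_metrics(confusion):
    """confusion[t][p] = number of samples with true class t predicted as p."""
    n = len(confusion)
    results = []
    for c in range(n):
        tp = confusion[c][c]
        fp = sum(confusion[t][c] for t in range(n)) - tp  # predicted c, wrong
        fn = sum(confusion[c]) - tp                       # true c, missed
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        results.append((precision, recall, f1))
    return results

def macro_average(results):
    """Unweighted mean of each metric across classes."""
    n = len(results)
    return tuple(sum(r[i] for r in results) / n for i in range(3))

def accuracy(confusion):
    """Fraction of all samples that lie on the diagonal."""
    correct = sum(confusion[c][c] for c in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total

# Hypothetical 4-class confusion matrix with one misclassified sample.
cm = [
    [10, 0, 0, 0],
    [0, 9, 0, 1],
    [0, 0, 10, 0],
    [0, 0, 0, 10],
]
macro_p, macro_r, macro_f1 = macro_average(per_class_metrics(cm))
acc = accuracy(cm)
```

With a single error among 40 samples, accuracy and the macro-averaged scores all land near 97.5% but are not identical, which is why a paper can report slightly different values for accuracy, precision, recall, and F1.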

Discussion

In the discussion section, the paper reviews various studies that have focused on the detection and classification of Mpox using deep learning (DL) techniques, particularly convolutional neural networks (CNNs) and explainable artificial intelligence (XAI) methods. Notable advancements include Sitaula et al.’s evaluation of 13 pretrained DL models, achieving a top accuracy of 87.13%, and Raha et al.’s attention-based MobileNetV2 model, which optimized Mpox detection for edge devices and incorporated a broader range of skin diseases for improved early diagnosis. Other significant contributions include the development of MonkeyNet by Bala et al., which achieved diagnostic accuracies of 93.19% and 98.91% on original and augmented datasets, respectively, and the introduction of a federated learning framework by Kundu et al., which attained a 97.90% accuracy rate.

The paper emphasizes the potential of DL models, particularly ensemble approaches like the proposed Mpox-XDE, which combines predictions from multiple architectures (Xception, DenseNet201, and EfficientNetB7) to enhance classification accuracy and interpretability. The ensemble model aims to address limitations observed in previous studies, such as the lack of efficient ensemble methods and the limited application of interpretability techniques like Grad-CAM and LIME. By leveraging a comprehensive dataset and advanced preprocessing techniques, the proposed model seeks to improve diagnostic capabilities for Mpox, particularly in resource-limited settings, thereby contributing to more effective clinical decision-making and early detection strategies.