Deep learning based ensemble model for accurate tomato leaf disease classification by leveraging ResNet50 and MobileNetV2 architectures

Journal: Scientific Reports, Volume: 15, Issue: 1
DOI: https://doi.org/10.1038/s41598-025-98015-x
PMID: https://pubmed.ncbi.nlm.nih.gov/40263518
Publication Date: 2025-04-22
Author(s): Jatin Sharma et al.
Primary Topic: Smart Agriculture and AI

Overview

This study presents a deep learning-based ensemble model for classifying tomato leaf diseases that integrates the MobileNetV2 and ResNet50 architectures. The model was trained on a dataset of 11,000 annotated images spanning ten classes, including healthy leaves, and achieved a test accuracy of 99.91%. Each backbone was extended with GlobalAveragePooling2D, Batch Normalization, Dropout, and Dense layers, which the authors report significantly improved feature extraction. The ensemble reached a precision of 99.92%, a recall of 99.90%, and an F1-score of 99.91%, indicating its effectiveness in automating disease diagnosis and supporting sustainable agricultural practices.
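As a quick consistency check on the reported metrics, the F1-score is the harmonic mean of precision and recall. The sketch below recomputes it from the two reported percentages; the function name `f1_score` is illustrative and does not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Reported ensemble metrics, in percent.
reported_precision = 99.92
reported_recall = 99.90

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.2f}%")  # prints "F1 = 99.91%"
```

The recomputed value agrees with the reported F1-score of 99.91%.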

Despite its promising results, the study acknowledges limitations such as the model’s training on a controlled dataset, which may not fully capture real-world conditions like occlusions and varying illumination. Additionally, the model’s complexity could hinder implementation on low-power devices, and it does not address class imbalance, which may affect performance on skewed datasets. Future research directions include exploring self-supervised and few-shot learning techniques, enhancing generalization to other crops, and developing a real-time mobile application for disease detection. These advancements aim to bolster food security and promote environmentally friendly farming methods through AI-driven agricultural solutions.

Methods

The proposed methodology develops a deep learning-based ensemble model for classifying tomato leaf diseases using a balanced dataset of 11,000 images: ten classes in total (disease categories plus a healthy class), with 1,100 images per class. The model integrates ResNet50, which excels at extracting deep hierarchical features, with MobileNetV2, known for lightweight and efficient feature extraction. The feature vectors from both models are concatenated and passed through fully connected layers. Training uses categorical cross-entropy loss and the Adam optimizer, with regularization layers such as Dropout used to mitigate overfitting. Performance is evaluated with accuracy, precision, recall, F1-score, and a confusion matrix.
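The feature-fusion step can be illustrated with a minimal sketch in pure Python. This is not the authors' implementation: it assumes the pooled feature widths of the stock backbones (2048 for ResNet50, 1280 for MobileNetV2), replaces the backbone outputs with random stand-ins, and uses a single randomly initialized dense layer. All function names are illustrative.

```python
import math
import random

RESNET50_DIM = 2048      # pooled feature width of a stock ResNet50 (assumed unmodified)
MOBILENETV2_DIM = 1280   # pooled feature width of a stock MobileNetV2 (assumed unmodified)
NUM_CLASSES = 10


def fuse(resnet_features, mobilenet_features):
    """Concatenate the two pooled feature vectors into one fused vector."""
    return list(resnet_features) + list(mobilenet_features)


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dense_softmax(features, weights, biases):
    """One fully connected layer followed by softmax."""
    logits = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)


def categorical_cross_entropy(probabilities, true_class):
    """Loss for one sample whose one-hot label is `true_class`."""
    return -math.log(max(probabilities[true_class], 1e-12))


rng = random.Random(0)
# Stand-ins for the backbone outputs; real features would come from the CNNs.
resnet_out = [rng.random() for _ in range(RESNET50_DIM)]
mobilenet_out = [rng.random() for _ in range(MOBILENETV2_DIM)]

fused = fuse(resnet_out, mobilenet_out)
weights = [[rng.gauss(0.0, 0.01) for _ in fused] for _ in range(NUM_CLASSES)]
biases = [0.0] * NUM_CLASSES

probs = dense_softmax(fused, weights, biases)
loss = categorical_cross_entropy(probs, true_class=3)
print(len(fused), len(probs), loss > 0)
```

Concatenation keeps both feature sets intact, so the dense layers can learn how much weight to give each backbone, at the cost of a wider input (3,328 values under the assumed widths).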

In pre-processing, images are resized to standard dimensions of 256 × 256 and 244 × 244, and the dataset is split into training (80%), validation (10%), and testing (10%) subsets. Both ResNet50 and MobileNetV2 are initialized with pre-trained weights and fine-tuned on the tomato leaf disease dataset. The feature extraction stage uses GlobalAveragePooling2D, Batch Normalization, Dropout, and Dense layers, ending in a final dense layer with a Softmax activation that produces a probability distribution across the classes. The model is trained for 50 epochs with a batch size of 32, using a learning rate scheduler to adjust the learning rate dynamically. The authors argue that this ensemble approach improves accuracy and robustness and could also be applied to other agricultural and medical imaging tasks.
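The arithmetic implied by these settings can be worked through directly. The sketch assumes that the 80/10/10 split applies to all 11,000 images and that the last partial batch is kept; the summary states neither detail, and `split_sizes` and `steps_per_epoch` are illustrative names.

```python
import math


def split_sizes(total, train_pct=80, val_pct=10):
    """Return (train, val, test) counts; the test split takes the remainder."""
    train = total * train_pct // 100
    val = total * val_pct // 100
    return train, val, total - train - val


def steps_per_epoch(num_samples, batch_size):
    """Number of optimizer steps per epoch, keeping the last partial batch."""
    return math.ceil(num_samples / batch_size)


TOTAL_IMAGES = 11_000
BATCH_SIZE = 32
EPOCHS = 50

train_n, val_n, test_n = split_sizes(TOTAL_IMAGES)
steps = steps_per_epoch(train_n, BATCH_SIZE)
print(train_n, val_n, test_n)   # prints "8800 1100 1100"
print(steps, steps * EPOCHS)    # prints "275 13750"
```

Under these assumptions, each epoch takes 275 steps and the full run takes 13,750 steps.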

Results

The results section reports the classification performance of three fine-tuned deep learning models (ResNet50, MobileNetV2, and their ensemble) on the Tomato Leaf Disease dataset. The confusion matrices for ResNet50 and MobileNetV2 show most predictions falling on the diagonal, which indicates effective disease identification across all ten classes, including bacterial spot, early blight, and healthy leaves. ResNet50 achieved a validation accuracy of 90.23% and MobileNetV2 reached 92.47%; the misclassifications that remain in both models point to potential gains from hyperparameter tuning and data augmentation.

The ensemble model performed better still, classifying every class perfectly except for a single misclassification in the “Tomato Bacterial Spot” category. Its confusion matrix is strongly diagonally dominant, indicating very few false positives and false negatives, which is crucial for timely agricultural interventions. These results support the ensemble method's ability to combine the strengths of multiple models, improving classification accuracy and reliability and making it a valuable tool for practical agricultural applications.
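To make the confusion-matrix reading concrete, the sketch below builds a hypothetical 10-class matrix with a single off-diagonal entry and derives accuracy plus per-class precision and recall. The 110 test images per class and the class that receives the error are assumptions (a 10% test split of a balanced 11,000-image dataset), not figures taken from the paper.

```python
NUM_CLASSES = 10
PER_CLASS = 110          # assumption: 10% of the 1,100 images in each class
BACTERIAL_SPOT = 0       # hypothetical index for "Tomato Bacterial Spot"
CONFUSED_WITH = 1        # hypothetical class that receives the one error

# Rows are true classes, columns are predicted classes.
matrix = [
    [PER_CLASS if r == c else 0 for c in range(NUM_CLASSES)]
    for r in range(NUM_CLASSES)
]
matrix[BACTERIAL_SPOT][BACTERIAL_SPOT] -= 1
matrix[BACTERIAL_SPOT][CONFUSED_WITH] += 1


def accuracy(cm):
    """Fraction of samples on the diagonal."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def recall(cm, k):
    """Correct predictions for class k over all true samples of class k."""
    row_total = sum(cm[k])
    return cm[k][k] / row_total if row_total else 0.0


def precision(cm, k):
    """Correct predictions for class k over all samples predicted as k."""
    col_total = sum(row[k] for row in cm)
    return cm[k][k] / col_total if col_total else 0.0


print(f"accuracy = {accuracy(matrix):.4%}")  # prints "accuracy = 99.9091%"
```

With these assumed counts, one error in 1,100 test images gives 1099/1100 ≈ 99.91%, which is in line with the reported test accuracy.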

Discussion

The discussion section of the paper highlights the significant impact of tomato leaf diseases on agricultural productivity and their financial consequences. Traditional identification methods rely on manual inspection and are often inefficient and error-prone. In contrast, advances in machine learning (ML) and deep learning (DL) have enabled more accurate and efficient automatic classification of plant diseases. The review of related work covers various approaches, including classic ML techniques such as Support Vector Machines (SVM) and Random Forests, which have shown promising results but are limited by their reliance on hand-engineered features and by weak generalization on complex datasets.

Deep learning, particularly through Convolutional Neural Networks (CNNs), has emerged as a superior alternative, capable of automatically learning features from images. Lightweight CNN models have been developed for real-time applications, achieving high accuracy rates (e.g., 99.3% with Ullah et al.’s model). Hybrid models combining multiple architectures have also been explored, with notable successes in accuracy but challenges related to data dependency and the need for extensive datasets. The section emphasizes the importance of robust feature extraction methods and the potential of ensemble learning techniques to enhance classification performance, although they may require significant computational resources. Overall, the findings underscore the need for ongoing research into improving model generalization across varying environmental conditions and the importance of high-quality datasets for effective disease classification.