A fine tuned EfficientNet-B0 convolutional neural network for accurate and efficient classification of apple leaf diseases

Journal: Scientific Reports, Volume: 15, Issue: 1
DOI: https://doi.org/10.1038/s41598-025-04479-2
PMID: https://pubmed.ncbi.nlm.nih.gov/40670396
Publication Date: 2025-07-16
Author(s): Ali Hassan et al.
Primary Topic: Smart Agriculture and AI

Overview

This research paper presents a fine-tuned EfficientNet-B0 convolutional neural network (CNN) for the automated classification of apple leaf diseases, a task that is crucial for effective crop management. The model uses a pre-trained EfficientNet-B0 base, extended with architectural modifications such as a global max pooling (GMP) layer, dropout, and regularization. To tackle class imbalance and improve generalization, the authors combined transfer learning with a training strategy that includes data augmentation, stratified data splitting, and class weighting. The model was evaluated on the PlantVillage (PV) dataset and a curated Apple PV (APV) dataset, achieving test accuracies of 99.69% and 99.78%, respectively, and outperforming other well-known architectures such as EfficientNet-B3, Inception-v3, and ResNet-50.
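The stratified splitting and class weighting mentioned above can be sketched in plain Python. This is a minimal illustration, not the authors' code: the 80/10/10 split, the label names, and the class counts are assumptions, and the weighting formula is the common inverse-frequency heuristic (the same one scikit-learn uses for `class_weight="balanced"`), which the summary does not confirm as the one used in the paper.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, fractions=(0.8, 0.1, 0.1), seed=0):
    """Split sample indices into train/val/test sets, class by class.

    The 80/10/10 fractions are an assumption made for illustration only.
    """
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train, val, test = [], [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_train = int(len(indices) * fractions[0])
        n_val = int(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Hypothetical, deliberately imbalanced labels (not the real APV class counts).
labels = ["healthy"] * 100 + ["scab"] * 40 + ["black_rot"] * 40 + ["cedar_rust"] * 20
train_idx, val_idx, test_idx = stratified_split(labels)
weights = balanced_class_weights([labels[i] for i in train_idx])
```

Because each class is split separately, every subset keeps the original class proportions, and the rarest class receives the largest weight in the loss.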

The study reports that the fine-tuned model improves accuracy over the base EfficientNet-B0 model, by 11% on the APV dataset and 49.5% on the PV dataset, while keeping memory consumption and the number of floating-point operations (FLOPs) low. The findings suggest that combining data preprocessing techniques with architectural optimizations significantly improves the model’s performance and makes it suitable for deployment in resource-constrained environments. Future work will focus on expanding the datasets, integrating the model with machine vision technologies for automated pesticide application, and exploring strategies to improve generalization and adaptability to diverse agricultural conditions.

Methods

The methods section describes the development of a fine-tuned EfficientNet-B0 model for apple leaf disease classification, emphasizing the computational efficiency and scalability of the approach. The study reviews machine learning (ML) and deep learning (DL) techniques previously used for disease detection, highlighting their strengths and limitations. Traditional ML methods such as K-means and support vector machines (SVM) have shown promising results, but they often suffer from high computational demands and limited scalability for real-time applications. DL methods, particularly convolutional neural networks (CNNs), automate feature extraction and improve classification accuracy, but they also face challenges related to resource requirements and model complexity.

The experiments were run from a MacBook Pro with a 2.3 GHz quad-core Intel Core i5 processor and 8 GB of RAM, with the code executed on Google Colab to make use of its GPU resources. Training ran for 20 epochs with a learning rate of 0.001 and a batch size of 32. This configuration aims to balance high accuracy against computational overhead, making the model suitable for deployment in resource-constrained environments. The study’s findings contribute to ongoing efforts to develop efficient and practical solutions for plant disease detection in smart farming.
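To give a sense of scale for these hyperparameters, the short sketch below works out how many weight updates such a run involves. The epoch count, learning rate, and batch size come from the text, and 3,171 is the APV dataset size reported in the Discussion. The 80% training share is an assumption, since the summary does not state the split ratio, and `TrainConfig` is a hypothetical helper name.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    epochs: int = 20               # from the text
    learning_rate: float = 0.001   # from the text
    batch_size: int = 32           # from the text


def steps_per_epoch(n_train_images, batch_size):
    """Number of mini-batches needed to see every training image once."""
    return math.ceil(n_train_images / batch_size)


config = TrainConfig()

# Assumption: 80% of the 3,171 APV images are used for training.
n_train = int(3171 * 0.8)

steps = steps_per_epoch(n_train, config.batch_size)
total_updates = steps * config.epochs
print(f"{n_train} training images -> {steps} steps/epoch, {total_updates} updates in total")
```

Under that assumed split, the whole run amounts to 1,600 mini-batch updates, which is consistent with the emphasis on a light training budget.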

Results

The Results section reports the model’s performance on both datasets. The fine-tuned EfficientNet-B0 reached a test accuracy of 99.69% on the PV dataset and 99.78% on the APV dataset, outperforming EfficientNet-B3, Inception-v3, and ResNet-50.

Compared with the base EfficientNet-B0 model, accuracy improved by 49.5% on the PV dataset and by 11% on the APV dataset, while memory consumption and FLOPs remained low. Overall, these findings support the claim that the fine-tuned model combines high accuracy with computational efficiency.
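Accuracy, along with the precision, recall, and F1 scores mentioned in the Discussion, can all be derived from a confusion matrix. The sketch below shows the standard definitions for a multi-class problem. The 3-class matrix is hypothetical and does not reproduce any result from the paper.

```python
def per_class_metrics(confusion):
    """Compute accuracy plus per-class precision, recall, and F1.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))

    metrics = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, truly another class
        fn = sum(confusion[k]) - tp                       # truly k, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append({"precision": precision, "recall": recall, "f1": f1})
    return correct / total, metrics


# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted class).
confusion = [
    [48, 2, 0],
    [1, 47, 2],
    [0, 0, 50],
]
accuracy, metrics = per_class_metrics(confusion)
```

Reporting per-class values alongside overall accuracy matters for imbalanced datasets, where a high accuracy can hide poor recall on a rare disease class.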

Discussion

The discussion section highlights the efficacy of the EfficientNet-B0 model in classifying apple leaf diseases, emphasizing its balance of accuracy and computational efficiency. Table 1 shows that the base EfficientNet-B0 achieves 77.1% top-1 accuracy on ImageNet with only 5.3 million parameters and 0.39 billion FLOPs, making it suitable for resource-constrained environments. Larger EfficientNet variants may offer higher accuracy, but their greater memory and computational demands make them less practical for real-time applications, particularly in agriculture. The study applies transfer learning and fine-tuning to EfficientNet-B0, resulting in high precision, recall, and F1 scores on domain-specific datasets such as APV and PV.
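Parameter and FLOP counts like those above are obtained by summing simple per-layer formulas. As a rough sketch, the function below applies them to a single convolution, using the EfficientNet-B0 stem (a 3x3, stride-2 convolution from 3 to 32 channels on a 224x224 input) as the example. The helper name `conv2d_cost` is hypothetical, "same" padding is assumed, and conventions differ on whether one multiply-accumulate counts as one or two FLOPs.

```python
def conv2d_cost(in_size, in_channels, out_channels, kernel, stride, use_bias=False):
    """Return (parameters, multiply-accumulates) for a square, 'same'-padded conv layer."""
    out_size = -(-in_size // stride)  # ceiling division: output side length
    params = kernel * kernel * in_channels * out_channels
    if use_bias:
        params += out_channels
    # Each output element needs kernel*kernel*in_channels multiply-accumulates
    # (bias additions are not counted here).
    macs = out_size * out_size * out_channels * kernel * kernel * in_channels
    return params, macs


# EfficientNet-B0 stem: 3x3 convolution, stride 2, 3 -> 32 channels, 224x224 input.
params, macs = conv2d_cost(in_size=224, in_channels=3, out_channels=32, kernel=3, stride=2)
print(f"stem conv: {params} parameters, {macs:,} multiply-accumulates")
```

The stem alone accounts for 864 weights and roughly 10.8 million multiply-accumulates, a small fraction of the 5.3 million parameters and 0.39 billion FLOPs quoted for the full network.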

The study uses two datasets: the extensive PV dataset, which contains 55,448 images across 38 categories of plant leaf diseases, and the APV dataset, which focuses on apple leaf diseases and contains 3,171 images. Images were resized and normalized to ensure model compatibility and training stability. To address class imbalance, stratified sampling was used when partitioning the data, so that the training, validation, and test sets have representative class distributions. Data augmentation was also applied to the training set to reduce the risk of overfitting.

The model architecture, based on EfficientNet-B0, was modified to better capture the localized patterns that are critical for disease identification, incorporating Global Max Pooling (GMP) and dropout layers to improve performance and generalization. Overall, the fine-tuned EfficientNet-B0 model demonstrates robust classification of apple leaf diseases, showing its potential for practical agricultural applications.
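The intuition behind using Global Max Pooling for localized patterns can be illustrated with a toy example. The sketch below compares it with global average pooling, a common alternative, on a hypothetical single-channel 4x4 activation map containing one strong local response. It is a plain-Python illustration of the two operations, not the paper's implementation.

```python
def global_max_pool(feature_map):
    """Collapse each channel (a 2D grid) to its largest activation."""
    return [max(max(row) for row in channel) for channel in feature_map]


def global_avg_pool(feature_map):
    """Collapse each channel (a 2D grid) to its mean activation."""
    pooled = []
    for channel in feature_map:
        n_cells = sum(len(row) for row in channel)
        pooled.append(sum(sum(row) for row in channel) / n_cells)
    return pooled


# Hypothetical activation map: one strong local response (for example a small
# lesion) on an otherwise inactive leaf region.
feature_map = [[
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 8.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]]

gmp = global_max_pool(feature_map)  # keeps the peak response
gap = global_avg_pool(feature_map)  # dilutes it over all 16 cells
```

Max pooling passes the peak value of 8.0 to the classifier unchanged, whereas averaging reduces it to 0.5, which is why a max-pooled head can be more sensitive to small, localized disease symptoms.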

Limitations

The fine-tuned EfficientNet-B0 model achieves high accuracy in classifying apple leaf diseases while remaining efficient in computation and memory, making it suitable for devices with limited hardware capabilities. However, its performance depends on the datasets used for training, specifically the PV and APV datasets, which may not cover the variability found in new environments, different plant species, or diverse disease presentations. This limitation highlights the need to extend the datasets with samples from varied sources and to account for environmental factors in order to improve the model’s generalizability.

Moreover, the model faces challenges related to imbalanced datasets, which can hinder its practical applicability in real-world scenarios. Addressing these issues is essential for improving the model’s robustness. Additionally, the model’s effectiveness may be compromised when processing images with complex backgrounds, as it was primarily trained on simpler leaf images. Future research should explore ensemble methods to further improve classification accuracy, particularly in multi-class scenarios, and focus on expanding the diversity and size of training datasets to bolster the model’s resilience.