A hybrid Framework for plant leaf disease detection and classification using convolutional neural networks and vision transformer

Journal: Complex & Intelligent Systems, Volume: 11, Issue: 2
DOI: https://doi.org/10.1007/s40747-024-01764-x
Publication Date: 2025-01-15
Author(s): Sherihan Aboelenin et al.
Primary Topic: Smart Agriculture and AI

Overview

This research paper presents a hybrid framework that combines convolutional neural networks (CNNs) with a Vision Transformer (ViT) to detect and classify plant leaf diseases. An ensemble of three pre-trained CNN architectures (VGG16, Inception-V3, and DenseNet201) extracts robust global features from leaf images, and a ViT model is then used to capture local features for precise disease detection. The framework was evaluated on two publicly available datasets (Apple and Corn), each containing four classes of diseases, and reached accuracies of 99.24% on the apple dataset and 98% on the corn dataset, outperforming the existing methods it was compared against.

The study underscores the effectiveness of combining deep learning techniques to improve agricultural productivity by enabling early and accurate identification of plant diseases. The proposed model's performance was assessed using accuracy, precision, recall, and F1-score, demonstrating its potential as a reliable tool in smart agriculture. This work contributes to the growing body of research that applies AI to challenges in crop management and disease control.

Introduction

The introduction highlights the significant threat that plant diseases pose to global food security, exacerbated by factors such as climate change and the challenges farmers face in accurately diagnosing these diseases. The paper emphasizes the potential of artificial intelligence, particularly Machine Learning (ML) and Deep Learning (DL) techniques, to enhance disease detection through digital image analysis. Notably, Convolutional Neural Networks (CNNs) have gained traction for their ability to automatically extract relevant features from images, although they face limitations in analyzing relationships between distant pixels and often require extensive datasets and computational resources.

To address these challenges, the authors propose a hybrid framework that integrates CNNs with Vision Transformers (ViTs) to improve the detection and classification of plant leaf diseases. This framework aims to capture distinct features for precise multi-class classification and has been tested on datasets for corn and apple diseases, demonstrating superior performance compared to existing models. The introduction sets the stage for the subsequent sections of the paper, which will delve into related works, methodologies, results, and managerial implications, ultimately underscoring the importance of advanced disease detection technologies for agricultural sustainability and economic stability.

Methods

The methodology is computational. Leaf images from two publicly available PlantVillage datasets (Apple and Corn) are resized to 128 × 128 pixels, and data augmentation is used to increase the dataset size. Three pre-trained CNNs (VGG16, Inception-V3, and DenseNet201) are combined into an ensemble and integrated with a ViT block to produce the final multi-class prediction.

Training uses a learning rate of 0.0001 for up to 50 epochs with early stopping, and the experiments were run on Google Colab. The models are compared using accuracy, precision, recall, and F1-score.

Results

The results section reports experiments that evaluate the proposed hybrid architecture for detecting and classifying plant leaf diseases. The hybrid model is compared against established CNN and Vision Transformer (ViT) architectures, including the pre-trained VGG16, Inception-V3, and DenseNet201 models. Evaluation uses accuracy, precision, recall, and F1-score, giving a broad view of model performance.
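The four metrics named above can all be derived from a confusion matrix. The following is a minimal sketch, assuming macro averaging over classes (the summary does not say which averaging the paper uses); the function name `classification_metrics` and the example matrix are hypothetical and not taken from the paper.

```python
from typing import Dict, List


def classification_metrics(confusion: List[List[int]]) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall and F1.

    confusion[i][j] counts samples whose true class is i and predicted class is j.
    Macro averaging is an assumption made for this sketch.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n_classes))

    precisions, recalls, f1_scores = [], [], []
    for k in range(n_classes):
        true_positive = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n_classes))  # column sum
        actual_k = sum(confusion[k])                                   # row sum
        precision = true_positive / predicted_k if predicted_k else 0.0
        recall = true_positive / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)

    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n_classes,
        "recall": sum(recalls) / n_classes,
        "f1": sum(f1_scores) / n_classes,
    }


# Made-up 4-class confusion matrix, for illustration only (not the paper's data).
example = [
    [48, 1, 1, 0],
    [2, 47, 0, 1],
    [0, 1, 49, 0],
    [0, 0, 1, 49],
]
metrics = classification_metrics(example)
print(metrics)
```

With balanced classes, as in this made-up matrix, macro recall equals accuracy; with imbalanced classes the two can differ, which is one reason to report all four metrics.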

Among the individual CNNs, DenseNet201 performed best on the Apple dataset, scoring 97% on all metrics, ahead of VGG16 (96%) and Inception-V3 (94%). The ensemble of VGG16, Inception-V3, and DenseNet201 improved accuracy to 97.6%, with precision and recall both at 98%. Training used a learning rate of 0.0001 for up to 50 epochs, with early stopping at a patience of 10. The ViT model used a patch size of 2, a dropout rate of 0.01, 8 attention heads, an embedding dimension of 64, and a multilayer perceptron (MLP) of 256 units, with all plant leaf images resized to 128 × 128 pixels. The implementation was run on the Google Colab platform. Overall, the results point to the hybrid model's potential to improve disease classification accuracy.
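The training schedule described above (at most 50 epochs, early stopping with a patience of 10) can be illustrated with a small simulation. This is a sketch under stated assumptions: the monitored quantity is taken to be validation loss, which the summary does not specify, and `TrainConfig`, `early_stopping_epoch`, and the loss curve are hypothetical names and data.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class TrainConfig:
    learning_rate: float = 1e-4  # value quoted in the text
    max_epochs: int = 50         # value quoted in the text
    patience: int = 10           # value quoted in the text


def early_stopping_epoch(val_losses: Sequence[float], cfg: TrainConfig) -> int:
    """Return the epoch (1-based) at which training would stop.

    Training stops once the validation loss has failed to improve for
    `cfg.patience` consecutive epochs, or when `cfg.max_epochs` is reached.
    Monitoring validation loss is an assumption of this sketch.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses[: cfg.max_epochs], start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= cfg.patience:
                return epoch
    return min(len(val_losses), cfg.max_epochs)


cfg = TrainConfig()
# Synthetic curve: the loss improves for 15 epochs, then plateaus at a worse value.
losses = [1.0 - 0.05 * i for i in range(15)] + [0.5] * 35
stop_epoch = early_stopping_epoch(losses, cfg)
print(stop_epoch)  # 25: ten non-improving epochs after the best epoch (15)
```

In a real training run, a framework callback would normally handle this logic; the loop here only makes the meaning of "patience" explicit.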

Discussion

The proposed hybrid deep learning (DL) architecture for plant leaf disease detection integrates three pre-trained CNNs (VGG16, Inception-V3, and DenseNet201) with a Vision Transformer (ViT) block. The model combines the strength of CNNs in local feature extraction with the self-attention mechanism of the ViT, which captures long-range spatial relationships among pixels, to improve classification accuracy. Plant leaf images are resized to 128 × 128 pixels, and data augmentation is used to increase the dataset size. The model was evaluated on two publicly available datasets from PlantVillage and achieved accuracies of 99.24% for apple leaf diseases and 98% for corn leaf diseases, outperforming the existing models it was compared against.
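To make the self-attention idea concrete, the sketch below implements single-head scaled dot-product attention in plain Python. It is a simplification, not the paper's ViT block: the learned query, key, and value projections are replaced by the identity, the tokens are made up, and the helper `num_patches` assumes square, non-overlapping patches. The summary does not state at which stage the patches are cut, so the patch count shown for a 128-pixel side with patch size 2 is only illustrative.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens: List[Vector]) -> Tuple[List[Vector], List[Vector]]:
    """Single-head scaled dot-product self-attention.

    Simplification: queries, keys and values are the tokens themselves
    (identity projections), so no parameters are learned here.
    Returns (outputs, attention_weights).
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    weights = []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * value[j] for w, value in zip(row, tokens)) for j in range(dim)]
        for row in weights
    ]
    return outputs, weights


def num_patches(side: int, patch_side: int) -> int:
    """Number of non-overlapping square patches on a square grid."""
    if side % patch_side != 0:
        raise ValueError("side must be divisible by patch_side")
    return (side // patch_side) ** 2


# Three made-up tokens: the first and last are identical but far apart in the sequence.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = self_attention(tokens)
print(weights[0])            # token 0 attends to token 2 as strongly as to itself
print(num_patches(128, 2))   # patch count if patches were cut from a 128 x 128 grid
```

Because every token is compared with every other token, position in the sequence does not limit which tokens can influence each other, which is the property the text refers to as capturing long-range relationships.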

The study highlights the effectiveness of ensemble learning for plant disease classification, showing that combining features from multiple CNN architectures can improve results. The confusion matrices indicate that the disease classes are classified with high precision, and the hybrid model achieves an F1-score of 97%. The research also acknowledges limitations, such as the need for more diverse datasets and the difficulty of separating diseases with similar symptoms, which could hinder generalization. Future work should focus on improving model interpretability and scalability, ultimately contributing to better agricultural practices through automated disease detection systems.
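One simple way to combine features from several CNN backbones is to concatenate their pooled feature vectors. The sketch below assumes that fusion strategy, which the summary does not confirm; the paper may combine the networks differently. The vector lengths are the channel counts of the final convolutional feature maps of the standard architectures after global average pooling, and `fuse_features` is a hypothetical helper.

```python
from typing import Dict, List

# Channel counts of the final convolutional feature maps of the standard
# architectures (the length of each vector after global average pooling).
BACKBONE_DIMS: Dict[str, int] = {
    "VGG16": 512,
    "Inception-V3": 2048,
    "DenseNet201": 1920,
}


def fuse_features(per_backbone: Dict[str, List[float]]) -> List[float]:
    """Concatenate pooled feature vectors in a fixed backbone order.

    Concatenation is an assumption of this sketch, not a confirmed detail
    of the paper's ensemble.
    """
    fused: List[float] = []
    for name, dim in BACKBONE_DIMS.items():
        vector = per_backbone[name]
        if len(vector) != dim:
            raise ValueError(f"{name}: expected {dim} features, got {len(vector)}")
        fused.extend(vector)
    return fused


# Dummy all-zero features standing in for real backbone outputs.
dummy = {name: [0.0] * dim for name, dim in BACKBONE_DIMS.items()}
fused = fuse_features(dummy)
print(len(fused))  # 4480 = 512 + 2048 + 1920
```

An alternative ensemble design averages the class probabilities of the individual networks instead of fusing their features; which of the two the authors use is not stated in this summary.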