Predicting Potato Crop Yield with Machine Learning and Deep Learning for Sustainable Agriculture

Journal: Potato Research
DOI: https://doi.org/10.1007/s11540-024-09753-w
Publication Date: 2024-07-13
Author(s): El‐Sayed M. El‐kenawy et al.
Primary Topic: Smart Agriculture and AI

Overview

The paper highlights the importance of accurate potato yield forecasting for improving agricultural practices and ensuring food security. It evaluates several machine learning (ML) and deep learning (DL) models, including K-nearest neighbors (KNN), gradient boosting, XGBoost, multilayer perceptron, graph neural networks (GNNs), gated recurrent units (GRUs), and long short-term memory networks (LSTMs). The study finds that gradient boosting and XGBoost demonstrate good predictive capabilities, while GNNs and LSTMs excel at capturing complex spatial and temporal patterns. GNNs achieve the lowest mean squared error (MSE), 0.02363, with a coefficient of determination ($R^2$) of 0.51719.

In the conclusion, the authors emphasize the potential of ML and DL models to improve yield predictions and support sustainable agricultural decision-making. They suggest several avenues for future research, including model optimization, integration of diverse data sources, enhancing explainability, real-time prediction capabilities, and ensuring scalability across different regions. The paper advocates for collaborative efforts among data scientists, agricultural experts, and policymakers to leverage these advanced technologies in addressing global hunger and advancing modern agriculture.

Introduction

The introduction emphasizes the critical role of agricultural productivity in ensuring global food security and economic development, particularly as the world population is projected to reach approximately 9.7 billion by 2050. The paper highlights the urgent need to optimize crop yields to meet rising food demand while conserving natural resources. Traditional forecasting methods, which rely heavily on historical data and expert opinion, are increasingly inadequate because of the complex interactions among modern agricultural factors such as climate change, soil fertility depletion, and evolving pest dynamics. The paper underscores the significance of potatoes as a staple crop, given their nutritional value and adaptability to various climates and soils, and notes their vulnerability to diseases and environmental stresses.

The paper advocates integrating advanced machine learning (ML) and deep learning (DL) techniques to improve the accuracy of yield predictions and agricultural practices. By drawing on diverse data sources (including satellite imagery, soil sensors, and weather records), these technologies can identify intricate patterns and relationships that traditional models overlook. The potential benefits include increased productivity, resource efficiency, and resilience in potato farming. However, challenges such as data quality, model complexity, and biases must be addressed to ensure equitable and reliable applications of ML and DL in agriculture. The study aims to evaluate various predictive models, enhance resource management, and explore the broader implications of these technologies for sustainable agricultural practices and policy development.

Methods

In the Methods section, the paper outlines the data sources and pre-processing techniques employed for crop yield predictions, alongside the configurations of the machine learning and deep learning models utilized in the study. The evaluation metrics applied to assess the performance of these models include mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), mean bias error (MBE), Pearson’s correlation coefficient ($R$), coefficient of determination ($R^2$), relative root mean squared error (RRMSE), Nash-Sutcliffe efficiency (NSE), and Willmott index (WI).

Together, these metrics cover complementary aspects of performance: error magnitude (MSE, RMSE, MAE, RRMSE), systematic over- or under-prediction (MBE), and agreement between predicted and observed yields ($R$, $R^2$, NSE, WI). They provide a common basis for comparing the accuracy and reliability of the forecasting models and for judging how well the applied methods suit agricultural forecasting.
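The metrics named above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the function name `regression_metrics` is hypothetical, and several definitions are assumptions because the summary does not give the formulas. In particular, errors are taken as predicted minus observed, $R^2$ is taken as $1 - SS_{res}/SS_{tot}$ (which makes it numerically identical to NSE), and RRMSE is taken as RMSE divided by the mean observed value.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return a dict of regression metrics for observed vs. predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    mean_obs = sum(y_true) / n
    mean_pred = sum(y_pred) / n
    # Sign convention (assumption): error = predicted - observed.
    errors = [p - t for t, p in zip(y_true, y_pred)]

    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((t - mean_obs) ** 2 for t in y_true)     # spread of observations
    ss_pred = sum((p - mean_pred) ** 2 for p in y_pred)   # spread of predictions
    if ss_tot == 0 or ss_pred == 0 or mean_obs == 0:
        raise ValueError("need varying inputs and a non-zero observed mean")

    cov = sum((t - mean_obs) * (p - mean_pred) for t, p in zip(y_true, y_pred))
    # Willmott index of agreement: 1 - SS_res / sum((|p - mean_obs| + |o - mean_obs|)^2)
    wi_den = sum(
        (abs(p - mean_obs) + abs(t - mean_obs)) ** 2 for t, p in zip(y_true, y_pred)
    )

    mse = ss_res / n
    rmse = math.sqrt(mse)
    return {
        "MSE": mse,
        "RMSE": rmse,
        "MAE": sum(abs(e) for e in errors) / n,
        "MBE": sum(errors) / n,                # > 0 means over-prediction on average
        "R": cov / math.sqrt(ss_tot * ss_pred),  # Pearson correlation
        "R2": 1.0 - ss_res / ss_tot,           # assumption: 1 - SS_res/SS_tot
        "RRMSE": rmse / mean_obs,              # assumption: normalised by observed mean
        "NSE": 1.0 - ss_res / ss_tot,          # Nash-Sutcliffe efficiency
        "WI": 1.0 - ss_res / wi_den,
    }
```

Calling `regression_metrics(observed, predicted)` on two equal-length lists of yields returns all nine values at once, which makes it straightforward to tabulate several models side by side.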

Results

The Results section evaluates the machine learning and deep learning techniques for predicting crop yield. Table 2 summarizes the performance of the machine learning models, including K-nearest neighbors (KNN), gradient boosting, XGBoost, and multilayer perceptron, using indicators such as mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), and Pearson’s correlation coefficient ($R$). KNN achieved the lowest MSE among these models, 0.03437, suggesting better predictive accuracy than the other models in that comparison. Figures 9 and 10 further illustrate the distribution of prediction errors and the variability among models, providing insight into their reliability and accuracy.
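To make the KNN result more concrete, the idea behind KNN regression can be sketched as follows. This is only an illustration under stated assumptions: the summary does not report the paper's distance metric, value of k, or input features, so the sketch assumes Euclidean distance and an unweighted average of neighbors, and the function name, feature names, and data values are hypothetical.

```python
import math


def knn_predict(train_X, train_y, query, k=3):
    """Predict a yield as the mean target of the k nearest training points."""
    if not 1 <= k <= len(train_X):
        raise ValueError("k must be between 1 and the number of training points")
    # Pair each training target with its Euclidean distance to the query point.
    by_distance = sorted(
        (math.dist(x, query), y) for x, y in zip(train_X, train_y)
    )
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / k


# Hypothetical, already-scaled features: (rainfall, temperature).
fields = [(0.0, 0.0), (0.1, 0.1), (0.9, 0.9), (1.0, 1.0)]
yields = [0.2, 0.3, 0.7, 0.8]  # hypothetical normalised yields
estimate = knn_predict(fields, yields, query=(0.05, 0.05), k=2)
```

Because KNN stores the training data and defers all computation to prediction time, its accuracy depends heavily on feature scaling and on the choice of k, both of which would need tuning on real data.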

In the deep learning segment, Table 3 compares graph neural networks (GNNs) with gated recurrent units (GRUs) and long short-term memory networks (LSTMs) using similar performance metrics. The MSE results, depicted in Figure 12, highlight the precision of each model, with lower MSE values indicating better performance. The use of violin plots in Figure 13 allows for a detailed visualization of prediction error distributions, aiding in the assessment of model stability and potential biases. Additionally, residual plots in Figure 14 facilitate the examination of prediction accuracy, revealing any systematic deviations that may necessitate further model optimization. Overall, the results underscore the effectiveness of machine learning and deep learning approaches in enhancing crop yield predictions, with implications for agricultural planning and resource management.
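The violin and residual plots described above visualize the distribution of prediction errors. A rough numerical counterpart can be sketched with Python's standard library, assuming errors are defined as predicted minus observed; the function name `error_summary` and the choice of summary statistics are illustrative and not taken from the paper.

```python
import statistics


def error_summary(y_true, y_pred):
    """Summarise the distribution of prediction errors (predicted - observed)."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    errors = [p - t for t, p in zip(y_true, y_pred)]
    q1, median, q3 = statistics.quantiles(errors, n=4)  # quartile cut points
    return {
        "mean": statistics.fmean(errors),    # far from 0 suggests systematic bias
        "median": median,
        "iqr": q3 - q1,                      # spread of the central half of errors
        "stdev": statistics.stdev(errors),   # overall variability (sample stdev)
    }
```

A mean or median far from zero points to the kind of systematic deviation a residual plot would reveal, while a large interquartile range or standard deviation corresponds to a wide violin, that is, a less stable model.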

Discussion

The discussion section of the research paper emphasizes the significance of accurate crop yield prediction, particularly for potatoes, in enhancing agricultural planning, resource management, and food security. Various methodologies, including Artificial Neural Networks (ANNs) such as Radial Basis Function Neural Networks (RBFNN) and General Regression Neural Networks (GRNN), have demonstrated effectiveness in forecasting potato yields based on critical indicators like leaf area index and biomass. The GRNN, noted for its rapid learning capabilities, outperforms RBFNN in yield predictions. Furthermore, the integration of machine learning techniques, including supervised learning algorithms for soil water potential forecasting, has shown promise in improving agricultural decision-making.
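The GRNN's rapid learning follows from its structure: in its standard formulation it is a Gaussian-kernel weighted average of the stored training targets, so "training" amounts to storing the samples. A minimal sketch of that standard formulation is given below. It is not the implementation from the studies discussed; the function name, the smoothing parameter value, and the example features are hypothetical.

```python
import math


def grnn_predict(train_X, train_y, query, sigma=0.5):
    """General regression neural network output for one query point.

    Each stored sample contributes its target, weighted by a Gaussian
    kernel of its Euclidean distance to the query.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    weights = [
        math.exp(-math.dist(x, query) ** 2 / (2 * sigma ** 2)) for x in train_X
    ]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase sigma")
    return sum(w * y for w, y in zip(weights, train_y)) / total


# Hypothetical scaled inputs: (leaf area index, biomass) -> normalised yield.
samples = [(0.2, 0.3), (0.5, 0.6), (0.8, 0.9)]
targets = [0.25, 0.55, 0.85]
estimate = grnn_predict(samples, targets, query=(0.5, 0.6), sigma=0.2)
```

The only free parameter is the kernel width `sigma`: small values make the prediction follow the nearest stored sample, while large values pull it toward the mean of all targets.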

The paper also highlights the challenges of transferring machine learning findings across different crops and regions, advocating a standardized approach to large-scale crop yield prediction that combines agronomic principles with machine learning. Case studies show that regional predictions achieve lower normalized root mean square error (NRMSE) than national models, underscoring the importance of localized forecasting for effective policy-making. The research indicates that machine learning, alongside traditional agronomic knowledge, can improve the accuracy of yield predictions and contribute to sustainable agricultural practices, ultimately helping to address global food security challenges.