Integrating random forest-based regression kriging for analyzing spatial variability of rainfall in arid and semi-arid regions

Journal: Scientific Reports, Volume: 16, Issue: 1
DOI: https://doi.org/10.1038/s41598-026-36074-4
PMID: https://pubmed.ncbi.nlm.nih.gov/41545471
Publication Date: 2026-01-16
Author(s): Marwa Manaf et al.
Primary Topic: Precipitation Measurement and Analysis

Overview

This study examines the spatial variability of precipitation, which is central to water resource management and climate adaptation, particularly in arid and semi-arid regions. Traditional geostatistical methods such as ordinary kriging often fail to model the nonlinear relationships between rainfall and spatial coordinates adequately. To address this limitation, the study compares machine learning-assisted regression kriging (ML-RK) methods that use latitude and longitude as the only predictors; it does not attempt to build a comprehensive rainfall prediction model. Six regression models are evaluated in combination with regression kriging: Random Forest (RF), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Neural Network (NN), Elastic Net (EN), and Polynomial Regression (PR).
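In its standard textbook form, regression kriging predicts rainfall at an unsampled location as the sum of a regression trend and a kriged residual. The summary does not give the paper's own notation, so the symbols below are assumptions: $\hat{m}$ is the ML regression on latitude and longitude, $e(s_i)$ is the residual at station $s_i$, and $\lambda_i$ are kriging weights derived from the fitted residual variogram (shown here with the sum-to-one constraint of ordinary kriging).

```latex
\hat{z}(s_0) = \hat{m}(s_0) + \sum_{i=1}^{n} \lambda_i \, e(s_i),
\qquad e(s_i) = z(s_i) - \hat{m}(s_i),
\qquad \sum_{i=1}^{n} \lambda_i = 1
```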

Using monthly and decadal precipitation data from 42 meteorological stations in Pakistan (2001-2021), the study identifies the best-fitting spatial structure among four theoretical variogram models (exponential, circular, spherical, and linear), evaluated with leave-one-out cross-validation (LOOCV). Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) indicate that the RF-RK combination consistently outperforms the other ML-RK combinations. The findings suggest that combining ensemble learning with geostatistical interpolation captures both nonlinear relationships and spatial dependence, producing high-resolution rainfall maps that can support climate adaptation planning, irrigation scheduling, and sustainable water resource management in data-scarce regions such as Pakistan.
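The four theoretical variogram families named above have common textbook forms, sketched here. The parameter names (`nugget`, `psill`, `rng`, `slope`) and the exponential convention (practical range of roughly three times `rng`) are assumptions for illustration; the paper's fitted parameters are not given in this summary.

```python
import math


def spherical(h, nugget, psill, rng):
    """Spherical model: rises to the sill (nugget + psill) exactly at h = rng."""
    if h == 0:
        return 0.0
    if h >= rng:
        return nugget + psill
    r = h / rng
    return nugget + psill * (1.5 * r - 0.5 * r ** 3)


def exponential(h, nugget, psill, rng):
    """Exponential model: approaches the sill asymptotically (assumed exp(-h/rng) form)."""
    if h == 0:
        return 0.0
    return nugget + psill * (1.0 - math.exp(-h / rng))


def circular(h, nugget, psill, rng):
    """Circular model: bounded, reaches the sill at h = rng."""
    if h == 0:
        return 0.0
    if h >= rng:
        return nugget + psill
    r = h / rng
    shape = 1.0 - (2.0 / math.pi) * math.acos(r) + (2.0 * r / math.pi) * math.sqrt(1.0 - r * r)
    return nugget + psill * shape


def linear(h, nugget, slope):
    """Linear model: unbounded, no sill."""
    if h == 0:
        return 0.0
    return nugget + slope * h
```

The bounded models (spherical, circular) imply that stations farther apart than `rng` are spatially uncorrelated, whereas the linear model implies dependence that never levels off within the study area.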

Introduction

The introduction addresses global warming and climate change and their profound impacts on physical, biological, and socioeconomic systems. Rising greenhouse gas emissions have increased global temperatures and disrupted hydrological cycles, making extreme weather events such as floods and droughts more frequent and threatening agriculture, water resources, and livelihoods. Because precipitation regulates the water balance and supports ecosystems, understanding its spatial and temporal variability is essential for climate risk management and adaptation planning.

The authors stress the importance of accurate rainfall data for identifying vulnerable regions and assessing water resources. They note that earlier research has analyzed rainfall distribution with a range of statistical and geostatistical methods, including Inverse Distance Weighting (IDW), Ordinary Kriging (OK), and Geographically Weighted Regression (GWR). These traditional methods, however, often fail to capture complex spatial relationships. To overcome this, the paper advocates combining advanced geostatistical methods with machine learning (ML) techniques, which can model nonlinear spatial relationships more effectively. Although ML techniques are widely used in environmental studies, this study focuses specifically on their application to spatial rainfall interpolation, with the aim of better representing spatial gradients, particularly where data are scarce.

Methods

The “Materials and Methods” section sets out the study design. As summarized in the overview, the data are monthly and decadal precipitation records from 42 meteorological stations in Pakistan covering 2001-2021. Six regression models (RF, SVM, KNN, NN, EN, and PR) are each fitted with latitude and longitude as predictors and combined with regression kriging.

The spatial structure is assessed with four theoretical variogram models (exponential, circular, spherical, and linear). Each combination is evaluated by leave-one-out cross-validation and compared using RMSE and MAE.
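The evaluation scheme named in the overview, leave-one-out cross-validation scored by RMSE and MAE, can be sketched as follows. The station coordinates, rainfall values, and the two baseline predictors are hypothetical placeholders, not the study's data or models; any ML-RK predictor with the same signature could be passed in as `predict`.

```python
import math


def loocv_errors(coords, values, predict):
    """Hold out each station in turn, predict it from the rest, return the errors."""
    errors = []
    for i in range(len(values)):
        train_xy = coords[:i] + coords[i + 1:]
        train_z = values[:i] + values[i + 1:]
        errors.append(predict(train_xy, train_z, coords[i]) - values[i])
    return errors


def rmse(errors):
    """Root mean square error: penalizes large misses more heavily."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mae(errors):
    """Mean absolute error: average size of a miss."""
    return sum(abs(e) for e in errors) / len(errors)


def mean_predictor(train_xy, train_z, target):
    """Hypothetical baseline: ignore location, predict the training mean."""
    return sum(train_z) / len(train_z)


def nearest_neighbour(train_xy, train_z, target):
    """Hypothetical baseline: copy the value of the closest station."""
    j = min(range(len(train_z)), key=lambda k: math.dist(train_xy[k], target))
    return train_z[j]


# Hypothetical stations on a line with steadily increasing rainfall.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [10.0, 12.0, 14.0, 16.0]

for name, model in [("mean", mean_predictor), ("nearest", nearest_neighbour)]:
    errs = loocv_errors(coords, values, model)
    print(name, round(rmse(errs), 3), round(mae(errs), 3))
```

With only a few dozen stations, leave-one-out makes use of every observation for both fitting and testing, which is presumably why it suits a sparse network like the one studied.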

Results

Table 1 reports monthly precipitation statistics from meteorological stations across Pakistan for two periods, 2001-2011 and 2011-2021. In the second period (2011-2021), semivariance increases consistently with lag distance. This trend points to substantial spatial autocorrelation in the precipitation data: nearby stations record more similar rainfall than distant ones, so rainfall is not randomly distributed in space but shows spatial dependence.
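The quantity discussed here, semivariance as a function of lag distance, comes from the empirical semivariogram, which averages half the squared difference between station pairs within each distance bin. A minimal sketch, assuming planar (projected) coordinates rather than raw latitude and longitude, with hypothetical values:

```python
import math


def empirical_semivariogram(coords, values, lag_width, n_lags):
    """Return (bin centre, semivariance, pair count) for each non-empty lag bin."""
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            k = int(d // lag_width)  # index of the distance bin
            if k < n_lags:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    result = []
    for k in range(n_lags):
        if counts[k]:
            # gamma(h) = sum of squared differences / (2 * number of pairs)
            result.append(((k + 0.5) * lag_width, sums[k] / (2 * counts[k]), counts[k]))
    return result


# Hypothetical transect: rainfall rises steadily from west to east.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [0.0, 1.0, 2.0, 3.0]
print(empirical_semivariogram(coords, values, lag_width=1.5, n_lags=3))
# → [(0.75, 0.5, 3), (2.25, 2.0, 2), (3.75, 4.5, 1)]
```

Semivariance that keeps rising with lag, as in this toy output, is the pattern the summary describes; purely random data would instead give roughly the same semivariance in every bin.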

Discussion

The authors investigated how well various machine learning (ML) algorithms, combined with regression kriging (RK), map rainfall spatially in Pakistan, a region with limited data availability. They systematically compared six ML algorithms integrated with RK, namely Polynomial Regression (PR), Support Vector Machine (SVM), Random Forest (RF), Neural Networks (NN), K-Nearest Neighbors (KNN), and Elastic Net (EN), using monthly mean precipitation data from 42 meteorological stations over two decades (2001-2021). The RF-RK model consistently outperformed the other combinations, showing higher predictive accuracy and stability across both decades. It captured both nonlinear relationships and spatial autocorrelation, making it a robust tool for rainfall estimation in data-scarce environments.
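A minimal end-to-end sketch of the hybrid idea: fit a trend on coordinates, krige the residuals, and add the two. To stay dependency-free it uses a k-nearest-neighbour mean as the trend (KNN is one of the six models compared; the best-performing Random Forest is not implemented here) and an exponential variogram whose `psill` and `rng` values are assumed rather than fitted. All function names and station data are hypothetical.

```python
import math


def knn_trend(coords, values, target, k=3):
    """Trend estimate: mean of the k nearest stations (stand-in for the ML model)."""
    order = sorted(range(len(values)), key=lambda i: math.dist(coords[i], target))
    nearest = order[:k]
    return sum(values[i] for i in nearest) / len(nearest)


def exp_variogram(h, psill=1.0, rng=2.0):
    """Exponential variogram with zero nugget; parameters are assumed, not fitted."""
    return psill * (1.0 - math.exp(-h / rng))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ok_weights(coords, target):
    """Ordinary kriging weights: variogram matrix bordered by the sum-to-one constraint."""
    n = len(coords)
    a = [[exp_variogram(math.dist(coords[i], coords[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [exp_variogram(math.dist(c, target)) for c in coords] + [1.0]
    return solve(a, b)[:n]  # drop the Lagrange multiplier


def regression_kriging(coords, values, target):
    """Prediction = trend at the target + kriged trend residuals."""
    residuals = [values[i] - knn_trend(coords, values, coords[i])
                 for i in range(len(values))]
    weights = ok_weights(coords, target)
    kriged = sum(w * e for w, e in zip(weights, residuals))
    return knn_trend(coords, values, target) + kriged


# Hypothetical stations (planar coordinates) and rainfall values.
stations = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0), (1.0, 1.0)]
rain = [5.0, 8.0, 6.0, 12.0, 7.0]
print(regression_kriging(stations, rain, (1.5, 0.5)))
```

With a zero nugget, ordinary kriging is an exact interpolator, so the sketch reproduces the observed value at each station; the informative predictions are those between stations.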

The findings show distinct seasonal and decadal variations in rainfall patterns, with greater spatial heterogeneity in the more recent decade (2011-2021). The study emphasizes the value of combining traditional geostatistical methods with modern ML techniques to improve rainfall prediction, particularly in regions with sparse observation networks. Although the RF-RK model performed well, the authors note limitations, including the reliance on geographic coordinates as the sole predictors and the need for additional physical covariates in future research. Overall, the work offers useful insight into applying hybrid ML-RK frameworks to spatial rainfall mapping and underscores the need for modeling approaches that adapt to a changing climate.
