Enhanced slope stability prediction using ensemble machine learning techniques

Journal: Scientific Reports, Volume: 15, Issue: 1
DOI: https://doi.org/10.1038/s41598-025-90539-6
PMID: https://pubmed.ncbi.nlm.nih.gov/40025161
Publication Date: 2025-03-01
Author(s): Devendra Kumar Yadav et al.
Primary Topic: Landslides and related hazards

Overview

This research investigates the application of machine learning (ML) models for predicting slope stability, addressing the challenges inherent in real-world environments. The study introduces an ensemble approach utilizing bagging and boosting techniques with selected base classifiers to enhance prediction accuracy. By analyzing 125 data points and seven quantitative parameters, the study achieves over 90% accuracy in classification tasks, particularly with the Decision Tree (DT) and Random Forest (RF) classifiers. Notably, the ensemble models demonstrate an 8-10% improvement in accuracy compared to baseline classifiers, even after applying dimensionality reduction techniques.
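
As a rough illustration of the bagging idea described above, the following minimal sketch trains one-level decision trees (stumps) on bootstrap resamples and combines them by majority vote. It is not the authors' implementation: the stump learner, the function names (`fit_stump`, `fit_bagging`, `predict_bagging`), the number of estimators, and the toy data are all assumptions made for illustration.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Find the single-feature threshold split with the fewest training errors."""
    best = None  # (errors, feature index, threshold, left label, right label)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            left_label = Counter(left).most_common(1)[0][0]
            # If nothing falls to the right of t, reuse the left label.
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = (sum(lab != left_label for lab in left)
                      + sum(lab != right_label for lab in right))
            if best is None or errors < best[0]:
                best = (errors, j, t, left_label, right_label)
    return best[1:]


def predict_stump(stump, row):
    j, t, left_label, right_label = stump
    return left_label if row[j] <= t else right_label


def fit_bagging(X, y, n_estimators=25, seed=0):
    """Train one stump per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps


def predict_bagging(stumps, row):
    """Majority vote over the individual stump predictions."""
    votes = Counter(predict_stump(s, row) for s in stumps)
    return votes.most_common(1)[0][0]


# Made-up toy data: [slope height, slope angle] -> 1 (stable) or 0 (failed).
X_toy = [[h, 30 + (h % 3)] for h in range(1, 11)]
y_toy = [1 if h <= 5 else 0 for h in range(1, 11)]
ensemble = fit_bagging(X_toy, y_toy)
print(predict_bagging(ensemble, [2, 31]), predict_bagging(ensemble, [9, 31]))
```

Each stump sees a slightly different resample, so individual split thresholds vary. Averaging their votes is what reduces the variance of the final prediction.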

In terms of regression analysis, the study highlights the effectiveness of ensemble bagging regression, which improves the average $R^2$ value by 8-10% over traditional regression models. The Lasso-Lars CV and bagging regression models yield the highest $R^2$ values (0.84), indicating robust predictive capabilities for slope stability. The findings suggest that the combination of RF and bagging with DT as a base classifier provides a reliable framework for predicting factors of safety (FOS) in slope stability assessments. Overall, the research underscores the superiority of ensemble techniques in both classification and regression tasks, offering valuable insights for advancing geotechnical engineering practices. Future work may focus on enhancing model performance with larger datasets.
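
For reference, the coefficient of determination quoted above is conventionally defined as $R^2 = 1 - SS_{res}/SS_{tot}$. The helper below is a minimal sketch of that standard definition. The function name `r_squared` and the sample FOS values are illustrative and do not come from the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    return 1.0 - ss_res / ss_tot


# Made-up factor-of-safety values, for illustration only.
observed = [0.9, 1.1, 1.3, 1.6]
predicted = [1.0, 1.0, 1.4, 1.5]
print(round(r_squared(observed, predicted), 3))
```

A value of 1 means the predictions match the observations exactly. A value of 0 means the model does no better than always predicting the mean.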

Methods

In this section, the authors detail the regression methods used to analyze the relationship between dependent and independent variables, primarily linear regression techniques. The simple linear regression model is represented as $ y = \beta_0 + \beta_1 x + \epsilon $, while the multiple linear regression model is expressed as $ y = X\beta + \epsilon $, where $ X $ is the matrix of independent variables, $ \beta $ is the coefficient vector, and $ \epsilon $ is the error vector. To improve model performance, the regularization techniques Ridge Regression and Lasso are introduced, whose cost functions minimize the residuals while penalizing the magnitude of the coefficients. ElasticNet combines the L1 and L2 penalties, providing both stability and variable selection.
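
The summary does not reproduce the cost functions themselves. In their common textbook form, and using the same $ y $, $ X $, and $ \beta $ as above, they can be written as follows. The penalty weights $ \lambda $, $ \lambda_1 $, and $ \lambda_2 $ are notation assumed here, and the paper may scale or parameterize the terms differently.

$$
\begin{aligned}
J_{\text{Ridge}}(\beta) &= \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2 \\
J_{\text{Lasso}}(\beta) &= \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1 \\
J_{\text{ElasticNet}}(\beta) &= \lVert y - X\beta \rVert_2^2 + \lambda_1 \lVert \beta \rVert_1 + \lambda_2 \lVert \beta \rVert_2^2
\end{aligned}
$$

The L1 term can drive individual coefficients exactly to zero, which is the source of the variable selection. The L2 term shrinks all coefficients smoothly, which stabilizes the fit when predictors are correlated.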

The experimental results compare several machine learning models: the traditional classifiers K-NN, SVM, decision trees (DT), and random forests (RF), together with bagging and boosting ensembles. RF achieved the highest accuracy at 92%, followed closely by DT and the bagging methods. The authors used 10-fold cross-validation to obtain robust performance estimates, and RF and bagging with DT consistently outperformed the other models across multiple evaluation criteria, including precision, recall, and AUC. Dimensionality reduction with Kernel Principal Component Analysis (KPCA) lowered model accuracy, although RF and the boosting methods still performed effectively. Overall, the study concludes that bagging ensembles and the Lasso-Lars CV regressor (LassoLarsCV) significantly improve slope stability predictions, outperforming traditional regression techniques.
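
The 10-fold evaluation protocol can be sketched in plain Python as below. This is an illustrative outline, not the authors' code. The helper names (`k_fold_indices`, `cross_validate`, `precision_recall`), the shuffling seed, and the use of unstratified folds are assumptions, since the summary does not say how the folds were drawn.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and split them into k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # the first `extra` folds get one more
        folds.append(indices[start:start + size])
        start += size
    return folds


def cross_validate(X, y, fit, predict, k=10, seed=0):
    """Mean held-out accuracy: train on k-1 folds, score on the remaining one."""
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        hits = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the class labelled `positive`."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# With 125 samples, ten folds cannot all be the same size.
print([len(fold) for fold in k_fold_indices(125)])
# → [13, 13, 13, 13, 13, 12, 12, 12, 12, 12]
```

Any classifier can be plugged in through the `fit` and `predict` callables, so the same loop can score a single tree and a bagged ensemble on identical folds.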

Discussion

The research paper presents a comprehensive analysis of slope stability prediction through both classification and regression methodologies, focusing on the Factor of Safety (FOS). The study begins with the collection of data encompassing seven influential factors, which are then preprocessed using standardization and normalization techniques. For classification, traditional machine learning algorithms such as K-Nearest Neighbors (K-NN), Decision Trees (DT), Random Forests (RF), and Support Vector Machines (SVM) are employed, followed by ensemble methods like bagging and boosting. The results indicate that ensemble classifiers, particularly those utilizing bagging with DT and RF, achieve over 90% accuracy, significantly outperforming existing methods. The analysis also reveals that slope height is the most critical factor affecting stability.
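
The summary does not say which scaling formulas were applied during preprocessing. The sketch below assumes the two most common choices, z-score standardization and min-max normalization, applied to one feature column at a time. The function names and the sample slope heights are made up for illustration.

```python
from statistics import mean, pstdev


def standardize(column):
    """Z-score scaling: subtract the mean, divide by the population std. dev."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]  # constant feature carries no information
    return [(v - mu) / sigma for v in column]


def min_max_normalize(column):
    """Rescale values linearly onto the interval [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


# Made-up slope heights in metres, for illustration only.
heights = [10.0, 20.0, 30.0]
print(min_max_normalize(heights))  # → [0.0, 0.5, 1.0]
print(standardize(heights))
```

Scaling matters most for distance- and margin-based models such as K-NN and SVM. Tree-based models such as DT and RF are largely insensitive to it.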

In the regression analysis, various standard and ensemble models, including Linear Regression, Support Vector Regression (SVR), and Lasso, are fitted to the FOS values. The findings suggest that bagging and Lasso-Lars CV models excel in predicting FOS compared to other regression techniques. The study emphasizes the robustness of RF and bagging with DT as ensemble models for slope stability assessment. Furthermore, the paper discusses the implications of dimension reduction techniques, finding them unnecessary for maintaining high model accuracy. Overall, this research contributes valuable insights into the predictive modeling of slope stability, highlighting the importance of specific parameters and the effectiveness of advanced machine learning approaches in geotechnical engineering.