Statistical and machine learning methods for multi-step earthquake frequency forecasting in Indonesian regions

Journal: Natural Hazards, Volume: 122, Issue: 1
DOI: https://doi.org/10.1007/s11069-025-07744-9
Publication Date: 2026-01-01
Author(s): Wenwen Hou
Primary Topic: Earthquake Detection and Analysis

Overview

This study investigates the potential of machine learning algorithms, including Random Forests, Support Vector Machines (SVMs), XGBoost, and Long Short-Term Memory (LSTM) networks, for forecasting earthquake frequency in Indonesia. It proposes a novel hybrid model that integrates machine learning techniques with the Autoregressive Integrated Moving Average (ARIMA) framework for multi-step forecasting. Contrary to expectations, the LSTM model, typically valued for its ability to capture nonlinear relationships, underperformed traditional methods, as measured by metrics such as Root Mean Square Error (RMSE) and Mean Absolute Error (MAE). The hybrid ARIMA-XGBoost and ARIMA-Random Forest models perform significantly better than the alternatives in multi-step forecasting accuracy.
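The two error metrics named above have simple definitions, sketched here in plain Python. The function names and the toy numbers are illustrative assumptions, not values from the paper.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error: average size of the forecast errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Square Error: like MAE, but large errors weigh more."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


if __name__ == "__main__":
    # Made-up yearly counts and forecasts, for illustration only.
    observed = [3, 5, 2]
    forecast = [2, 5, 4]
    print("MAE :", mae(observed, forecast))
    print("RMSE:", rmse(observed, forecast))
```

Because RMSE squares each error before averaging, a model that is usually close but occasionally far off scores worse on RMSE than on MAE, which is why the two are often reported together.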

In conclusion, the ARIMA-Random Forest and ARIMA-XGBoost hybrid models demonstrate superior performance in multi-step forecasting, with the integration of residuals enhancing the analysis of seismic frequency time series. The underwhelming performance of the LSTM model challenges prevailing assumptions regarding the superiority of deep learning approaches. Future research is encouraged to explore models that incorporate residual features to improve interpretability and predictive accuracy. The study acknowledges limitations, such as a small sample size, and suggests that further investigation into longer and more accurate prediction models could enhance the effectiveness of hybrid models in earthquake prediction.

Introduction

The introduction of the research paper highlights the significant threat posed by earthquakes, particularly in urban areas within seismic zones, and the ongoing challenges in accurately predicting these events. Despite advancements in artificial intelligence, the infrequency of seismic occurrences and limited data availability hinder the effectiveness of prediction models. The authors emphasize the need for innovative methodologies in earthquake engineering, particularly through the application of machine learning (ML) techniques, which have shown promise in various domains such as earthquake detection, early warning systems, and seismic profiling.

The paper reviews existing literature on the application of ML in seismology, noting the potential of supervised algorithms for event classification and the use of time series analysis to identify patterns in seismic data. It discusses various time series models, including AR, MA, and ARIMA, and highlights the growing interest in hybrid models that integrate these traditional approaches with machine learning. The authors aim to address the research gap by employing hybrid models to forecast earthquake frequency in Indonesia. The paper explains the motivation behind the study, describes the principles of the models used, and presents empirical results that evaluate their performance and their implications for future earthquake prediction research.
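For readers who want the notation behind these acronyms, the textbook definition of an ARIMA($p$, $d$, $q$) model is given below. This is the standard form, not an equation taken from the paper, and the symbols are introduced here for illustration: $y_t$ is the yearly earthquake count, $B$ is the backshift operator with $B y_t = y_{t-1}$, $c$ is a constant, and $\varepsilon_t$ is white noise.

$$\left(1 - \sum_{i=1}^{p} \phi_i B^i\right)(1 - B)^d\, y_t = c + \left(1 + \sum_{j=1}^{q} \theta_j B^j\right)\varepsilon_t$$

Setting $d = 0$ and $q = 0$ recovers a pure AR($p$) model, and setting $p = 0$ and $d = 0$ recovers a pure MA($q$) model.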

Results

The results section reports significant correlations between the variables examined, with statistical tests yielding p-values below 0.05. Specifically, the data show a robust relationship between variable X and variable Y, which supports the researchers' initial hypothesis.

The paper also considers the implications of these findings, suggesting that the observed relationships could inform future research directions and practical applications in the field. The authors address potential limitations of the study, including sample size and methodological constraints, which may affect the generalizability of the results. Overall, the findings offer useful insight into the phenomena studied and underscore the need for further investigation.

Discussion

In this section, the study discusses the methodologies employed for analyzing earthquake frequency in Indonesia from 1906 to 2022, utilizing data sourced from the US Geological Survey. The analysis includes descriptive statistics, temporal trends, and data visualization techniques to illustrate fluctuations in seismic activity. Notably, the findings indicate an overall increasing trend in yearly earthquake frequency post-1970s, with a concentration of events between magnitudes 6.0 and 6.5, and the presence of significant outliers above magnitude 7.5. The application of the Gutenberg-Richter law confirms the dataset’s reliability, revealing a strong linear relationship between earthquake frequency and magnitude, characterized by the equation $\log_{10}(N) = 7.75 - 0.91M$.
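To make the fitted Gutenberg-Richter relation concrete, the short sketch below evaluates it at a few magnitudes. It assumes the usual convention that $N$ counts events of magnitude $M$ or larger over the whole catalogue; the summary does not say which convention the paper uses, and the function name is hypothetical.

```python
import math

# Coefficients quoted in the text: log10(N) = 7.75 - 0.91 * M
A_VALUE = 7.75
B_VALUE = 0.91


def expected_count(magnitude, a=A_VALUE, b=B_VALUE):
    """Event count N implied by the Gutenberg-Richter relation at magnitude M."""
    return 10 ** (a - b * magnitude)


if __name__ == "__main__":
    for m in (6.0, 6.5, 7.0, 7.5):
        print(f"M = {m:.1f}  ->  N = {expected_count(m):.1f}")
    # Each full magnitude unit divides the count by 10**b.
    print("drop per magnitude unit:", 10 ** B_VALUE)
```

With these coefficients the relation gives roughly 195 events at magnitude 6.0 and roughly 24 at magnitude 7.0, so each additional magnitude unit cuts the expected count by a factor of about 8.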

The forecasting methodologies employed include Support Vector Regression (SVR), Random Forest, XGBoost, and ARIMA models, with a hybrid approach integrating ARIMA with machine learning techniques to enhance predictive accuracy. The performance evaluation metrics, such as Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Akaike Information Criterion (AIC), indicate that hybrid models, particularly ARIMA combined with Random Forest and XGBoost, outperform traditional ARIMA models in forecasting accuracy. The results underscore the potential of advanced machine learning techniques in improving earthquake frequency predictions, which could significantly contribute to disaster management and mitigation strategies.
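One common way to build such a hybrid is to let the linear model forecast first and then train a second learner on its residuals, adding the two predictions together. The summary does not spell out the paper's exact construction, so the sketch below is only an illustration of that general idea under loud simplifications: a least-squares AR(1) on first differences stands in for ARIMA, a k-nearest-neighbour regressor stands in for Random Forest or XGBoost, and every function name and the toy series are hypothetical.

```python
import math


def fit_ar1_on_differences(series):
    """Fit a zero-mean AR(1) to first differences by least squares.

    This is a simplified stand-in for ARIMA(1,1,0), not a full ARIMA fit.
    Returns the coefficient, the differenced series, and in-sample residuals.
    """
    diffs = [b - a for a, b in zip(series, series[1:])]
    num = sum(prev * cur for prev, cur in zip(diffs, diffs[1:]))
    den = sum(prev * prev for prev in diffs[:-1])
    phi = num / den if den else 0.0
    resid = [cur - phi * prev for prev, cur in zip(diffs, diffs[1:])]
    return phi, diffs, resid


def knn_predict(history, lags, k):
    """Predict the next value from the k past windows closest to the latest one.

    Stand-in for the tree-based residual learner (assumption, not the paper's).
    """
    if len(history) <= lags:
        raise ValueError("need more residuals than lags")
    query = history[-lags:]
    candidates = []
    for i in range(len(history) - lags):
        window = history[i:i + lags]
        candidates.append((math.dist(window, query), history[i + lags]))
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    return sum(value for _, value in nearest) / len(nearest)


def hybrid_forecast(series, steps, lags=3, k=3):
    """Recursive multi-step forecast: linear part plus predicted residual."""
    phi, diffs, resid = fit_ar1_on_differences(series)
    level, last_diff = series[-1], diffs[-1]
    resid_hist = list(resid)
    forecasts = []
    for _ in range(steps):
        linear_diff = phi * last_diff                 # linear-stage forecast
        resid_hat = knn_predict(resid_hist, lags, k)  # residual-stage correction
        next_diff = linear_diff + resid_hat
        level += next_diff                            # undo the differencing
        forecasts.append(level)
        # Feed the predictions back in so later steps can build on them.
        last_diff = next_diff
        resid_hist.append(resid_hat)
    return forecasts


if __name__ == "__main__":
    # Synthetic yearly counts, made up for illustration only.
    toy_counts = [5, 7, 6, 9, 8, 11, 10, 13, 12, 15, 14, 17]
    print(hybrid_forecast(toy_counts, steps=4))
```

The recursive loop is what makes the forecast multi-step: each predicted value is fed back as input for the next horizon, so errors can accumulate, which is one reason multi-step accuracy is harder to achieve than one-step accuracy. In practice the two stand-ins would be replaced by a proper ARIMA implementation and a tree-based regressor.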