HESS Opinions: Never train a Long Short-Term Memory (LSTM) network on a single basin

Journal: Hydrology and Earth System Sciences, Volume: 28, Issue: 17
DOI: https://doi.org/10.5194/hess-28-4187-2024
Publication Date: 2024-09-12
Author(s): Frederik Kratzert et al.
Primary Topic: Hydrological Forecasting Using AI

Overview

In hydrology, machine learning, and Long Short-Term Memory (LSTM) networks in particular, has become increasingly important for rainfall-runoff modeling. A common problem in the literature, however, is that these models are trained on small, homogeneous datasets, often drawn from a single basin. This position paper argues that LSTM models perform better when trained on data from a diverse set of basins than when trained on such limited datasets.

The authors stress that many studies have applied large ML models, including LSTMs, to insufficient data, which leads to misleading conclusions about model improvements: it is unremarkable that modifying a poorly trained model yields better results. They instead call for physics to be integrated into well-trained ML models, although previous attempts to do so have not demonstrated improved performance. Researchers should use the extensive publicly available streamflow data to train models on hundreds of basins, even when only specific watersheds are of interest, so that rainfall-runoff modeling results are robust and reliable.

Discussion

The discussion contrasts the methodology that machine learning (ML) requires in hydrological modeling with that of traditional approaches. Conventional rainfall-runoff models are usually calibrated to local data, whereas ML models, and LSTM networks in particular, benefit substantially from training on data from many watersheds. Multi-basin training lets an ML model capture a broader range of hydrological responses, which improves its predictions, especially in ungauged basins and during extreme events. The authors argue for a paradigm shift toward a top-down approach in which models are first trained on extensive datasets and then fine-tuned for specific catchments.
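As a rough illustration of what training on data from many watersheds means in practice, the sketch below pools input windows from several basins into one training set and keeps a single-basin subset for later fine-tuning. It is a minimal sketch, not the authors' pipeline: the basin names, the choice of forcings and static attributes, the helper names `make_samples` and `pool_basins`, and all numbers are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: each basin holds daily forcings [precipitation, temperature],
# observed discharge, and static attributes [area, aridity]. All values are invented.
Basin = Dict[str, list]
Sample = Tuple[List[List[float]], float]


def make_samples(basin: Basin, seq_len: int) -> List[Sample]:
    """Turn one basin's time series into (input window, target discharge) pairs.

    Static attributes are appended to every time step so that a single model
    trained on the pooled data can distinguish between basins.
    """
    forcings, discharge, static = basin["forcings"], basin["discharge"], basin["static"]
    samples = []
    for end in range(seq_len, len(forcings) + 1):
        window = [step + static for step in forcings[end - seq_len:end]]
        samples.append((window, discharge[end - 1]))
    return samples


def pool_basins(basins: Dict[str, Basin], seq_len: int) -> List[Sample]:
    """Concatenate the samples of all basins into one training set."""
    pooled = []
    for basin in basins.values():
        pooled.extend(make_samples(basin, seq_len))
    return pooled


basins = {
    "basin_a": {
        "forcings": [[1.0, 10.0], [0.0, 12.0], [5.0, 9.0], [2.0, 8.0]],
        "discharge": [0.3, 0.2, 0.9, 0.6],
        "static": [120.0, 0.4],
    },
    "basin_b": {
        "forcings": [[0.0, 25.0], [0.5, 27.0], [0.0, 30.0], [3.0, 24.0], [1.0, 26.0]],
        "discharge": [0.1, 0.1, 0.05, 0.4, 0.2],
        "static": [850.0, 1.8],
    },
}

pooled = pool_basins(basins, seq_len=3)               # train on all basins first
fine_tune = make_samples(basins["basin_a"], seq_len=3)  # then adapt to one target basin
```

In a real study the pooled set would span hundreds of basins and feed an LSTM; the top-down idea is only that the target basin's samples are used after, not instead of, the multi-basin training step.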

The paper also presents empirical evidence that LSTM models trained on multiple basins outperform those trained on individual watersheds. Its figures show that traditional models perform best when calibrated locally, whereas LSTM models perform best when trained on diverse data. The authors stress the importance of hydrological diversity in the training data: larger and more varied datasets improve model performance and make extrapolation errors during inference less likely. They conclude that the prevailing practice of training LSTM models on small, single-catchment datasets undermines the potential advantages of ML, and they call for future research into the benefits of larger, more diverse training sets.
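The link between training diversity and extrapolation can be made concrete with a crude range check: count how many inference-time inputs fall outside the range seen during training. The sketch below is only an assumed, one-variable proxy and not a method from the paper; the function name `fraction_outside_range` and all precipitation values are invented for illustration.

```python
from typing import Sequence


def fraction_outside_range(train_values: Sequence[float],
                           target_values: Sequence[float]) -> float:
    """Share of target values that lie outside [min, max] of the training values."""
    lo, hi = min(train_values), max(train_values)
    outside = sum(1 for v in target_values if v < lo or v > hi)
    return outside / len(target_values)


# Hypothetical daily precipitation in mm/day; all numbers are invented.
single_basin_train = [0.0, 2.0, 5.0, 8.0, 12.0]
other_basins_train = {
    "humid": [0.0, 15.0, 40.0, 75.0],
    "arid": [0.0, 0.0, 1.0, 3.0],
}
pooled_train = single_basin_train + [
    v for series in other_basins_train.values() for v in series
]

# Inference period with storms larger than anything the single basin recorded.
inference = [1.0, 20.0, 60.0, 4.0]

single_share = fraction_outside_range(single_basin_train, inference)
pooled_share = fraction_outside_range(pooled_train, inference)
print(f"outside single-basin range: {single_share:.2f}")
print(f"outside pooled range:       {pooled_share:.2f}")
```

With these toy numbers, half of the inference inputs lie outside the single-basin training range and none lie outside the pooled range. A real analysis would need to consider the joint distribution of all inputs, not one variable at a time.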

Keywords: Long short-term memory, Artificial intelligence, Structural basin, Artificial neural network, Recurrent neural network, Geology, Paleontology, Astronomy, Computer science, Period (time)