Synergistic learning with multi-task DeepONet for efficient PDE problem solving

Journal: Neural Networks, Volume: 184
DOI: https://doi.org/10.1016/j.neunet.2024.107113
PMID: https://pubmed.ncbi.nlm.nih.gov/39793491
Publication Date: 2025-01-05
Author(s): Varun Kumar et al.
Primary Topic: Model Reduction and Neural Networks

Overview

The paper applies multi-task learning (MTL) to problems governed by partial differential equations (PDEs) in science and engineering. MTL is an inductive transfer mechanism: it improves generalization by sharing information across multiple tasks, which helps with data scarcity and overfitting in neural networks. The authors introduce a multi-task deep operator network (MT-DeepONet) designed to learn solutions for several functional forms of the PDE source term and for multiple geometries in a single training session.

MT-DeepONet makes two main changes to the standard DeepONet. First, the branch network is modified to accept the different parameterized coefficients that appear in the PDEs. Second, a binary mask is introduced to handle parameterized geometries; the mask is built into the loss term to improve convergence and generalization to new geometries. The framework is demonstrated on three benchmark problems: learning several functional forms of the source term in the Fisher equation, handling multiple geometries in a 2D Darcy flow problem with improved transfer learning, and predicting solutions for parameterized 3D geometries in heat transfer. Overall, MT-DeepONet enables synergistic learning across tasks and reduces the training cost of neural operators.
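As a rough illustration of how a binary mask can enter a loss term, the sketch below computes a mean squared error only over grid points that lie inside the geometry. The function name `masked_mse`, the 2x2 grid, and the normalization by the number of inside points are assumptions made for this example; the summary does not give the exact form of the authors' loss.

```python
import numpy as np

def masked_mse(pred, target, mask):
    """Mean squared error restricted to points where mask == 1.

    Assumed convention: mask is 1 inside the geometry and 0 outside.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    mask = np.asarray(mask, dtype=float)
    n_inside = mask.sum()
    if n_inside == 0:
        raise ValueError("mask selects no points")
    # Errors at masked-out points are multiplied by 0 and do not contribute.
    return float((mask * (pred - target) ** 2).sum() / n_inside)

# Hypothetical 2x2 grid; the right column lies outside the geometry.
pred = np.array([[1.0, 5.0], [2.0, 7.0]])
target = np.array([[0.0, 0.0], [2.0, 0.0]])
mask = np.array([[1, 0], [1, 0]])
loss = masked_mse(pred, target, mask)
```

Here the large errors in the right column are ignored, so only the two inside points determine the loss.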

Introduction

In scientific machine learning, neural operators (NOs) have emerged as a promising approach for solving partial differential equations (PDEs): they map input functions, such as initial and boundary conditions, to the corresponding solutions. Traditional numerical methods are effective but often struggle with high-dimensional PDEs and can be computationally expensive, particularly when a minor change in the input or the geometry requires a full recomputation. Recent advances in deep neural networks (DNNs) have led to frameworks such as DeepONet and integral operators, which have shown potential across diverse applications. However, the performance of these NOs is frequently limited by the scarcity of labeled training data, especially for complex models.

To address these challenges, multi-task learning (MTL) has been proposed as a way to use information from related tasks, mitigating data scarcity and improving generalization. MTL techniques, including hard and soft parameter sharing, have been applied successfully in several domains, computer vision among them. This paper introduces the multi-task DeepONet (MT-DeepONet), which extends DeepONet to train concurrently on multiple parameterized PDEs across different geometric domains. MT-DeepONet aims to improve generalizability and knowledge transfer by learning the solutions of correlated tasks simultaneously, without retraining. The methodology is illustrated on 2D Darcy flow equations and 3D heat transfer scenarios, where it shows improved performance in learning across varied geometries.
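For readers unfamiliar with hard parameter sharing, the following minimal sketch shows the general idea: one hidden layer is shared by all tasks, and each task has its own output head. This is a generic illustration, not the MT-DeepONet architecture; the layer sizes, the task names `task_a` and `task_b`, and the random weights are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared hidden layer: in training, its weights would be updated by every task.
W_shared = rng.normal(size=(3, 8))

# Task-specific output heads (hypothetical task names).
heads = {
    "task_a": rng.normal(size=(8, 1)),
    "task_b": rng.normal(size=(8, 1)),
}

def forward(x, task):
    hidden = np.tanh(x @ W_shared)  # representation shared across tasks
    return hidden @ heads[task]     # prediction from the task's own head

x = rng.normal(size=(5, 3))
out_a = forward(x, "task_a")
out_b = forward(x, "task_b")
```

In soft parameter sharing, by contrast, each task would keep its own copy of the hidden layer and the copies would be encouraged to stay close, for example through a penalty on their difference.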

Discussion

The discussion covers multi-task learning for neural operators through the MT-DeepONet framework, which extends standard DeepONet models. Neural operators are designed to learn nonlinear mappings between function spaces, particularly for complex parametric partial differential equations (PDEs). MT-DeepONet trains concurrently on multiple functions and geometries, improving the model's ability to generalize across varied parametric conditions. This is achieved by modifying the branch network to accommodate different source terms and geometries, so that diverse physical systems can be learned in a single training cycle.
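To make "different source terms" concrete, the classical Fisher (Fisher-KPP) equation can be written in its textbook form, followed by a family in which the reaction term varies from task to task. The symbols D, r, and f_i and the indexing by task are assumptions of this illustration; the summary does not state the exact forms used by the authors.

```latex
% Textbook Fisher equation: diffusion plus logistic growth
\frac{\partial u}{\partial t} = D \frac{\partial^2 u}{\partial x^2} + r\,u\,(1 - u)

% Assumed multi-task family: task i replaces the source term with f_i(u)
\frac{\partial u}{\partial t} = D \frac{\partial^2 u}{\partial x^2} + f_i(u),
\qquad i = 1, \dots, N
```

Under this reading, a single multi-task model would be trained on solutions from all N source-term forms at once rather than on one form at a time.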

MT-DeepONet builds on the standard DeepONet, which consists of a branch network that encodes the input functions and a trunk network that takes the spatio-temporal coordinates at which the solution operator is evaluated. The framework uses a binary mask to confine solutions to the geometric boundaries of the problem domain. Its effectiveness is demonstrated on numerical examples that include the Fisher equation and Darcy flow problems, where it learns and predicts solutions across multiple geometries and initial conditions. The results indicate that multi-task learning can improve generalization, but that tasks must be chosen with care: combining unrelated tasks can cause negative transfer. Overall, MT-DeepONet extends the use of neural operators to complex PDEs across a wider range of scenarios.
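The branch and trunk structure can be sketched in a few lines. In a standard DeepONet the prediction at a coordinate is the inner product of the branch output and the trunk output; the example below implements that with small untrained networks and optionally multiplies the result by a binary mask. The layer sizes, the random weights, the helper `mlp`, and applying the mask directly to the output are assumptions of this sketch, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, weights):
    """Tiny fully connected net: tanh on hidden layers, linear output."""
    for W in weights[:-1]:
        x = np.tanh(x @ W)
    return x @ weights[-1]

# Assumed sizes: m sensor values per input function, d coordinates, latent width p.
m, d, p = 10, 2, 16
branch_w = [rng.normal(size=(m, 32)), rng.normal(size=(32, p))]
trunk_w = [rng.normal(size=(d, 32)), rng.normal(size=(32, p))]

def deeponet(u_sensors, coords, mask=None):
    b = mlp(u_sensors, branch_w)  # (n_funcs, p): encodes each input function
    t = mlp(coords, trunk_w)      # (n_points, p): encodes each query coordinate
    out = b @ t.T                 # (n_funcs, n_points): inner product over p
    if mask is not None:
        out = out * mask          # zero the prediction outside the geometry
    return out

u = rng.normal(size=(4, m))           # 4 hypothetical input functions
y = rng.uniform(size=(25, d))         # 25 query points in the unit square
mask = (y[:, 0] < 0.5).astype(float)  # hypothetical geometry: left half only
pred = deeponet(u, y, mask)
```

Each row of `pred` is the predicted solution for one input function at all 25 query points, with zeros wherever the mask marks a point as outside the domain.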