Stochastic Operator Network: A Stochastic Maximum Principle Based Approach to Operator Learning

Journal: Journal of Machine Learning, Volume: 5, Issue: 1
DOI: https://doi.org/10.4208/jml.250709
Publication Date: 2026-01-01
Author(s): Ryan Bausback et al.
Primary Topic: Neural Networks and Applications

Overview

In this paper, the authors introduce the Stochastic Operator Network (SON), a framework for uncertainty quantification in operator learning. Combining ideas from stochastic optimal control with the DeepONet architecture, SON reformulates the branch network as a stochastic differential equation (SDE) and backpropagates through the adjoint backward stochastic differential equation (BSDE). Trainable diffusion parameters let the network learn the uncertainty inherent in an operator, and training is driven by the gradient of the Hamiltonian given by the Stochastic Maximum Principle (SMP).
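To make the roles of the SDE, the adjoint BSDE, and the Hamiltonian concrete, the following is a sketch of the Stochastic Maximum Principle in generic textbook notation. The symbols \( X_t \) (hidden state), \( \theta_t \) (trainable parameters acting as the control), \( f \), \( g \), \( \Phi \) (terminal loss), and the adjoint pair \( (p_t, q_t) \) are assumptions chosen for illustration and need not match the paper's notation; sign conventions also differ between references, and a running cost is omitted.

\[
\begin{aligned}
dX_t &= f(X_t, \theta_t)\,dt + g(X_t, \theta_t)\,dW_t, \qquad X_0 = x_0, \\
J(\theta) &= \mathbb{E}\big[\Phi(X_T)\big], \\
H(x, p, q, \theta) &= p^\top f(x, \theta) + \operatorname{tr}\big(q^\top g(x, \theta)\big), \\
dp_t &= -\nabla_x H(X_t, p_t, q_t, \theta_t)\,dt + q_t\,dW_t, \qquad p_T = \nabla_x \Phi(X_T), \\
\nabla_\theta J(\theta)_t &= \mathbb{E}\big[\nabla_\theta H(X_t, p_t, q_t, \theta_t)\big].
\end{aligned}
\]

Under these assumptions, the forward SDE plays the role of the forward pass, the adjoint BSDE plays the role of backpropagation, and the last line gives the gradient used to update both drift and diffusion parameters.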

The effectiveness of SON is validated through numerical experiments involving both integral operators and solution operators from stochastic differential equations in two and three dimensions. The results indicate that SON not only replicates the outputs of noisy operators with high accuracy but also quantifies operator uncertainty effectively via trainable parameters. Notably, SON consistently recovers the noise-scaling factor across all experiments, demonstrating its robustness in uncertainty quantification without incurring significant training penalties compared to the traditional DeepONet. Future research directions include addressing more complex uncertainty scenarios, applying SON to challenging stochastic differential equations (SDEs) and stochastic partial differential equations (SPDEs), and integrating data assimilation techniques to enhance its performance in environments with limited training data.

Introduction

In recent years, operator learning has gained traction as an innovative alternative to traditional numerical solvers for differential equations. Unlike classical machine learning approaches that operate on discrete data points, neural operators process entire functions, producing corresponding output functions. The two leading architectures in this domain are the Deep Operator Network (DeepONet) and the Fourier Neural Operator (FNO). DeepONet utilizes a dual subnetwork structure to learn coefficients and data-driven bases, while FNO employs Fourier basis transformations within its architecture. Notably, both architectures circumvent the limitations of fixed mesh discretization, enabling efficient, mesh-free inference for a variety of parameterized partial differential equations (PDEs).

Despite these advances, existing operator learning frameworks focus primarily on deterministic outputs, and stochastic operators that incorporate noise along trajectories remain little explored. This paper introduces a training strategy that integrates Stochastic Neural Networks (SNNs) with the DeepONet framework to address this gap. The SNN architecture formulates the evolution of hidden layers as a discretized ordinary differential equation (ODE) system and adds Brownian motion to turn it into a stochastic differential equation (SDE). The proposed Stochastic Operator Network (SON) uses the Stochastic Maximum Principle to formulate a loss function for backpropagation, which turns training into a stochastic optimal control problem. The paper outlines the methodology, presents numerical experiments validating the approach, and compares SON against the standard DeepONet to demonstrate its effectiveness on stochastic operators.
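The idea of hidden layers as a discretized SDE can be sketched with an Euler-Maruyama step per layer. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name `snn_forward`, the tanh drift, the scalar per-layer diffusion `sigmas`, and all sizes are hypothetical.

```python
import numpy as np

def snn_forward(x0, weights, biases, sigmas, h, rng):
    """Propagate x0 through stochastic residual layers (Euler-Maruyama).

    Each layer applies
        X_{n+1} = X_n + h * tanh(W_n X_n + b_n) + sqrt(h) * sigma_n * xi_n,
    where xi_n is standard Gaussian noise (a discretized Brownian increment).
    """
    x = np.asarray(x0, dtype=float)
    for W, b, sigma in zip(weights, biases, sigmas):
        drift = np.tanh(W @ x + b)              # deterministic ODE part
        noise = rng.standard_normal(x.shape)    # xi_n ~ N(0, I)
        x = x + h * drift + np.sqrt(h) * sigma * noise
    return x

rng = np.random.default_rng(0)
d, n_layers = 4, 5
h = 1.0 / n_layers                              # pseudo-time step per layer

# Hypothetical, untrained parameters chosen only for illustration.
weights = [0.5 * rng.standard_normal((d, d)) for _ in range(n_layers)]
biases = [np.zeros(d) for _ in range(n_layers)]
sigmas = [0.1] * n_layers                       # diffusion coefficient per layer

x0 = np.ones(d)
samples = np.stack(
    [snn_forward(x0, weights, biases, sigmas, h, rng) for _ in range(200)]
)
print("sample mean:", samples.mean(axis=0))
print("sample std: ", samples.std(axis=0))
```

Setting every entry of `sigmas` to zero recovers a deterministic residual network, which is the ODE system the text refers to.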

Discussion

This section discusses the Deep Operator Network (DeepONet) and its extension, the Stochastic Operator Network (SON), which incorporates stochastic elements to better model noisy operators. DeepONet is designed to approximate mappings between two Banach spaces by taking an input function \( u \) and a set of evaluation points \( y \), producing an output \( G(u)(y) \). Training discretizes the input function \( u \) at fixed sensor locations, while still allowing flexibility in sampling strategies. The Universal Approximation Theorem for Operators establishes that a shallow branch-trunk architecture can approximate any continuous operator, and an extension of the theorem allows for deeper and asymmetric architectures.
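The branch-trunk structure can be sketched as follows: the branch network maps the sensor values of \( u \) to coefficients, the trunk network maps each evaluation point \( y \) to basis values, and the output \( G(u)(y) \) is their inner product. The helper names (`mlp`, `init_mlp`, `deeponet`), the layer sizes, and the random untrained weights are assumptions made for illustration only.

```python
import numpy as np

def init_mlp(sizes, rng):
    """Random (untrained) weights for a fully connected network."""
    return [
        (rng.standard_normal((m, n)) / np.sqrt(m), np.zeros(n))
        for m, n in zip(sizes[:-1], sizes[1:])
    ]

def mlp(x, params):
    """Apply tanh hidden layers followed by a linear output layer."""
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

def deeponet(u_sensors, y, branch, trunk):
    """Approximate G(u)(y) as sum_k branch_k(u) * trunk_k(y)."""
    coeffs = mlp(u_sensors, branch)   # shape (batch, p): coefficients from u
    basis = mlp(y, trunk)             # shape (n_points, p): basis values at y
    return coeffs @ basis.T           # shape (batch, n_points)

rng = np.random.default_rng(0)
m, p = 20, 8                          # number of sensors, latent dimension
branch = init_mlp([m, 32, p], rng)
trunk = init_mlp([1, 32, p], rng)

sensors = np.linspace(0.0, 1.0, m)    # fixed sensor locations for u
u = np.stack([np.sin(2 * np.pi * sensors), np.cos(2 * np.pi * sensors)])
y = np.linspace(0.0, 1.0, 5)[:, None] # evaluation points, free to vary

out = deeponet(u, y, branch, trunk)
print(out.shape)
```

The sensor grid is fixed across input functions, while the evaluation points `y` can be chosen freely at inference time.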

SON extends DeepONet by replacing its branch network with a Stochastic Neural Network (SNN), whose SDE formulation lets it capture the noise inherent in solutions of stochastic differential equations. Training the SNN is posed as a stochastic optimal control problem and is guided by the Stochastic Maximum Principle. This framework allows SON to learn noisy operators effectively, as demonstrated through various numerical experiments, including the approximation of antiderivative operators and ordinary differential equations (ODEs) with added noise. The results indicate that SON can accurately capture both the mean and variance of the operator outputs, successfully quantifying uncertainty in the predictions.
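One way to read "capturing the mean and variance" is as a Monte Carlo estimate over repeated stochastic forward passes. The sketch below pairs a stochastic branch with a deterministic trunk and estimates both statistics from samples. It is an illustration under assumptions, not the paper's method: all names, sizes, the tanh layers, and the fixed diffusion value `sigma` (which would be trainable in SON) are hypothetical, and the weights are untrained.

```python
import numpy as np

rng = np.random.default_rng(0)
m, p, n_layers = 20, 8, 4             # sensors, latent dimension, SDE steps
h = 1.0 / n_layers

# Hypothetical, untrained parameters.
W_in = rng.standard_normal((m, p)) / np.sqrt(m)
W_layers = [rng.standard_normal((p, p)) / np.sqrt(p) for _ in range(n_layers)]
sigma = 0.2                           # diffusion coefficient, fixed here
W_trunk = rng.standard_normal((1, p))
b_trunk = rng.standard_normal(p)

def stochastic_branch(u_sensors):
    """Branch network as a discretized SDE (one Euler-Maruyama step per layer)."""
    x = u_sensors @ W_in
    for W in W_layers:
        noise = rng.standard_normal(x.shape)
        x = x + h * np.tanh(x @ W) + np.sqrt(h) * sigma * noise
    return x                          # shape (p,)

def trunk(y):
    """Deterministic trunk network evaluated at points y."""
    return np.tanh(y @ W_trunk + b_trunk)   # shape (n_points, p)

def son_sample(u_sensors, y):
    """One random draw of the operator output at the points y."""
    return trunk(y) @ stochastic_branch(u_sensors)  # shape (n_points,)

sensors = np.linspace(0.0, 1.0, m)
u = np.sin(2 * np.pi * sensors)
y = np.linspace(0.0, 1.0, 5)[:, None]

draws = np.stack([son_sample(u, y) for _ in range(500)])
mean, var = draws.mean(axis=0), draws.var(axis=0)
print("mean:", mean)
print("variance:", var)
```

In this setup the spread of the output comes only from the branch noise, so the estimated variance reflects the diffusion coefficient.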