Accelerated inference for stochastic compartmental models with over-dispersed partial observations

Journal: Statistics and Computing, Volume: 36, Issue: 3
DOI: https://doi.org/10.1007/s11222-026-10865-1
PMID: https://pubmed.ncbi.nlm.nih.gov/41878292
Publication Date: 2026-03-22
Author(s): Michael Whitehouse
Primary Topic: COVID-19 epidemiological studies

Overview

The authors present a method for deriving an assumed density approximate likelihood for partially observed stochastic compartmental models with over-dispersed observations. By treating time-varying reporting probabilities as latent variables and applying Laplace approximations within Poisson Approximate Likelihoods (LawPAL), they obtain a fast deterministic approximation of the marginal likelihood and the filtering distributions. The study establishes an asymptotically exact filtering result in the large-population limit, showing that the method can recover both latent disease states and reporting probabilities.

The authors validate their methodology through simulations, showing that the maximum approximate likelihood estimator recovers the ground truth well when the population is large and the time horizon is long. They also report computational efficiency gains of an order of magnitude over traditional sequential Monte Carlo likelihood-based approaches, and discuss the statistical trade-offs involved in their approximation. The methodology is further integrated into the probabilistic programming language Stan, which facilitates automated Bayesian inference and is used to develop a practical model with data from the COVID-19 outbreak in Switzerland.

Introduction

The introduction traces the development and application of compartmental models in epidemiology, which divide a population into distinct disease states in order to analyze disease dynamics. Initially developed by McKendrick and others, these models use transition rates to describe movement between compartments. In practice they are difficult to fit because the observed data are noisy, for example through under-reporting of incidence and prevalence. The paper highlights the limitations of deterministic approaches, particularly Ordinary Differential Equation (ODE) models, which overlook the stochastic nature of disease transmission.

To address these challenges, various simulation-based methods have been proposed, including Sequential Monte Carlo (SMC) and Approximate Bayesian Computation. These methods are effective but require extensive computational resources and complex tuning. The authors build on Assumed Density Approximate Likelihoods (ADALs), a computationally efficient approach to inference in discrete-time stochastic compartmental models that can handle Binomial under-reporting. The standard ADAL approach, however, struggles with over-dispersed data. This paper extends ADALs to accommodate observational over-dispersion by using Laplace approximations to integrate out latent reporting rates, giving a deterministic approximation of the marginal likelihood without the computational burden of sampling methods. Subsequent sections detail the model framework, the development of the algorithm, its asymptotic behavior, and performance comparisons through simulations.
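
As a rough illustration of what integrating out a latent reporting rate with a Laplace approximation involves, the sketch below treats a single Poisson observation whose mean is a latent reporting rate times a known intensity. The Gamma prior on the reporting rate, the function names, and the parameter values are all assumptions made for this example. The Gamma prior was chosen only because the exact marginal is then available in closed form (a negative binomial) for comparison; the paper's own prior and recursion may differ.

```python
import math

def log_joint(q, y, lam, a, b):
    """log[ Poisson(y | q * lam) * Gamma(q | shape=a, rate=b) ], for q > 0."""
    log_pois = y * math.log(q * lam) - q * lam - math.lgamma(y + 1)
    log_prior = a * math.log(b) - math.lgamma(a) + (a - 1) * math.log(q) - b * q
    return log_pois + log_prior

def laplace_log_marginal(y, lam, a, b):
    """Laplace approximation to log of the integral of exp(log_joint) over q.

    Requires y + a > 1 so that the mode is in the interior of (0, inf).
    """
    k = y + a - 1.0
    q_mode = k / (lam + b)            # solves d/dq log_joint = 0
    curvature = k / q_mode ** 2       # minus the second derivative at the mode
    return log_joint(q_mode, y, lam, a, b) + 0.5 * math.log(2 * math.pi / curvature)

def exact_log_marginal(y, lam, a, b):
    """Closed-form Poisson-Gamma (negative binomial) log marginal likelihood."""
    return (math.lgamma(y + a) - math.lgamma(a) - math.lgamma(y + 1)
            + a * math.log(b) + y * math.log(lam) - (y + a) * math.log(lam + b))

if __name__ == "__main__":
    # Hypothetical values: 50 reported cases, intensity 100, prior mean a/b = 0.5.
    approx = laplace_log_marginal(50, 100.0, 20.0, 40.0)
    exact = exact_log_marginal(50, 100.0, 20.0, 40.0)
    print(f"Laplace: {approx:.5f}  exact: {exact:.5f}  gap: {approx - exact:.5f}")
```

In this conjugate toy case the Laplace step reduces to Stirling's formula for the Gamma function, so the gap on the log scale is roughly 1/(12(y + a - 1)) and shrinks as the counts grow.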

Discussion

The authors set out the mathematical framework and methodology for modeling disease transmission with a latent compartmental model, focusing in particular on the SEIR (Susceptible, Exposed, Infected, Removed) model. The model consists of a set of compartments, each representing a stage of disease progression, and describes transitions between them stochastically. The initial population is drawn from a multinomial distribution, and transitions between compartments are governed by a row-stochastic matrix whose entries give the probabilities of moving from one state to another and depend on the current distribution of the population across compartments. The authors emphasize the importance of modeling disease incidence rather than prevalence, as the former is more readily observable in practice.
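
The model structure described above can be sketched as a short simulation. This is a minimal illustration under stated assumptions, not the authors' code: the transition probabilities of the form 1 - exp(-rate), the choice of E to I transitions as the observed incidence, the fixed reporting probability `q`, and all parameter values are hypothetical.

```python
import numpy as np

def seir_transition_matrix(x, n, beta, rho, gamma):
    """Row-stochastic matrix over (S, E, I, R) given current counts x."""
    p_inf = 1.0 - np.exp(-beta * x[2] / n)   # S -> E, depends on infectious share
    p_prog = 1.0 - np.exp(-rho)              # E -> I
    p_rem = 1.0 - np.exp(-gamma)             # I -> R
    return np.array([
        [1.0 - p_inf, p_inf, 0.0, 0.0],
        [0.0, 1.0 - p_prog, p_prog, 0.0],
        [0.0, 0.0, 1.0 - p_rem, p_rem],
        [0.0, 0.0, 0.0, 1.0],
    ])

def simulate(n, pi0, beta, rho, gamma, q, steps, rng):
    """Simulate compartment counts and binomially under-reported incidence."""
    x = rng.multinomial(n, pi0)              # multinomial initial population
    states, reported = [x.copy()], []
    for _ in range(steps):
        K = seir_transition_matrix(x, n, beta, rho, gamma)
        # z[i, j] = number of individuals moving from compartment i to j
        z = np.vstack([rng.multinomial(x[i], K[i]) for i in range(4)])
        x = z.sum(axis=0)                    # new counts are the column sums
        incidence = z[1, 2]                  # new E -> I cases in this step
        reported.append(rng.binomial(incidence, q))
        states.append(x.copy())
    return np.array(states), np.array(reported)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    states, reported = simulate(10_000, [0.99, 0.0, 0.01, 0.0],
                                beta=0.5, rho=0.2, gamma=0.1, q=0.3,
                                steps=50, rng=rng)
    print(states[-1], reported[:10])
```

Recording the full transition table `z`, rather than only the new compartment counts, is what makes incidence available as an observable quantity at each step.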

For inference, the paper uses the Assumed Density Approximate Likelihood (ADAL) methodology, which allows likelihoods to be computed efficiently in the presence of over-dispersed observations. The authors outline a recursive algorithm, termed LawPAL, which uses Laplace approximations to derive filtering distributions and marginal likelihoods. The algorithm is shown to asymptotically recover both the underlying disease prevalence and the reporting probabilities as the population size increases. These theoretical results are supported by simulations that demonstrate the algorithm’s effectiveness in estimating model parameters, highlighting its potential for practical use in epidemiological modeling. The authors also acknowledge the trade-offs between computational efficiency and statistical accuracy inherent in their approximations, and suggest avenues for future research to address them.
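
To give a feel for what a recursive Poisson-type filter looks like, the sketch below runs one predict/update step for an SEIR model observed through under-reported incidence. It is a simplified illustration in the spirit of Poisson approximate likelihoods, not LawPAL itself: the reporting probabilities in `Q` are treated as fixed and known, the Laplace step over latent reporting probabilities is omitted, and the kernel form, function names, and values are assumptions. The update relies on the standard thinning property: if a count is Poisson with mean m and each event is reported independently with probability q, the reported count is Poisson with mean q·m and the unreported remainder is Poisson with mean (1 - q)·m.

```python
import math
import numpy as np

def seir_kernel(eta, beta, rho, gamma):
    """Row-stochastic SEIR matrix; eta holds population proportions (S, E, I, R)."""
    p_inf = 1.0 - math.exp(-beta * eta[2])
    p_prog = 1.0 - math.exp(-rho)
    p_rem = 1.0 - math.exp(-gamma)
    return np.array([
        [1.0 - p_inf, p_inf, 0.0, 0.0],
        [0.0, 1.0 - p_prog, p_prog, 0.0],
        [0.0, 0.0, 1.0 - p_rem, p_rem],
        [0.0, 0.0, 0.0, 1.0],
    ])

def poisson_filter_step(lam, Y, Q, beta, rho, gamma):
    """One step of a Poisson-intensity filter with incidence observations.

    lam : current intensity vector over compartments
    Y   : matrix of reported transition counts (zero where unobserved)
    Q   : matrix of reporting probabilities (zero where unobserved)
    Returns the updated intensity vector and the log-likelihood increment.
    """
    K = seir_kernel(lam / lam.sum(), beta, rho, gamma)
    Lam = lam[:, None] * K                   # expected transition counts
    mu = Q * Lam                             # Poisson means of reported counts
    loglik = 0.0
    for i, j in np.argwhere(Q > 0):          # assumes mu > 0 where Q > 0
        loglik += Y[i, j] * math.log(mu[i, j]) - mu[i, j] - math.lgamma(Y[i, j] + 1)
    Lam_post = (1.0 - Q) * Lam + Y           # unreported mass plus observed counts
    return Lam_post.sum(axis=0), loglik      # column sums give the new intensities

if __name__ == "__main__":
    lam = np.array([9900.0, 50.0, 50.0, 0.0])
    Q = np.zeros((4, 4)); Q[1, 2] = 0.3      # only E -> I incidence is reported
    Y = np.zeros((4, 4)); Y[1, 2] = 3.0
    lam_new, ll = poisson_filter_step(lam, Y, Q, beta=0.5, rho=0.2, gamma=0.1)
    print(lam_new, ll)
```

Summing the log-likelihood increments over time gives a deterministic approximate log-likelihood that needs no particles or random draws, which is the kind of structure that makes such recursions cheap compared with sequential Monte Carlo.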