New Modified Liu Estimators to Handle the Multicollinearity in the Beta Regression Model: Simulation and Applications

Journal: Modern Journal of Statistics, Volume: 1, Issue: 1
DOI: https://doi.org/10.64389/mjs.2025.01111
Publication Date: 2025-07-12
Author(s): Ali Hammad et al.
Primary Topic: Advanced Statistical Methods and Models

Overview

The research presents modified Liu estimators for the beta regression model (BRM), aimed at improving estimation accuracy when the explanatory variables are multicollinear. Under such conditions the traditional maximum likelihood estimator (MLE) often becomes unstable and inefficient. The proposed estimators extend the traditional Liu estimator with flexible biasing parameters and are shown theoretically to outperform existing methods. Through theoretical comparisons and Monte Carlo simulations, the study reports that the modified Liu estimators significantly reduce estimation bias and variance, leading to more reliable regression coefficients.

Empirical validation of the theoretical findings is provided through comprehensive simulations and real-world applications, in which the modified Liu estimators consistently perform best, particularly when multicollinearity among the predictors is high. The results indicate that the modified Liu estimators yield lower mean squared error (MSE) than the MLE, the beta ridge regression estimator (BRRE), and other biased estimators. Overall, the findings suggest that the proposed estimators are an effective solution for regression analysis with collinear predictors, improving estimation accuracy on both simulated and real datasets.

Introduction

The introduction discusses the limitations of traditional linear regression models (LRMs) when the dependent variable is not normally distributed, particularly when it follows another exponential family distribution. In such scenarios, generalized linear models (GLMs), and specifically the beta regression model (BRM), are recommended for analyzing bounded response variables such as proportions. While maximum likelihood estimation (MLE) is commonly used for the BRM, it struggles under multicollinearity, which inflates the variances of the coefficients and compromises the reliability of the parameter estimates. Traditional remedies for multicollinearity, such as collecting more data or removing correlated variables, often fall short, especially in the context of the BRM.

To address these issues, the paper highlights the emergence of biased estimation techniques, including ridge regression and the Liu estimator, which introduce a controlled amount of bias to stabilize estimation in the presence of multicollinearity. The authors propose modified one- and two-parameter Liu estimators tailored to the BRM, aiming to mitigate the effects of multicollinearity. The study outlines systematic methods for selecting optimal parameters and compares the performance of the proposed estimators against existing methods, including the MLE, ridge, and Liu estimators. The paper first reviews the beta distribution and existing biased estimators, then develops the proposed estimators, describes the simulation design, and closes with a practical case study to demonstrate their utility.

Methods

The methodology section outlines the application of the Beta Regression Model (BRM) for analyzing bounded responses in the interval (0,1) using a beta distribution characterized by parameters $\mu$ (mean) and $\theta$ (precision). The probability density function is defined as:

$$ f(y; \mu, \theta) = \frac{\Gamma(\theta)}{\Gamma(\mu \theta) \Gamma((1 - \mu) \theta)} y^{\mu \theta - 1} (1 - y)^{(1 - \mu) \theta - 1}, $$

where $0 < y < 1$, $0 < \mu < 1$, and $\theta > 0$. The model employs a logit link function to relate the mean response to covariates, and maximum likelihood estimation (MLE) is used for parameter estimation. The log-likelihood function is derived, and the score function is formulated to facilitate the estimation process, typically using Fisher scoring or iteratively reweighted least squares (IRLS).
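As a minimal sketch of the density and logit link described above, the log-likelihood can be written in a few lines of Python. The function name `beta_loglik`, the toy design matrix, and the chosen coefficient and precision values are illustrative assumptions, not taken from the paper; the result is cross-checked against SciPy's beta density with shape parameters $\mu\theta$ and $(1-\mu)\theta$.

```python
import numpy as np
from scipy.special import expit, gammaln
from scipy.stats import beta as beta_dist


def beta_loglik(coef, theta, X, y):
    """Log-likelihood of a beta regression with logit link (illustrative helper)."""
    mu = expit(X @ coef)        # inverse logit: mu = 1 / (1 + exp(-x'coef))
    a = mu * theta              # first shape parameter, mu * theta
    b = (1.0 - mu) * theta      # second shape parameter, (1 - mu) * theta
    return np.sum(
        gammaln(theta) - gammaln(a) - gammaln(b)
        + (a - 1.0) * np.log(y)
        + (b - 1.0) * np.log1p(-y)
    )


# Toy data (assumed values, for illustration only).
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(5), rng.normal(size=5)])
coef = np.array([0.2, -0.5])
theta = 10.0
mu = expit(X @ coef)
y = rng.beta(mu * theta, (1.0 - mu) * theta)

ll = beta_loglik(coef, theta, X, y)
# Reference value from SciPy's beta density with the same shape parameters.
ll_ref = beta_dist.logpdf(y, mu * theta, (1.0 - mu) * theta).sum()
```

Maximizing this function over `coef` and `theta` (for example with a general-purpose optimizer) would give the ML estimates; the paper itself refers to Fisher scoring or IRLS for that step.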

To address multicollinearity in generalized linear models (GLMs), the section reviews two existing biased estimators: the beta ridge regression estimator (BRRE) and the beta Liu estimator (BLE). The BRRE incorporates a ridge parameter $k$ to stabilize the estimates, while the BLE employs a shrinkage parameter $d$ to mitigate the effects of multicollinearity. The statistical properties of both estimators, including bias, covariance, and mean squared error (MSE), are derived and compared with those of the traditional MLE. The BLE is noted to handle multicollinearity better than the BRRE.
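The summary does not reproduce the formulas of the two estimators. The sketch below therefore assumes the forms commonly given in the ridge and Liu literature, $(Q + kI)^{-1} Q \beta_{ML}$ for ridge and $(Q + I)^{-1}(Q + dI) \beta_{ML}$ for Liu, where `Q` stands for a symmetric positive definite matrix such as a weighted cross-product of the regressors. These forms, the function names, and the toy numbers are assumptions for illustration and may differ from the paper's exact definitions.

```python
import numpy as np


def ridge_shrink(Q, beta_ml, k):
    """Ridge-type shrinkage (assumed form): (Q + k I)^{-1} Q beta_ml."""
    p = Q.shape[0]
    return np.linalg.solve(Q + k * np.eye(p), Q @ beta_ml)


def liu_shrink(Q, beta_ml, d):
    """Liu-type shrinkage (assumed form): (Q + I)^{-1} (Q + d I) beta_ml."""
    p = Q.shape[0]
    return np.linalg.solve(Q + np.eye(p), (Q + d * np.eye(p)) @ beta_ml)


# Toy, nearly singular Q mimicking two highly correlated predictors.
Q = np.array([[1.0, 0.99],
              [0.99, 1.0]])
beta_ml = np.array([3.0, -2.5])   # hypothetical ML estimate

beta_ridge = ridge_shrink(Q, beta_ml, k=0.5)
beta_liu = liu_shrink(Q, beta_ml, d=0.5)
```

Under these assumed forms, `k = 0` and `d = 1` both return the ML estimate unchanged, while `k > 0` or `d < 1` pulls the coefficient vector toward zero.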

Discussion

In this section, the authors introduce modified one- and two-parameter Liu estimators for Beta Regression Models (BRM), specifically the Beta modified one-parameter Liu estimator (BMOPLE) and the Beta modified two-parameter Liu estimator (BMTPLE). The BMOPLE is defined as

\[
\beta_{BMOPLE} = (Q + I)^{-1}(Q - dI) \beta_{ML}, \quad 0 < d < 1,
\]

where $d$ is a parameter controlling shrinkage. The authors demonstrate that BMOPLE outperforms traditional maximum likelihood estimation (MLE) and other methods in terms of mean squared error (MSE) when addressing multicollinearity. The statistical properties of BMOPLE, including bias and covariance, are derived, and its minimum mean squared error (MMSE) is calculated. Similarly, the BMTPLE, defined as

\[
\beta_{BMTPLE} = (Q + I)^{-1}(Q - (k + d)I) \beta_{ML}, \quad k > 0, \, 0 < d < 1,
\]

is introduced to optimize the bias-variance tradeoff in multicollinear settings. The authors provide theoretical comparisons of the MMSE and MSE of these estimators against existing methods, establishing their efficiency through several lemmas and theorems. The section concludes with a discussion on the selection of optimal biasing parameters, proposing various formulations for both $k$ and $d$ based on prior research.

The performance of the proposed estimators is validated through extensive Monte Carlo simulations, which reveal that BMTPLE consistently outperforms traditional methods, particularly in high multicollinearity scenarios. The findings suggest that the modified Liu estimators are robust alternatives for BRM, offering improved estimation accuracy in both simulated and real-world applications.
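The two definitions above translate directly into code. The following sketch applies them to a given matrix `Q` and ML estimate `beta_ml`. The summary does not define $Q$, so it is treated here simply as a symmetric positive definite matrix supplied by the caller; the function names, the diagonal toy matrix, and the values of `k` and `d` are illustrative assumptions rather than the authors' settings.

```python
import numpy as np


def bmople(Q, beta_ml, d):
    """One-parameter modified Liu form: (Q + I)^{-1} (Q - d I) beta_ml."""
    p = Q.shape[0]
    return np.linalg.solve(Q + np.eye(p), (Q - d * np.eye(p)) @ beta_ml)


def bmtple(Q, beta_ml, k, d):
    """Two-parameter modified Liu form: (Q + I)^{-1} (Q - (k + d) I) beta_ml."""
    p = Q.shape[0]
    return np.linalg.solve(Q + np.eye(p), (Q - (k + d) * np.eye(p)) @ beta_ml)


# Diagonal toy Q: one large and one near-zero eigenvalue, as under strong collinearity.
Q = np.diag([4.0, 0.01])
beta_ml = np.array([1.0, 1.0])    # hypothetical ML estimate

b_one = bmople(Q, beta_ml, d=0.5)
b_two = bmtple(Q, beta_ml, k=0.2, d=0.5)
```

With a diagonal `Q`, each coefficient is scaled by $(\lambda_j - d)/(\lambda_j + 1)$ for BMOPLE and by $(\lambda_j - k - d)/(\lambda_j + 1)$ for BMTPLE, which makes the effect of the biasing parameters easy to inspect: components tied to small eigenvalues $\lambda_j$ are altered far more strongly than those tied to large ones.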