Mean centering is not necessary in regression analyses, and probably increases the risk of incorrectly interpreting coefficients

Journal: Frontiers in Psychology, Volume: 16
DOI: https://doi.org/10.3389/fpsyg.2025.1634152
PMID: https://pubmed.ncbi.nlm.nih.gov/40741431
Publication Date: 2025-07-16
Author(s): Lee H. Wurm et al.
Primary Topic: Advanced Statistical Methods and Models

Overview

In recent years, scholars familiar with factorial ANOVAs have increasingly adopted linear modeling techniques, particularly in the context of ordinary least squares (OLS) regression. A significant point of contention in the literature is the necessity of mean centering continuous variables before analysis, with recommendations varying widely among statistical textbooks. Some authors advocate for centering, while others deem it unnecessary, leading to confusion regarding the interpretation of first-order regression coefficients in moderated regression. This study reviews these recommendations and clarifies misconceptions, emphasizing that first-order coefficients do not represent main effects regardless of centering.

Through two demonstrations using OLS regression models, the authors show that mean centering does not alter the numeric estimates, confidence intervals, or test results (t and p-values) for main effects, interactions, or quadratic terms, provided that the coefficients are assessed correctly. The study also critiques the use of standardized regression coefficients (β) and highlights the benefits of the semipartial correlation coefficient (sr). Ultimately, while mean centering may enhance interpretability in some cases, it is not required for accurate analysis in OLS models with continuous predictors. The authors conclude with practical recommendations for researchers navigating these issues.

Introduction

The introduction of this research paper highlights a shift in the analytical approach within certain research domains, moving from traditional factorial ANOVAs to more flexible regression-based techniques (van Rij et al., 2020; Wurm and Fisicaro, 2014). While this transition is beneficial for examining the roles of independent variables, it also raises concerns regarding the proper application and interpretation of these methods, particularly among researchers trained primarily in ANOVA (Darlington and Hayes, 2017; Hayes et al., 2012; Irwin and McClelland, 2001). The study focuses on the issue of mean centering in regression analyses, addressing conflicting expert opinions on its necessity and implications, especially in the context of interaction and polynomial terms.

The paper distinguishes main effects from conditional effects in regression: a main effect describes a relationship that is constant across the other predictors, whereas an interaction means that the relationship varies with the value of another predictor. The authors advocate a hierarchical approach that assesses main effects and interactions in separate steps, and they demonstrate that mean centering does not alter the assessment of either. Because the decision to center leaves these assessments unchanged, the authors stress the importance of using the full regression equation for accurate visualization and understanding of the data (Bobko, 2001; Cohen and Cohen, 1983).
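The hierarchical logic can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the balanced 5 x 5 grid of `x` and `z` values, the constructed outcome `y`, and the helper names `ols`, `r_squared`, and `step_increment` are hypothetical and are not the authors' data or code. The sketch fits the first-order model, then adds the product term, and compares the R² of each step for raw and mean-centered predictors.

```python
from statistics import mean

def ols(columns, y):
    """Return [intercept, b1, ...] via the normal equations (adequate for tiny examples)."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(row[i] * row[j] for row in rows) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [u - f * v for u, v in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]

def r_squared(columns, y):
    """Proportion of variance in y explained by an OLS fit on the given columns."""
    b = ols(columns, y)
    fitted = [b[0] + sum(bj * col[i] for bj, col in zip(b[1:], columns))
              for i in range(len(y))]
    y_bar = mean(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

def step_increment(x, z, y):
    """Step 1: first-order terms only. Step 2: add the product term."""
    xz = [xi * zi for xi, zi in zip(x, z)]
    r2_first_order = r_squared([x, z], y)
    return r2_first_order, r_squared([x, z, xz], y) - r2_first_order

# Hypothetical data: a fully crossed grid with a built-in interaction.
levels_x = [2.0, 4.0, 6.0, 8.0, 10.0]
levels_z = [1.0, 3.0, 5.0, 7.0, 9.0]
x = [xv for xv in levels_x for _ in levels_z]
z = [zv for _ in levels_x for zv in levels_z]
noise = [0.5 * ((i * i) % 7 - 3) for i in range(len(x))]  # deterministic pseudo-noise
y = [4 + 1.5 * xi - 0.8 * zi + 0.3 * xi * zi + e for xi, zi, e in zip(x, z, noise)]

xc = [xi - mean(x) for xi in x]
zc = [zi - mean(z) for zi in z]

r2_raw, delta_raw = step_increment(x, z, y)
r2_cen, delta_cen = step_increment(xc, zc, y)
print(f"step 1 R^2:       raw={r2_raw:.6f}  centered={r2_cen:.6f}")
print(f"step 2 increment: raw={delta_raw:.6f}  centered={delta_cen:.6f}")
```

Both lines should show matching values for the raw and centered fits, because centering only re-expresses the same set of predictors.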

Methods

In this section, the authors discuss the limitations of the standardized regression coefficients (β) produced by statistical software such as SPSS, Jamovi, and JASP, particularly in analyses that include interaction or polynomial terms. They show that centering changes the β values these packages report, and that the reported values are often incorrect because of the order of operations the software uses. Specifically, the software standardizes the product (or power) term as if it were an ordinary predictor, rather than forming that term from predictors that have already been converted to z-scores. The authors argue that the correct approach is to standardize the individual predictors and the dependent variable first, recompute the interaction or polynomial terms from those z-scores, and then read the unstandardized coefficients of that analysis as the correct standardized solution.
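The two orders of operations can be compared in a short Python sketch. It is not the authors' Supplementary R code: the data are a hypothetical balanced grid, and the helper names (`ols`, `zscores`, `software_style_betas`, `product_of_z_betas`) are invented for illustration. `software_style_betas` rescales every coefficient by sd(predictor)/sd(y), which treats the product column as just another predictor; `product_of_z_betas` z-scores `x`, `z`, and `y` first and only then forms the product.

```python
from statistics import mean, stdev

def ols(columns, y):
    """Return [intercept, b1, ...] via the normal equations (adequate for tiny examples)."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    a = [[sum(row[i] * row[j] for row in rows) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(rows, y))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [u - f * v for u, v in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]

def zscores(v):
    m, s = mean(v), stdev(v)
    return [(vi - m) / s for vi in v]

def software_style_betas(x, z, y):
    """Standardize each column of the fitted model, including the product column."""
    xz = [xi * zi for xi, zi in zip(x, z)]
    b = ols([x, z, xz], y)
    return [bj * stdev(col) / stdev(y) for bj, col in zip(b[1:], [x, z, xz])]

def product_of_z_betas(x, z, y):
    """z-score x, z, and y first, then build the product from the z-scores."""
    zx, zz, zy = zscores(x), zscores(z), zscores(y)
    zxzz = [u * v for u, v in zip(zx, zz)]
    return ols([zx, zz, zxzz], zy)[1:]  # unstandardized slopes of the z-score model

# Hypothetical data: a fully crossed grid with a built-in interaction.
levels_x = [2.0, 4.0, 6.0, 8.0, 10.0]
levels_z = [1.0, 3.0, 5.0, 7.0, 9.0]
x = [xv for xv in levels_x for _ in levels_z]
z = [zv for _ in levels_x for zv in levels_z]
noise = [0.5 * ((i * i) % 7 - 3) for i in range(len(x))]  # deterministic pseudo-noise
y = [4 + 1.5 * xi - 0.8 * zi + 0.3 * xi * zi + e for xi, zi, e in zip(x, z, noise)]
xc = [xi - mean(x) for xi in x]
zc = [zi - mean(z) for zi in z]

sw_raw, sw_cen = software_style_betas(x, z, y), software_style_betas(xc, zc, y)
ok_raw, ok_cen = product_of_z_betas(x, z, y), product_of_z_betas(xc, zc, y)
print("software-style, raw:     ", [round(b, 3) for b in sw_raw])
print("software-style, centered:", [round(b, 3) for b in sw_cen])
print("product of z, raw:       ", [round(b, 3) for b in ok_raw])
print("product of z, centered:  ", [round(b, 3) for b in ok_cen])
```

In this sketch the software-style values move when the predictors are centered, while the product-of-z values do not, since z-scoring already removes the mean.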

Additionally, the authors reference critiques of standardized regression coefficients, noting that some researchers, such as Sirkin (2006), question their practical utility in prediction contexts. They also cite concerns regarding the standardization of dummy variables and factors, suggesting that the practice may not be advisable. To aid in understanding these issues, the authors provide R code in the Supplementary materials that demonstrates both the correct and incorrect methods for computing these coefficients.

Results

The results indicate that mean centering substantially reduced the correlations between the individual predictors and their interaction term, from 0.886 and 0.926 to 0.310 and 0.559, respectively. Centering did not alter the test of the interaction effect, which was significant (p = 0.016) in both the original and the centered analyses. The regression equations derived from the original and centered variables show that the slopes for temperature and humidity changed with centering, while the plotted interaction remained the same across analyses.
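The same pattern can be reproduced on made-up data. The Python sketch below is only an illustration: the balanced grid, the outcome `y`, and the helpers `ols` and `pearson` are hypothetical, and the numbers are not the paper's temperature and humidity data. Because the grid is perfectly balanced, centering drives the predictor-product correlation to exactly zero, an idealized case; real data usually retain some correlation, as in the values reported above.

```python
from statistics import mean

def ols(columns, y):
    """Return [intercept, b1, ...] via the normal equations (adequate for tiny examples)."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    a = [[sum(row[i] * row[j] for row in rows) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(rows, y))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [u - f * v for u, v in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]

def pearson(u, v):
    mu, mv = mean(u), mean(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5

# Hypothetical data: a fully crossed grid with a built-in interaction.
levels_x = [2.0, 4.0, 6.0, 8.0, 10.0]
levels_z = [1.0, 3.0, 5.0, 7.0, 9.0]
x = [xv for xv in levels_x for _ in levels_z]
z = [zv for _ in levels_x for zv in levels_z]
noise = [0.5 * ((i * i) % 7 - 3) for i in range(len(x))]  # deterministic pseudo-noise
y = [4 + 1.5 * xi - 0.8 * zi + 0.3 * xi * zi + e for xi, zi, e in zip(x, z, noise)]

mx, mz = mean(x), mean(z)
xc = [xi - mx for xi in x]
zc = [zi - mz for zi in z]
xz = [xi * zi for xi, zi in zip(x, z)]
xczc = [xi * zi for xi, zi in zip(xc, zc)]

r_raw, r_cen = pearson(x, xz), pearson(xc, xczc)
b_raw = ols([x, z, xz], y)        # [b0, b1, b2, b3] on raw predictors
b_cen = ols([xc, zc, xczc], y)    # same model on centered predictors

# Both equations give the same prediction for a hypothetical point (x=7, z=4).
px, pz = 7.0, 4.0
pred_raw = b_raw[0] + b_raw[1] * px + b_raw[2] * pz + b_raw[3] * px * pz
pred_cen = (b_cen[0] + b_cen[1] * (px - mx) + b_cen[2] * (pz - mz)
            + b_cen[3] * (px - mx) * (pz - mz))

print(f"corr(x, product): raw={r_raw:.3f}  centered={r_cen:.3f}")
print(f"interaction b3:   raw={b_raw[3]:.6f}  centered={b_cen[3]:.6f}")
print(f"slope of x:       raw={b_raw[1]:.6f}  centered={b_cen[1]:.6f}")
print(f"prediction:       raw={pred_raw:.6f}  centered={pred_cen:.6f}")
```

The first-order slope differs between the two fits because the centered slope equals the raw slope plus b3 times the mean of `z`; the interaction coefficient and the predictions are the same.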

Moreover, the hierarchical regression analysis of time spent on the exam showed a similar pattern: the correlation between Minutes and Minutes² fell from 0.995 to 0.510 after centering. The β coefficient reported for the quadratic term changed from -4.136 to -0.472, yet the authors deem both values incorrect, which highlights the need for careful interpretation of these coefficients. The findings emphasize that centering alters the interpretation of conditional effects without affecting the underlying statistical relationships, reinforcing the point that b₁ and b₂ should not be misconstrued as main effects.
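Why the unstandardized quadratic coefficient is unaffected by centering can be seen from a short derivation. The notation is assumed for illustration and is not taken from the paper: Y is the outcome, X is the predictor (here Minutes), x̄ is its sample mean, and X_c = X − x̄ is the centered predictor.

```latex
\begin{aligned}
Y &= b_0 + b_1 X + b_2 X^2, \qquad X = X_c + \bar{x} \\
  &= b_0 + b_1 (X_c + \bar{x}) + b_2 \left( X_c^2 + 2\bar{x} X_c + \bar{x}^2 \right) \\
  &= \left( b_0 + b_1 \bar{x} + b_2 \bar{x}^2 \right) + \left( b_1 + 2 b_2 \bar{x} \right) X_c + b_2 X_c^2
\end{aligned}
```

The coefficient b_2 is the same in both forms. Only the linear coefficient changes, from the slope of the curve at X = 0 to the slope of the curve at X = x̄.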

Discussion

The discussion section of the paper critically examines the practice of mean centering in regression analysis, particularly in the context of multicollinearity and the interpretation of regression coefficients. It highlights that while mean centering can reduce correlations between first-order predictors and their interaction or polynomial terms, this reduction does not necessarily address the underlying issues of essential versus non-essential collinearity. The authors reference various scholars who have debated the necessity and effectiveness of centering, noting that many claims about its benefits (such as improved interpretability of coefficients and reduced standard errors) are misleading or incorrect. For instance, centering does not alter the statistical significance or values of main effects or interactions, nor does it enhance the power of statistical tests.

Furthermore, the paper emphasizes that mean centering changes the interpretation of first-order coefficients from “global” to “local,” meaning that these coefficients represent conditional effects at specific values of the predictors rather than main effects. This shift complicates the interpretation of results, particularly in models with interactions. The authors argue that while centering may provide some interpretational advantages by ensuring that coefficients correspond to meaningful values, it does not fundamentally change the relationships among the predictors. Ultimately, the decision to center should be based on the analyst’s preference for visualization and interpretation, rather than on a belief in its necessity for reducing multicollinearity. The authors conclude that the perceived benefits of mean centering are largely illusory, and they advocate for a clearer understanding of its implications in regression analysis.
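The conditional nature of the first-order coefficients can be made explicit with a short derivation. The notation is assumed for illustration and is not taken from the paper: X and Z are the two predictors, x̄ and z̄ are their sample means, and X_c = X − x̄ and Z_c = Z − z̄ are the centered versions.

```latex
\begin{aligned}
Y &= b_0 + b_1 X + b_2 Z + b_3 XZ, \qquad X = X_c + \bar{x}, \quad Z = Z_c + \bar{z} \\
  &= \left( b_0 + b_1 \bar{x} + b_2 \bar{z} + b_3 \bar{x}\bar{z} \right)
     + \left( b_1 + b_3 \bar{z} \right) X_c
     + \left( b_2 + b_3 \bar{x} \right) Z_c
     + b_3 X_c Z_c
\end{aligned}
```

In the raw model, b_1 is the slope of X when Z = 0; in the centered model, the coefficient b_1 + b_3 z̄ is the slope of X when Z = z̄. Both are conditional effects at one particular value of Z, and the interaction coefficient b_3 is identical in the two forms.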

Limitations

In the “Limitations” section, the authors acknowledge that their findings are based on OLS regression models with continuous predictors. They expect the results to apply to linear models more broadly, citing previous studies (Hayes et al., 2012; Irwin and McClelland, 2001) in support. However, they note a significant limitation: the conclusions are strictly valid only for OLS models with continuous predictors, and the implications for models with discrete predictors remain uncertain.

The authors emphasize that the mathematical outcomes would remain the same if temperature and relative humidity were treated as discrete variables, but that further research is needed to delineate the conditions under which their conclusions may not apply to other types of predictors.