Properties and inference of the Pareto Lomax distribution with applications to real data

Journal: Scientific Reports, Volume: 16, Issue: 1
DOI: https://doi.org/10.1038/s41598-026-43273-6
PMID: https://pubmed.ncbi.nlm.nih.gov/41832259
Publication Date: 2026-03-14
Author(s): Ahmed Z. Afify et al.
Primary Topic: Statistical Distribution Estimation and Applications

Overview

This paper presents the odd Pareto-Lomax (OPLx) distribution, a four-parameter extension of the Lomax model designed for real-world data whose failure (hazard) rate may be decreasing, increasing, unimodal, J-shaped, or reversed-J-shaped. The authors derive the properties of the OPLx model and estimate its parameters with eight methods, including maximum likelihood. Simulation studies compare these estimators and identify the right-tail Anderson-Darling (RTAD) approach as the most reliable. Applied to three real-world datasets, the OPLx distribution outperforms other Lomax-based models, highlighting its flexibility and robustness in handling extreme values and heavy tails.

The findings suggest that the OPLx distribution is a useful addition to data modeling in reliability analysis, risk assessment, and survival analysis, where traditional models often fit poorly. Its capacity to capture complex hazard rate behaviors makes it a valuable tool for researchers and practitioners, particularly in applications involving imbalanced datasets and rare events. The authors plan future work on applying the OPLx distribution in machine learning, especially in medical data analysis, where it may improve prediction accuracy for rare and extreme occurrences. The stated aim is to improve the modeling of heavy-tailed data and so obtain more reliable conclusions across domains.

Introduction

The introduction highlights the growing demand for flexible and adaptable probability distributions to better capture diverse data patterns across various fields. The Lomax (Lx) distribution, initially proposed by Lomax, serves as a significant model for business failure data and has applications in areas such as income disparity, reliability engineering, and survival analysis due to its capacity to model heavy-tailed data. Researchers have expanded the Lx distribution through various generalizations, resulting in numerous novel distributions that enhance its versatility and applicability.

This study introduces the odd Pareto-Lomax (OPLx) distribution, which incorporates two additional shape parameters into the Lx model, building on the odd Pareto-G family proposed by Hussein et al. The authors derive several mathematical properties of the OPLx distribution and estimate its parameters using different techniques. Comprehensive simulations are conducted to evaluate the performance of these estimators, and the OPLx distribution is applied to three real-life datasets to illustrate its practical utility. The paper is structured to present the OPLx distribution, its properties, parameter estimation methods, simulation studies, and practical applications in subsequent sections.
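As a rough illustration of how a simulation study of this kind is usually organized, the sketch below repeatedly draws samples, applies an estimator, and records its bias and mean squared error. It is not the authors' design: a plain Lomax distribution stands in for OPLx (this summary does not give the OPLx quantile function), the symbols `a` (shape) and `s` (scale) and all function names are hypothetical, and the estimator is the closed-form ML estimate of the Lomax shape under the simplifying assumption that the scale is known.

```python
import math
import random


def lomax_sample(n, a, s, rng):
    """Draw n Lomax(shape=a, scale=s) values by inverse-transform sampling.

    Inverts the Lomax CDF F(x) = 1 - (1 + x/s)**(-a).
    """
    return [s * ((1.0 - rng.random()) ** (-1.0 / a) - 1.0) for _ in range(n)]


def shape_mle_known_scale(xs, s):
    """Closed-form ML estimate of the Lomax shape when the scale s is known."""
    return len(xs) / sum(math.log1p(x / s) for x in xs)


def bias_and_mse(n, a, s, replications=500, seed=1):
    """Monte Carlo estimate of the bias and MSE of the shape estimator."""
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    errors = [
        shape_mle_known_scale(lomax_sample(n, a, s, rng), s) - a
        for _ in range(replications)
    ]
    bias = sum(errors) / replications
    mse = sum(e * e for e in errors) / replications
    return bias, mse


bias, mse = bias_and_mse(n=200, a=2.0, s=1.0)
print(f"bias={bias:.4f}  mse={mse:.4f}")
```

Repeating the call for several sample sizes and for each estimator under comparison gives the kind of bias and MSE table on which estimators are usually ranked.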

Methods

In this section, the authors outline eight methods for estimating the parameters of the OPLx distribution: maximum likelihood (ML), least squares (LS), weighted least squares (WLS), maximum product of spacings (MPS), percentiles (PC), Cramér-von Mises (CRVM), Anderson-Darling (AD), and right-tail Anderson-Darling (RTAD) estimators. For each method, the estimates are obtained by numerically optimizing the corresponding objective function with the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm as implemented in R. The ML estimators of the parameters $\alpha$, $\beta$, $\theta$, and $\zeta$ maximize the log-likelihood function, while the LS estimators minimize the sum of squared differences between the cumulative distribution function $F(x; \alpha, \beta, \theta, \zeta)$ evaluated at the ordered observations and the expected values of the corresponding uniform order statistics.
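The ML and LS criteria can be sketched as plain objective functions. The code below is a minimal illustration under stated assumptions, not the paper's implementation: because this summary does not reproduce the OPLx CDF or density, a two-parameter Lomax model with hypothetical symbols `a` (shape) and `s` (scale) is used as a stand-in, and the function names and the data are invented for the example. The LS criterion uses the standard form, the sum over ordered observations of $(F(x_{(i)}) - i/(n+1))^2$.

```python
import math


def lomax_cdf(x, a, s):
    """Lomax CDF with shape a > 0 and scale s > 0 (stand-in for the OPLx CDF)."""
    return 1.0 - (1.0 + x / s) ** (-a)


def lomax_logpdf(x, a, s):
    """Log of the Lomax density f(x) = (a/s) * (1 + x/s)**(-(a + 1))."""
    return math.log(a / s) - (a + 1.0) * math.log1p(x / s)


def neg_log_likelihood(params, xs):
    """ML objective: the negative log-likelihood, to be minimized."""
    a, s = params
    if a <= 0 or s <= 0:
        return math.inf  # keeps an unconstrained optimizer inside the parameter space
    return -sum(lomax_logpdf(x, a, s) for x in xs)


def least_squares_objective(params, xs):
    """LS objective: sum of (F(x_(i)) - i/(n+1))**2 over the ordered sample."""
    a, s = params
    if a <= 0 or s <= 0:
        return math.inf
    n = len(xs)
    return sum(
        (lomax_cdf(x, a, s) - i / (n + 1)) ** 2
        for i, x in enumerate(sorted(xs), start=1)
    )


# Hypothetical data, chosen only to exercise the two objectives.
data = [0.2, 0.5, 0.9, 1.4, 2.3, 3.1, 4.8, 7.5]
for guess in [(1.0, 1.0), (2.0, 3.0)]:
    print(guess, neg_log_likelihood(guess, data), least_squares_objective(guess, data))
```

Either function can then be handed to a quasi-Newton optimizer. In R that would be `optim` with `method = "BFGS"`, and in Python `scipy.optimize.minimize` with `method="BFGS"` plays the same role; the optimizer call is left out here so that the sketch depends only on the standard library.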

The WLS estimators minimize a weighted sum of the same squared differences, and the CRVM, AD, and right-tail AD estimators minimize the corresponding distances between the fitted and empirical distribution functions. The MPS estimators maximize the sum of the logarithms of the spacings between consecutive fitted CDF values at the ordered observations, while the PC estimators minimize the squared distance between the sample order statistics and the model quantiles. The paper gives the objective function for each method.
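The remaining criteria can be written compactly once the model CDF is evaluated at the ordered sample. The sketch below uses the standard textbook forms of each objective, which are assumed rather than confirmed to match the paper's equations. The function names are hypothetical, and the model is passed in as a callable so that any CDF (or quantile function, for the percentile method) can be plugged in.

```python
import math


def _ordered_cdf(cdf, xs):
    """Model CDF evaluated at the order statistics x_(1) <= ... <= x_(n)."""
    return [cdf(x) for x in sorted(xs)]


def wls(cdf, xs):
    """Weighted least squares: LS terms weighted by 1 / Var(F(X_(i)))."""
    F = _ordered_cdf(cdf, xs)
    n = len(F)
    return sum(
        (n + 1) ** 2 * (n + 2) / (i * (n - i + 1)) * (F[i - 1] - i / (n + 1)) ** 2
        for i in range(1, n + 1)
    )


def cramer_von_mises(cdf, xs):
    F = _ordered_cdf(cdf, xs)
    n = len(F)
    return 1.0 / (12 * n) + sum(
        (F[i - 1] - (2 * i - 1) / (2 * n)) ** 2 for i in range(1, n + 1)
    )


def anderson_darling(cdf, xs):
    F = _ordered_cdf(cdf, xs)
    n = len(F)
    total = sum(
        (2 * i - 1) * (math.log(F[i - 1]) + math.log(1.0 - F[n - i]))
        for i in range(1, n + 1)
    )
    return -n - total / n


def right_tail_anderson_darling(cdf, xs):
    F = _ordered_cdf(cdf, xs)
    n = len(F)
    total = sum((2 * i - 1) * math.log(1.0 - F[n - i]) for i in range(1, n + 1))
    return n / 2.0 - 2.0 * sum(F) - total / n


def mean_log_spacing(cdf, xs):
    """MPS criterion (to be maximized): mean log of the n + 1 CDF spacings."""
    F = [0.0] + _ordered_cdf(cdf, xs) + [1.0]
    return sum(math.log(F[i] - F[i - 1]) for i in range(1, len(F))) / (len(F) - 1)


def percentile_objective(quantile, xs):
    """PC criterion: squared distance between order statistics and model quantiles."""
    n = len(xs)
    return sum(
        (x - quantile(i / (n + 1))) ** 2 for i, x in enumerate(sorted(xs), start=1)
    )


def uniform_cdf(x):
    """CDF (and also quantile function) of Uniform(0, 1), used as a toy model."""
    return x


sample = [0.1, 0.35, 0.5, 0.8]  # hypothetical values in (0, 1)
for fn in (wls, cramer_von_mises, anderson_darling,
           right_tail_anderson_darling, mean_log_spacing):
    print(fn.__name__, fn(uniform_cdf, sample))
print("percentile_objective", percentile_objective(uniform_cdf, sample))
```

All criteria except the MPS one are minimized; to use a minimizer for MPS, negate `mean_log_spacing`.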

Discussion

The OPLx distribution, a four-parameter extension of the Lomax model, is flexible enough to model a variety of real-world datasets, particularly those with extreme values or heavy tails. Its cumulative distribution function (CDF) and probability density function (PDF) are built from those of the Lx distribution, with additional shape parameters that allow a wide range of failure rate behaviors. The model's effectiveness is supported by simulations and by real data applications, in which it consistently outperforms competing models, including traditional distributions such as Weibull and gamma as well as other Lomax-based models.
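For reference, the baseline Lomax distribution can be written as follows. The symbols $a$ (shape) and $s$ (scale) are chosen here for illustration only, since this summary does not say which of $\alpha$, $\beta$, $\theta$, and $\zeta$ play these roles, and the OPLx CDF itself is not reproduced.

$$
F_{\mathrm{Lx}}(x) = 1 - \left(1 + \frac{x}{s}\right)^{-a}, \qquad
f_{\mathrm{Lx}}(x) = \frac{a}{s}\left(1 + \frac{x}{s}\right)^{-(a+1)}, \qquad
h_{\mathrm{Lx}}(x) = \frac{f_{\mathrm{Lx}}(x)}{1 - F_{\mathrm{Lx}}(x)} = \frac{a}{s + x}, \qquad x > 0 .
$$

The Lomax hazard rate $h_{\mathrm{Lx}}$ is strictly decreasing in $x$ for every admissible $a$ and $s$, so the baseline model cannot represent increasing, unimodal, or J-shaped failure rates. This is the limitation that additional shape parameters are meant to remove.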

In the analysis of three datasets (bladder cancer remission times, guinea pig survival times, and fatigue fracture life of materials), the OPLx distribution achieved the lowest values of the goodness-of-fit statistics considered, indicating the best fit among the compared models. The parameter estimates were stable and precise, supporting the model's reliability for practical applications in medical research, engineering, and materials science. The findings suggest that the OPLx distribution addresses limitations of existing models and provides a robust framework for analyzing complex datasets, particularly those with non-standard failure rate behaviors. Future research will explore the inclusion of covariates and the application of the OPLx distribution in machine learning contexts, particularly in medical data analysis, where capturing rare and extreme events is crucial.
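This summary does not name the goodness-of-fit statistics that were compared. As an assumption, the sketch below shows three measures that are commonly used for this kind of model comparison: the Akaike and Bayesian information criteria and the Kolmogorov-Smirnov distance. Lower is better for all three. The function names and the log-likelihood values are made up for illustration and are not results from the paper.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion for a model with k fitted parameters."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * log_lik


def ks_distance(cdf, xs):
    """Kolmogorov-Smirnov distance between the empirical CDF and a fitted CDF."""
    ordered = sorted(xs)
    n = len(ordered)
    return max(
        max(i / n - cdf(x), cdf(x) - (i - 1) / n)
        for i, x in enumerate(ordered, start=1)
    )


# Made-up log-likelihoods for two hypothetical fits to n = 100 observations.
candidates = {"model A (4 parameters)": (-210.4, 4), "model B (2 parameters)": (-214.9, 2)}
for name, (log_lik, k) in candidates.items():
    print(name, round(aic(log_lik, k), 2), round(bic(log_lik, k, 100), 2))
```

Because AIC and BIC penalize the number of parameters, a four-parameter model is preferred over a two-parameter one only when its log-likelihood gain outweighs the penalty.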