Predicting total healthcare demand using machine learning: separate and combined analysis of predisposing, enabling, and need factors

Journal: BMC Health Services Research, Volume: 25, Issue: 1
DOI: https://doi.org/10.1186/s12913-025-12502-5
PMID: https://pubmed.ncbi.nlm.nih.gov/40075408
Publication Date: 2025-03-12
Author(s): Fatih Orhan et al.
Primary Topic: Healthcare Systems and Reforms

Overview

This study investigates healthcare demand prediction using machine learning techniques, drawing on data from the 2022 Turkey Health Survey conducted by the Turkish Statistical Institute (TUIK) and framed within Andersen’s Behavioral Model of Health Services Use. It identifies predisposing, enabling, and need factors that influence healthcare utilization, employing seven machine learning models: Decision Tree, Random Forest, Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Logistic Regression, XGBoost, and Gradient Boosting. The analysis reveals that significant predictors include gender, education level, and age group (predisposing factors); treatment costs, community interest, and payment difficulties (enabling factors); and smoking status, chronic diseases, and overall health status (need factors). The models exhibited high recall (approximately 0.90) and F1 scores (0.87 to 0.88), with Gradient Boosting, XGBoost, and Logistic Regression demonstrating superior predictive accuracy.
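To make the reported metrics concrete, the following minimal sketch computes precision, recall, and F1 from the counts of a binary confusion matrix. The counts are hypothetical, chosen only so that recall comes out at 0.90 and F1 near 0.87; they are not the study's actual confusion matrix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts from a binary confusion matrix."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def precision(c: Confusion) -> float:
    """Share of predicted positives that are truly positive."""
    return c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0


def recall(c: Confusion) -> float:
    """Share of true positives that the model actually finds."""
    return c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0


def f1_score(c: Confusion) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts for illustration only (not taken from the study).
example = Confusion(tp=90, fp=16, fn=10, tn=84)
print(f"precision={precision(example):.3f}")
print(f"recall={recall(example):.3f}")
print(f"f1={f1_score(example):.3f}")
```

Because F1 is a harmonic mean, a recall of 0.90 paired with an F1 of about 0.87 implies a precision somewhat below the recall, as in this illustration.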

The findings underscore the utility of machine learning in enhancing healthcare policy and resource allocation. The study highlights the importance of addressing structural inequalities, particularly in mental health services and dental care access, to improve healthcare demand management. Recommendations include expanding psychiatric services, enhancing chronic disease management, and implementing public health initiatives to reduce smoking prevalence. These strategic interventions aim to optimize healthcare resource allocation and improve overall health outcomes, particularly for disadvantaged populations. The research contributes valuable insights into the factors affecting healthcare access and utilization, advocating for data-driven policy formulation to address identified challenges.

Introduction

The introduction addresses the complex nature of healthcare demand and its implications for health systems and policy. It emphasizes the importance of predicting this demand accurately, particularly with machine learning (ML) techniques applied to large datasets. Using microdata from the 2022 Turkey Health Survey, the research is grounded in Andersen’s Behavioral Model of Health Services Use, which groups the factors influencing healthcare utilization into three main types: predisposing factors (demographics and social structure), enabling factors (access to services), and perceived need factors (individual health perceptions). The study aims to analyze the separate and combined effects of these factors on healthcare demand, applying several ML models and evaluating them with metrics such as accuracy, precision, recall, and F1 score.
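The idea of analyzing the factor groups separately and in combination can be sketched as follows. The column names are hypothetical placeholders, since the survey's actual variable names are not given here. The sketch enumerates every non-empty combination of the three groups, which covers each group alone and all three together; whether the authors also evaluated pairwise combinations is an assumption.

```python
from itertools import combinations

# Hypothetical column names standing in for the survey variables.
FACTOR_GROUPS = {
    "predisposing": ["gender", "education_level", "age_group"],
    "enabling": ["treatment_costs", "payment_difficulties"],
    "need": ["smoking_status", "chronic_disease", "general_health"],
}


def feature_sets(groups):
    """Map a label such as 'predisposing+need' to the columns it covers,
    for every non-empty combination of factor groups."""
    sets = {}
    names = list(groups)
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            label = "+".join(combo)
            sets[label] = [col for name in combo for col in groups[name]]
    return sets


for label, cols in feature_sets(FACTOR_GROUPS).items():
    print(f"{label}: {len(cols)} features")
```

Each resulting column list could then be passed to the same model and evaluation routine, so that any difference in scores is attributable to the factor groups included.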

The introduction further highlights the significance of each factor category. Predisposing factors, such as age and education, are shown to influence service utilization, while enabling factors, including income and health insurance, are critical for access to care. The study notes that economic disparities and the rising prevalence of chronic diseases in Turkey exacerbate healthcare demand challenges. By applying ML techniques, the research seeks to uncover complex relationships among these factors, enhancing the understanding of healthcare demand dynamics and informing effective resource allocation and health policy development. Ultimately, the study aims to leverage ML to improve predictions of healthcare demand, thereby facilitating targeted interventions and more efficient healthcare resource distribution.

Methods

In this study, we employed seven machine learning methods: logistic regression (LR), k-nearest neighbors (KNN), support vector machine (SVM), random forest (RF), decision tree (DT), gradient boosting (GB), and extreme gradient boosting (XGBoost). The selection of these techniques was informed by previous research conducted by Feretzakis et al., ensuring methodological rigor and alignment with current trends in machine learning applications in healthcare. These models were used to analyze the factors influencing the total number of services received by individuals in Turkey and to identify the variables with the greatest impact on this outcome.

We systematically compared the performance of the seven models in identifying these influential factors, which provided insight into predicting overall service demand among individuals. Because each model has different strengths depending on the characteristics and structure of the data, comparing them strengthens the predictive accuracy and robustness of our findings. This approach underscores the importance of model selection in drawing meaningful conclusions from healthcare data.
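A comparison of this kind can be sketched as a loop that fits each model on the same training split and scores it on the same held-out split. The sketch below is only an illustration under stated assumptions: to stay self-contained it uses two simple stand-in classifiers written in plain Python (a majority-class baseline and a k-NN) instead of the seven library models, the data are synthetic, and the 150/50 split is arbitrary. The study's actual split, tuning, and outcome coding are not described here.

```python
import random
from collections import Counter


class MajorityBaseline:
    """Always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class KNearestNeighbors:
    """Plain k-NN with squared Euclidean distance and majority vote."""

    def __init__(self, k=5):
        self.k = k

    def fit(self, X, y):
        self.X_, self.y_ = list(X), list(y)
        return self

    def predict(self, X):
        preds = []
        for row in X:
            nearest = sorted(
                zip(self.X_, self.y_),
                key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], row)),
            )[: self.k]
            votes = Counter(label for _, label in nearest)
            preds.append(votes.most_common(1)[0][0])
        return preds


def recall_and_f1(y_true, y_pred):
    """Recall and F1 for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return recall, f1


def compare(models, X_train, y_train, X_test, y_test):
    """Fit every model on the same split and collect (recall, F1)."""
    results = {}
    for name, model in models.items():
        y_pred = model.fit(X_train, y_train).predict(X_test)
        results[name] = recall_and_f1(y_test, y_pred)
    return results


# Synthetic two-feature data; the label rule is arbitrary.
rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(200)]
y = [1 if a + b > 1.0 else 0 for a, b in X]
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

models = {"majority": MajorityBaseline(), "knn": KNearestNeighbors(k=5)}
for name, (rec, f1) in compare(models, X_train, y_train, X_test, y_test).items():
    print(f"{name}: recall={rec:.2f} f1={f1:.2f}")
```

Keeping the split and the scoring function fixed across models is what makes the scores comparable; any object with `fit` and `predict` methods could be added to the `models` dictionary.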

Results

The models exhibited high recall (approximately 0.90) and F1 scores between 0.87 and 0.88. Gradient Boosting, XGBoost, and Logistic Regression demonstrated the best predictive accuracy among the seven models.

The significant predictors fall into the three categories of the Andersen model: gender, education level, and age group among the predisposing factors; treatment costs, community interest, and payment difficulties among the enabling factors; and smoking status, chronic diseases, and overall health status among the need factors.

Discussion

The discussion analyzes the findings from the 2022 Turkey Health Survey, which provides a comprehensive overview of healthcare service utilization, health behaviors, and perceptions among individuals aged 15 and over in Turkey. The study’s outcome variable, defined as the “Total Number of Healthcare Services Utilized,” is a composite measure derived from the various healthcare services received by participants, including inpatient and outpatient treatments, dental services, and more. The results indicate a right-skewed distribution of service utilization: most individuals use a small number of services, while a small subset shows markedly higher utilization.
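The composite outcome and its right skew can be illustrated with a minimal sketch. The records, the three service fields, and their values are hypothetical; the survey includes more service types, and its exact coding is not given here. A mean above the median and a positive skewness coefficient are the usual signs of a right-skewed distribution.

```python
from statistics import mean, median, pstdev

# Hypothetical respondents; each value is a count of services of that type.
respondents = [
    {"inpatient": 0, "outpatient": 1, "dental": 0},
    {"inpatient": 0, "outpatient": 0, "dental": 0},
    {"inpatient": 0, "outpatient": 1, "dental": 1},
    {"inpatient": 0, "outpatient": 1, "dental": 0},
    {"inpatient": 0, "outpatient": 0, "dental": 1},
    {"inpatient": 0, "outpatient": 0, "dental": 0},
    {"inpatient": 1, "outpatient": 1, "dental": 1},
    {"inpatient": 3, "outpatient": 7, "dental": 2},
]


def total_services(record):
    """Composite outcome: sum of all service counts for one respondent."""
    return sum(record.values())


def skewness(values):
    """Population skewness (third standardized moment).
    Positive values indicate a longer right tail."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        return 0.0
    return mean([((v - m) / s) ** 3 for v in values])


totals = [total_services(r) for r in respondents]
print("totals:", totals)
print("mean:", mean(totals), "median:", median(totals))
print("skewness:", round(skewness(totals), 2))
```

In this toy sample a single heavy user pulls the mean well above the median, which mirrors the pattern described above.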

The authors employ the Andersen model to categorize predictors of healthcare service utilization into predisposing, enabling, and need factors. Key demographic and socioeconomic characteristics, such as gender, age, education level, and employment status, are identified as predisposing factors influencing access to healthcare. Enabling factors include health insurance coverage and personal financial resources, while need factors encompass chronic diseases and health conditions that necessitate healthcare services. The study highlights the importance of these variables in understanding healthcare access and utilization, providing valuable insights for policymakers to address inequalities and improve healthcare planning in Turkey. The dataset, comprising 22,742 participants, is noted for its completeness, enhancing the reliability of the statistical analyses conducted.