An Analysis of Machine Learning for Detecting Depression, Anxiety, and Stress of Recovered COVID-19 Patients

Journal: Journal of Human Earth and Future, Volume: 5, Issue: 1
DOI: https://doi.org/10.28991/hef-2024-05-01-01
Publication Date: 2024-02-24
Author(s): Tuan Anh Tran et al.
Primary Topic: Mental Health via Writing

Overview

This study compares three machine learning models, K-Nearest Neighbors (KNN), Multilayer Perceptron (MLP), and Support Vector Machine (SVM), to determine which detects mental health issues among recovered COVID-19 patients most quickly and accurately. Using a dataset of factors related to depression, anxiety, and stress, the research also examines two feature selection techniques, Recursive Feature Elimination (RFE) and Extra Trees (ET), together with hyper-parameter tuning to improve predictive performance. SVM outperforms the other models, with an accuracy of at least 0.984 on each of the three targets, and ET is identified as the better feature selection method for the anxiety and stress data.

The study contributes to predictive modeling of mental health by offering guidance on choosing machine learning models and feature selection methods for early detection of mental health conditions. The authors point to the potential of combining physiological and biochemical markers with survey data to develop personalized intervention strategies based on machine learning predictions, and plan to incorporate these markers in future work to improve mental health support for individuals recovering from COVID-19.

Introduction

The COVID-19 pandemic, caused by SARS-CoV-2, has profoundly impacted global health, particularly among recovered patients, often referred to as “long-haulers.” These individuals frequently experience persistent symptoms that extend beyond the acute phase of the illness, with mental health issues such as depression, anxiety, and stress (DAS) emerging as significant concerns. Symptoms of depression include persistent sadness and energy depletion, while anxiety is characterized by fear and difficulty concentrating. Stress manifests as emotional tension, leading to low energy and agitation. Recognizing these symptoms is essential for accurate clinical diagnosis and the development of effective mental health support strategies.

This study addresses a gap in research on the mental health of recovered COVID-19 patients by using machine learning (ML) to predict levels of DAS. It compares K-Nearest Neighbors (KNN), Multilayer Perceptron (MLP), and Support Vector Machine (SVM) models to identify the most accurate predictor. Feature selection methods, namely Recursive Feature Elimination (RFE) and Extra Trees (ET), are applied to improve model performance. Model accuracy is assessed with k-fold cross-validation (\( k = 10 \)), with the overall aim of identifying the best ML model for predicting mental health outcomes in this population.
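To make the evaluation protocol concrete, here is a minimal sketch of k-fold cross-validation in plain Python. It is an illustration under assumptions, not the authors' code: the function names and the `fit_predict` callable are hypothetical, folds are taken as contiguous blocks, and no shuffling or stratification is applied (a real experiment would normally shuffle, and often stratify, before splitting).

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k contiguous, near-equal folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def cross_val_accuracy(fit_predict, X, y, k=10):
    """Average accuracy over k folds.

    `fit_predict(X_train, y_train, X_test)` is a hypothetical callable that
    trains any model and returns predicted labels for X_test.
    """
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k):
        X_train = [X[i] for i in train_idx]
        y_train = [y[i] for i in train_idx]
        X_test = [X[i] for i in test_idx]
        preds = fit_predict(X_train, y_train, X_test)
        correct = sum(p == y[i] for p, i in zip(preds, test_idx))
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)
```

With 549 samples (the dataset size given in the Discussion) and \( k = 10 \), this split produces nine folds of 55 samples and one of 54; each sample is used for testing exactly once and for training in the other nine rounds.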

Methods

For depression prediction, the methodology centers on feature selection and hyper-parameter tuning. Recursive Feature Elimination (RFE) identified 14 significant features, including low self-worth (Q17) and difficulty initiating tasks (Q5), with a mean accuracy of 0.848. Extra Trees (ET) selected 11 features, with a slightly higher mean accuracy of 0.851. The features selected by each method were then used when tuning the hyper-parameters of the three models: K-Nearest Neighbors (KNN), Multilayer Perceptron (MLP), and Support Vector Machine (SVM).

The SVM model trained on the ET-selected features achieved the highest accuracy, 0.984, making it the best model for depression prediction. The MLP model also performed well, with an accuracy of 0.980 and an F1-score of 0.915. After tuning, KNN used the ‘brute’ algorithm with 5 neighbors, MLP used an ‘identity’ activation function with a hidden-layer setting of 100 (reported as 100 hidden layers, though this more plausibly denotes 100 units in a single hidden layer), and SVM used a linear kernel with the C parameter set to 1. These results support ET as the feature selection method and SVM as the model of choice for predicting depression levels in the studied population.
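As an illustration of what a ‘brute’ KNN with 5 neighbors does, the sketch below compares a query point against every training point and takes a majority vote among the five closest. This is a plain-Python sketch, not the authors' implementation: the summary does not name the library used, Euclidean distance is an assumption, and the toy scores and labels are invented for the example.

```python
import math
from collections import Counter


def knn_predict(X_train, y_train, x, n_neighbors=5):
    """Classify x by majority vote among its n_neighbors nearest training points.

    "Brute force" means every training point is compared with x; no KD-tree
    or ball-tree index is built.
    """
    distances = sorted(
        (math.dist(row, x), idx) for idx, row in enumerate(X_train)
    )
    nearest_labels = [y_train[idx] for _, idx in distances[:n_neighbors]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Invented two-item questionnaire scores (0-3) and labels, for illustration only.
X_train = [[0, 0], [0, 1], [1, 0], [1, 1], [0, 2],
           [3, 3], [3, 2], [2, 3], [2, 2], [3, 1]]
y_train = ["normal"] * 5 + ["elevated"] * 5

low_scores_label = knn_predict(X_train, y_train, [0, 1])
high_scores_label = knn_predict(X_train, y_train, [3, 3])
```

Brute-force search costs one distance computation per training point for each query, which is inexpensive for a dataset of a few hundred records.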

Discussion

In this study, a machine learning (ML) framework was developed to detect depression, anxiety, and stress in recovered COVID-19 patients using a dataset of 549 individuals from Dong Thap province, Vietnam. The research adhered to ethical guidelines and involved a comprehensive analysis of sociodemographic factors, underlying health conditions, and mental health attributes. The framework consisted of four phases: data pre-processing, feature selection, hyper-parameter tuning, and optimal prediction model selection. Notably, the Support Vector Machine (SVM) model demonstrated superior performance across all datasets, achieving accuracies of 0.984 for depression, 1.00 for anxiety, and 0.991 for stress, indicating its effectiveness in predicting mental health conditions.
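Accuracy figures like these are easiest to interpret next to a class-sensitive metric such as the F1-score, which the summary reports for the MLP model. The sketch below shows how the two can diverge when one class is rare. It assumes a binary task for simplicity (the summary does not state how many classes were predicted or how F1 was averaged), and the confusion-matrix counts are invented, not taken from the study.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Return (accuracy, F1) for a binary confusion matrix."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1


# Invented counts: 9 positive cases out of 100.
good_acc, good_f1 = accuracy_and_f1(tp=8, fp=1, fn=1, tn=90)

# A model that always predicts "negative" on the same data.
naive_acc, naive_f1 = accuracy_and_f1(tp=0, fp=0, fn=9, tn=91)
```

In this invented example the always-negative model still reaches an accuracy of 0.91 while its F1-score is 0, which is why a high accuracy on imbalanced labels is more convincing when F1 is reported alongside it.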

The findings corroborate existing literature on the prevalence of mental health issues among individuals recovering from COVID-19, particularly in relation to underlying health conditions such as hypertension and diabetes. Although sociodemographic and COVID-related factors were not critical to model accuracy, models that included them still produced significant results, which suggests these factors may influence mental well-being. The study acknowledges limitations, including its single-phase data collection, which may miss how mental health conditions change over time. Future research is encouraged to explore additional dimensions of mental health and to integrate physiological and biochemical markers to improve predictive accuracy and support personalized intervention strategies.