Depression detection in social media posts using transformer-based models and auxiliary features

Journal: Social Network Analysis and Mining, Volume: 14, Issue: 1
DOI: https://doi.org/10.1007/s13278-024-01360-4
Publication Date: 2024-09-28
Author(s): Marios Kerasiotis et al.
Primary Topic: Mental Health via Writing

Overview

This study addresses depression detection in social media posts and highlights the limitations of traditional machine learning algorithms in capturing complex textual patterns. To improve accuracy and robustness, it proposes a neural network architecture that combines a transformer-based model, specifically DistilBERT, with metadata and linguistic markers. The model extracts information from the last four layers of the transformer and uses dropout layers to mitigate overfitting, achieving weighted precision, recall, and F1-scores of 84.26%, 84.18%, and 84.15%, respectively. Data augmentation techniques inspired by Easy Data Augmentation (EDA) improve performance substantially, raising the weighted F1-score from 72.59% to 84.15%.
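
To make the augmentation idea concrete, the following is a minimal sketch of two EDA-style operations, random swap and random deletion, using only the Python standard library. It is an illustration under stated assumptions, not the authors’ pipeline: the summary does not say which operations or settings were used, and the function names (`random_swap`, `random_deletion`, `augment`) and default parameters are hypothetical. EDA as commonly described also includes synonym replacement and random insertion, which are left out here because they need an external lexicon.

```python
import random


def random_swap(words, n_swaps, rng):
    """Return a copy of `words` after swapping two random positions `n_swaps` times."""
    words = list(words)
    if len(words) < 2:
        return words
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words


def random_deletion(words, p, rng):
    """Drop each word with probability `p`, always keeping at least one word."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]


def augment(text, n_copies=2, p_delete=0.1, n_swaps=1, seed=0):
    """Generate `n_copies` perturbed variants of `text` (hypothetical helper)."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    words = text.split()
    variants = []
    for _ in range(n_copies):
        perturbed = random_swap(words, n_swaps, rng)
        perturbed = random_deletion(perturbed, p_delete, rng)
        variants.append(" ".join(perturbed))
    return variants


if __name__ == "__main__":
    # Invented example sentence, not taken from any dataset.
    for variant in augment("i have not slept well for several weeks now", n_copies=3):
        print(variant)
```

In an imbalanced dataset, variants like these would typically be generated only for under-represented classes so that class sizes move closer together.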

The study presents an approach to early identification of depression severity levels in social media content that addresses common challenges such as overfitting and dataset biases. The proposed model uses a BERT-based transformer (DistilBERT) for detailed linguistic analysis and incorporates auxiliary features to enhance comprehension. In extensive experiments, the model outperforms existing state-of-the-art methods by approximately 10.94% in weighted F1-score. The findings suggest that the model could support real-time monitoring of mental health on social media, potentially transforming intervention strategies and informing public health policies. The work offers a framework for future studies in depression detection using natural language processing and deep learning techniques.

Introduction

The introduction of this research paper highlights the significant rise in mental health disorders, particularly depression, exacerbated by the COVID-19 pandemic, with WHO data indicating a 26% increase in anxiety and a 28% increase in depression cases. The paper emphasizes the potential of social media as a rich source of data for understanding mental health issues, noting advancements in machine learning models for detecting depression in social media posts. However, traditional machine learning algorithms often struggle with the complexities of mental health-related textual data, leading to suboptimal performance and generalization issues.

To address these limitations, the authors propose a neural network architecture built on the distilbert-base-uncased transformer model. The architecture extracts information from the last four layers of the transformer and combines the resulting contextual embeddings with additional metadata and linguistic markers. It incorporates dropout layers to prevent overfitting and uses a Multi-Layer Perceptron (MLP) for classification. Key contributions include a data augmentation approach for handling imbalanced datasets, a method for assessing depression severity, a weighted-average technique for extracting hidden states, and a series of ablation experiments that validate the proposed architecture’s effectiveness. The approach aims to improve the accuracy of identifying depressive posts on social media.
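
The weighted average over the last four layers can be illustrated with a small sketch. How the paper derives the layer weights is not stated in this summary, so the version below assumes one raw score per layer, normalized with a softmax; that choice, the function names, and the toy vectors are all assumptions made for illustration. Plain Python lists stand in for the tensors a PyTorch model would use.

```python
import math


def softmax(scores):
    """Normalize raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_layer_average(layer_vectors, layer_scores):
    """Combine one vector per layer into a single vector.

    layer_vectors: equal-length vectors, one per transformer layer
    layer_scores:  one raw scalar per layer (assumed learnable in a real model)
    """
    if len(layer_vectors) != len(layer_scores):
        raise ValueError("need exactly one score per layer")
    weights = softmax(layer_scores)
    dim = len(layer_vectors[0])
    return [
        sum(w * vec[d] for w, vec in zip(weights, layer_vectors))
        for d in range(dim)
    ]


if __name__ == "__main__":
    # Toy 2-dimensional stand-ins for the last four layers (real vectors have 768 dims).
    last_four = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
    print(weighted_layer_average(last_four, [0.0, 0.0, 0.0, 0.0]))
```

With equal scores the result reduces to a plain mean of the four layers; unequal scores let the combination favor some layers over others.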

Methods

This section details the experimental setup for the neural network model. The implementation used Python notebooks on Google Colab and Kaggle, with Hugging Face’s Transformers, PyTorch, and nlpaug for text augmentation. Performance was evaluated with metrics from the scikit-learn library. Training ran on T4 and P100 GPUs for 11 epochs with a batch size of 8, using 80% of the dataset for training. The model was trained and tested across five runs, and results are reported as mean and standard deviation.
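
Reporting a mean and standard deviation over five runs can be sketched as follows. The scores are invented placeholders, not values from the paper, and the use of the sample standard deviation is an assumption, since the summary does not say which variant the authors report.

```python
from statistics import mean, stdev


def summarize_runs(scores):
    """Return (mean, sample standard deviation) of per-run scores."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate a standard deviation")
    return mean(scores), stdev(scores)


# Hypothetical weighted F1-scores from five runs; NOT results from the paper.
example_f1 = [0.80, 0.82, 0.81, 0.79, 0.83]
avg, sd = summarize_runs(example_f1)
print(f"weighted F1: {avg:.3f} ± {sd:.3f}")
```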

The model was optimized with the Cross-Entropy Loss function and the Adam optimizer at a learning rate of \(1 \times 10^{-5}\). Key hyperparameters were a DistilBERT hidden dimension of 768, a three-layer MLP head (Linear, ReLU, Linear) with a hidden size of 512, and a dropout rate of 0.1. This setup was used to evaluate the model’s ability to classify depressive posts. A summary of the hyperparameter values is provided in Table II of the paper.
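
As a reading aid, the classification head and loss described above can be written compactly. The notation is an assumption introduced here, not taken from the paper: \(x \in \mathbb{R}^{d}\) is the feature vector fed to the head (its exact composition and size \(d\) are not given in this summary), \(W_1 \in \mathbb{R}^{512 \times d}\) and \(W_2 \in \mathbb{R}^{C \times 512}\) are the two linear layers with biases \(b_1\) and \(b_2\), \(C\) is the number of classes, and \(y\) is the index of the true class. Dropout is omitted because its placement is not stated.

\[
h = \mathrm{ReLU}(W_1 x + b_1), \qquad z = W_2 h + b_2, \qquad \mathcal{L}(x, y) = -\log \frac{\exp(z_y)}{\sum_{c=1}^{C} \exp(z_c)}
\]

Under these assumptions, training minimizes \(\mathcal{L}\) averaged over each batch, with Adam updating \(W_1, b_1, W_2, b_2\) and the transformer weights.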

Results

In the results section, the authors evaluate the proposed model for detecting depressive symptoms in social media content. The model, which uses an MLP classification head with a hidden size of 512 on top of the ‘distilbert-base-uncased’ architecture, achieved a weighted precision of 84.26%, a weighted recall of 84.18%, and a weighted F1-score of 84.15%. This is an improvement of approximately 23.86% in weighted F1-score over the worst-performing baseline model and about 10.99% over the best-performing baseline model. Comparisons with existing models, such as M-MentalBERT and various BERT configurations, show that the proposed model outperforms these baselines across all metrics, highlighting its robustness in detecting depression severity.

Additionally, the results indicate that data augmentation methods contributed positively to the model’s performance, with increases of 12.26% in weighted precision, 10.75% in recall, and 11.52% in F1-score. The findings also reveal that the proposed approach surpasses competitive baselines in weighted precision (by 9.78-11.27%), recall (by 10.95-14.10%), and F1-score (by 10.99-14.19%). The confusion matrix presented further elucidates the model’s classification accuracy across four severity levels of depression, revealing strengths and areas for improvement in distinguishing between these levels. Overall, the research contributes valuable insights into the effectiveness of advanced machine learning techniques for mental health assessment in social media contexts.
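
For readers unfamiliar with weighted metrics, the sketch below shows how weighted precision, recall, and F1 follow from a multi-class confusion matrix: each per-class score is weighted by that class’s share of the true labels, which is the usual meaning of "weighted" averaging in scikit-learn. The function name and the 4x4 matrix are hypothetical; the counts are invented and are not the paper’s confusion matrix.

```python
def weighted_metrics(cm):
    """Compute support-weighted precision, recall, and F1.

    cm[i][j] is the number of samples whose true class is i and predicted class is j.
    """
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    precision = recall = f1 = 0.0
    for k in range(n_classes):
        tp = cm[k][k]
        support = sum(cm[k])                               # true samples of class k
        predicted = sum(cm[i][k] for i in range(n_classes))  # samples predicted as k
        p = tp / predicted if predicted else 0.0
        r = tp / support if support else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support / total
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return precision, recall, f1


if __name__ == "__main__":
    # Invented counts for four severity levels; rows are true labels, columns are predictions.
    example_cm = [
        [8, 2, 0, 0],
        [1, 7, 2, 0],
        [0, 1, 8, 1],
        [0, 0, 2, 8],
    ]
    p, r, f = weighted_metrics(example_cm)
    print(f"precision={p:.4f} recall={r:.4f} f1={f:.4f}")
```

Weighted recall computed this way equals overall accuracy, so a gap between weighted precision and weighted recall points to classes that are over- or under-predicted.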

Discussion

The discussion section of the research paper highlights the evolving landscape of machine learning and natural language processing (NLP) techniques used to detect depression through social media analysis. Various studies have employed traditional machine learning algorithms and neural networks to analyze data from platforms like Twitter and Reddit, revealing significant insights into the linguistic and emotional patterns associated with mental health disorders. For instance, Aragón et al. introduced a method that combines fine-grained emotions and temporal patterns, achieving improved interpretability and performance in detecting mental disorders. Similarly, Guntuku et al. emphasized the predictive power of Facebook language over Twitter for measuring psychological stress, while Cacheda et al. demonstrated the effectiveness of random forest classifiers for early depression detection.

The section also discusses the role of transformer models, such as BERT and its variants, in enhancing the accuracy of depression detection. Uban et al. and Lin et al. showcased the superior performance of deep learning models, particularly hierarchical attention networks and multimodal approaches, in identifying mental health issues from social media posts. The findings indicate that transformer models significantly outperform traditional classifiers, with studies reporting accuracies as high as 98% in detecting depression. The paper concludes by acknowledging the limitations of existing models, such as data imbalance and the need for more robust datasets, while suggesting future directions for research, including the integration of domain expertise and the exploration of hybrid neural network architectures to further improve detection efficacy.

Limitations

The study presents several limitations that may impact the generalizability and robustness of its findings. Notably, the absence of hyperparameter tuning, due to restricted access to GPU resources, could hinder the optimization of model performance, as such tuning is known to enhance evaluation results. Furthermore, the reliance on labeled datasets poses a challenge, given the often limited availability of such data. This issue has prompted the development of self-supervised learning methods to mitigate the scarcity of labels.

Additionally, the research was conducted using only a single dataset, which may limit the applicability of the results to broader contexts. Lastly, the study did not incorporate an explainable approach, leaving unanswered questions regarding the rationale behind the decisions made by the transformer-based network. These limitations suggest areas for future research, particularly in enhancing model interpretability and expanding the dataset scope.