A scoping review of natural language processing for detecting work-related stress among health professionals

Journal: Discover Computing, Volume: 29, Issue: 1
DOI: https://doi.org/10.1007/s10791-025-09886-7
Publication Date: 2026-01-08
Author(s): Catherine Ikae et al.
Primary Topic: Mental Health via Writing

Overview

The global shortage of health professionals, projected to exceed 14 million by 2030, highlights an urgent need for innovative healthcare solutions. The shortage is exacerbated by societal aging, advances in medical technology, and rising workloads, and was further intensified by the COVID-19 pandemic. Health professionals face significant work-related stress, and studies indicate higher suicide rates than in other professions. Efforts to detect stress early and automatically are still nascent, hindered by a scarcity of labeled text datasets specific to health professionals. Developing a dedicated text dataset is therefore essential. It requires careful attention to privacy, ethical, legal, and governance standards, and it must weigh the invasiveness of data collection against the accuracy and reliability of the resulting data.

The literature reveals critical research gaps, including the lack of both standardized datasets for stress detection in healthcare and validated benchmarks for evaluating natural language processing (NLP) models. Future studies should evaluate NLP tools prospectively in real-world settings and address data privacy, interpretability, and user acceptance. Integrating NLP tools into existing workflows could support real-time stress monitoring, provided the focus is on timely detection rather than continuous surveillance. Analyzing text from electronic health records, communication platforms, and staff feedback could uncover stress patterns and inform early intervention systems and tailored support programs. Health professionals, data scientists, and policymakers need to collaborate on data privacy and system interoperability to foster a healthier work environment through proactive NLP applications.

Methods

The reviewed studies extract meaningful patterns from unstructured text using a range of computational techniques. Four of the six studies combine Natural Language Processing (NLP) with Latent Dirichlet Allocation (LDA) for thematic analysis. Kim et al. use NLP for data extraction and qualitative analysis without LDA. Uronen et al. use text mining that incorporates semi-supervised learning with Word2vec, together with statistical analysis in SAS, to identify psychosocial risk factors. The Natural Language Toolkit (NLTK) serves different purposes across the studies, from basic text preprocessing to more advanced analysis, which shows its versatility across stages of the analysis pipeline.
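To make the preprocessing-plus-LDA pipeline concrete, the following is a minimal sketch written with the Python standard library only. It is not the implementation used in any of the reviewed studies: the toy posts, the stop-word list, the function names `preprocess` and `fit_lda`, and all parameter values are assumptions chosen for illustration, and the sampler is a plain collapsed Gibbs sampler rather than the inference scheme of any particular toolkit.

```python
import random
import re
from collections import Counter

# Hypothetical toy posts, invented for illustration only (not study data).
DOCS = [
    "Night shift again, workload is heavy and staffing is short",
    "Short staffing and heavy workload on every shift this week",
    "Conflict with a colleague during handover, communication broke down",
    "Poor communication between colleague teams caused conflict at handover",
]
# Tiny illustrative stop-word list; a real study would use a fuller one.
STOPWORDS = {"a", "and", "at", "is", "on", "the", "this", "with",
             "every", "again", "during", "between"}


def preprocess(text):
    """Lowercase, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def fit_lda(docs, n_topics=2, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenised documents.

    Returns (doc_topic, topic_word, vocab), where doc_topic[d][k] counts the
    tokens of document d assigned to topic k and topic_word[k][w] counts the
    occurrences of word w assigned to topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word, vocab


tokenised = [preprocess(d) for d in DOCS]
doc_topic, topic_word, vocab = fit_lda(tokenised)
for k, counts in enumerate(topic_word):
    top = [w for w, c in counts.most_common(4) if c > 0]
    print(f"topic {k}: {top}")
```

On a corpus this small the topics are not meaningful; the point is only to show where preprocessing ends and topic inference begins, and which hyperparameters (`n_topics`, `alpha`, `beta`, `n_iter`, `seed`) a study would need to report.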

Thematic analysis is applied in different ways, ranging from inductive approaches with collaborative coding to mixed-method strategies that integrate quantitative analyses. LDA is likewise used for different purposes, from determining optimal themes to exploring relationships within datasets. The studies show that text mining and topic modeling adapt to different research contexts, such as health documentation and nursing forums. Despite advances in NLP methods, practical application remains difficult, particularly in healthcare settings, where restricted access to sensitive data and ethical considerations limit deployment. Recommendations for overcoming these challenges include specialized training, use of cloud-based resources, and interdisciplinary collaboration to integrate advanced computational methods into everyday practice.
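The summary does not say how the studies chose among candidate topic solutions. One widely used option, shown here purely as an assumption and not as the reviewed studies' method, is to score each topic's top words with UMass coherence and prefer the solution whose topics score higher. The function name `umass_coherence` and the toy documents are hypothetical.

```python
import math


def umass_coherence(top_words, docs):
    """UMass coherence of one topic.

    top_words: the topic's top words, most probable first.
    docs: tokenised reference documents (list of token lists).
    Sums log((D(w_m, w_l) + 1) / D(w_l)) over every pair where w_l is ranked
    above w_m, with D counting the documents that contain the given words.
    """
    doc_sets = [set(doc) for doc in docs]

    def doc_freq(*words):
        return sum(all(w in s for w in words) for s in doc_sets)

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            d_l = doc_freq(top_words[l])
            if d_l == 0:
                continue  # word never occurs in the reference corpus; skip pair
            score += math.log((doc_freq(top_words[m], top_words[l]) + 1) / d_l)
    return score


# Invented toy corpus: "workload" and "shift" co-occur, "workload" and "conflict" do not.
toy_docs = [["workload", "shift"], ["workload", "shift"], ["conflict"]]
coherent = umass_coherence(["workload", "shift"], toy_docs)
incoherent = umass_coherence(["workload", "conflict"], toy_docs)
print(coherent > incoherent)
```

Higher scores indicate top words that co-occur in more documents. In practice one would fit models with several topic counts, average the coherence over topics, and still inspect the topics by hand before settling on a solution.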

Results

This scoping review is the first to synthesize the scientific literature on applying Natural Language Processing (NLP) and text mining techniques to detect work-related stress among health professionals. Its primary objective was to identify effective processes and methods for detecting stress automatically, and thereby to support the health and well-being of this workforce.

The review is based on a limited number of articles, which in itself highlights a critical gap in the existing research. The authors organize their findings around two aspects: the data sources used and the methods employed in the reviewed studies. The small evidence base underscores the need for further investigation to expand the understanding and application of NLP in this context.

Discussion

The scoping review aimed to identify processes and methods for automatically detecting work-related stress among health professionals using Natural Language Processing (NLP) and text mining techniques. Following the Joanna Briggs Institute methodology and the PRISMA-ScR checklist, it included studies published from 2013 onward that focused specifically on health professionals and excluded other populations. It covered several study designs, including mixed-methods, qualitative, and observational studies, and sought to give a comprehensive overview of NLP applications in stress detection while highlighting the serious consequences of prolonged work-related stress, including suicidal behavior.

The findings show that work-related stressors, such as workload and interpersonal challenges, contribute to emotional and psychological strain, which can escalate into severe consequences such as suicide. The review identified a reliance on social media data for NLP applications; such data are accessible but may lack contextual relevance for healthcare-specific stressors. The authors emphasize ethical concerns about data collection and privacy, and suggest that future research explore ways to use workplace-generated texts ethically and develop a publicly available, anonymized dataset for stress detection. They advocate a balanced approach that weighs the invasiveness of data collection methods against the accuracy and reliability of the data obtained, with the aim of improving the understanding of stress among health professionals and supporting their mental well-being.
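As a rough illustration of what a first anonymization pass over workplace text might involve, the sketch below masks e-mail addresses, phone-like digit sequences, and a caller-supplied list of names. It is an assumption-laden toy, not a method from the review: the patterns, the placeholder tags, and the function name `redact` are hypothetical, and real de-identification of health-related text requires far more than pattern matching, including governance and expert review.

```python
import re

# Simplified patterns, assumed for illustration; they will miss many cases
# and over-redact others (for example, ISO dates look like phone numbers).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def redact(text, known_names=()):
    """Replace e-mail addresses, phone-like numbers, and listed names with tags."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    for name in known_names:
        # Whole-word, case-insensitive match on each supplied name.
        text = re.sub(rf"\b{re.escape(name)}\b", "[NAME]", text, flags=re.IGNORECASE)
    return text


sample = "Call Dr. Meyer at 555-123-4567 or meyer@example.org"
print(redact(sample, known_names=["Meyer"]))
```

E-mail addresses are masked before names so that a name embedded in an address does not leave a partially redacted string behind.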

Limitations

The limitations of this study stem primarily from its narrow focus on health professionals, which resulted in a limited number of included articles. The authors justify this focus on the grounds that findings from other professional contexts may not transfer directly, because terminology and language differ across professions and healthcare text therefore forms a distinct basis for natural language processing (NLP) analysis. Even so, many of the reviewed studies showed methodological weaknesses, including small sample sizes, retrospective designs, and limited generalizability, which weaken the evidence.

Additionally, while the studies generally outlined their methodological approaches, the descriptions of preprocessing and NLP model implementation often lacked sufficient detail, hindering reproducibility. This underscores the necessity for improved transparency in reporting standards within NLP-focused healthcare research, particularly regarding preprocessing steps, hyperparameters, and evaluation methods. Furthermore, the scoping review may not have encompassed all pertinent studies due to potential indexing gaps in the databases utilized and the exclusion of non-English articles. Lastly, the absence of a formal risk of bias assessment or quality appraisal aligns with the scoping review’s objectives but remains a notable limitation.
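One lightweight way to act on the call for more transparent reporting is to store preprocessing steps, hyperparameters, and evaluation settings in a machine-readable record that accompanies each analysis. The sketch below is a hypothetical template, not a reporting standard proposed by the authors: the class name `RunReport`, its fields, and every value are placeholders, and the evaluation value is deliberately left empty rather than filled with an invented result.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RunReport:
    """Hypothetical record of what a reader needs to reproduce an NLP run."""
    corpus_description: str
    preprocessing_steps: list
    model_name: str
    hyperparameters: dict
    evaluation: dict = field(default_factory=dict)
    random_seed: int = 0

    def to_json(self):
        # Sorted keys keep the file stable under version control.
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Placeholder values for illustration only; not taken from any study.
report = RunReport(
    corpus_description="toy posts, invented for illustration",
    preprocessing_steps=["lowercase", "alphabetic tokens only", "stop-word removal"],
    model_name="LDA (collapsed Gibbs sampling)",
    hyperparameters={"n_topics": 2, "alpha": 0.1, "beta": 0.01, "n_iter": 200},
    evaluation={"metric": "UMass coherence", "value": None},
    random_seed=0,
)
print(report.to_json())
```

Publishing such a record alongside a paper would address each of the three items named above (preprocessing steps, hyperparameters, and evaluation methods) without requiring the underlying sensitive text to be shared.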