Computational methods for the identification of suicidal ideation: a systematic review

Journal: Frontiers in Artificial Intelligence, Volume: 9
DOI: https://doi.org/10.3389/frai.2026.1704818
PMID: https://pubmed.ncbi.nlm.nih.gov/41658241
Publication Date: 2026-01-22
Author(s): Brahian Stiven Gil Arias et al.
Primary Topic: Mental Health via Writing

Overview

This paper presents a systematic literature review of computational techniques for identifying suicidal ideation in natural-language text, motivated by rising suicide rates among young people that were exacerbated by the COVID-19 pandemic. Following the PRISMA 2020 methodology, the review analyzed 25 studies and found a predominance of transformer-based models such as BERT, alongside hybrid approaches that combine convolutional neural networks (CNNs) and long short-term memory (LSTM) networks. The included studies report high values for accuracy, precision, recall, and F1-score, suggesting that these models could support early suicide prevention efforts.

However, the review identifies significant limitations, including a lack of linguistic and cultural diversity in the datasets and an over-reliance on social media data. Ethical concerns regarding privacy and consent in the use of personal data for model training are also emphasized. Future research directions should focus on enhancing model interpretability for clinical use, expanding dataset diversity, and fostering interdisciplinary collaboration. Proposed solutions include the use of synthetic data generated by large language models (LLMs) and the development of more explainable artificial intelligence (XAI) models. Overall, while computational tools show promise for automating the detection of suicidal ideation, addressing these limitations is crucial for their effective application in suicide prevention strategies globally.
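
For readers less familiar with the four metrics named in this overview, the following Python sketch shows how they are derived from a binary confusion matrix. It is an illustration only, not code from the reviewed studies: the function name `classification_metrics` and the toy labels are assumptions introduced here.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    # Count the confusion-matrix cells relative to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = text flagged as at-risk, 0 = not flagged.
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

In this toy case accuracy is 0.9 while recall is only 0.5, which is one reason studies in this area tend to report several metrics side by side rather than accuracy alone.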

Introduction

The introduction of the research paper highlights the global significance of suicide as a leading cause of death, with over 49,000 reported cases in 2022, marking a 2.6% increase from the previous year. Young individuals aged 13 to 30 are identified as particularly vulnerable due to developmental changes and socio-economic factors, with men exhibiting a suicide rate approximately four times higher than that of women. The authors emphasize the complexity of the issue, noting that impulsive actions and prolonged emotional decline can lead individuals to consider suicide, underscoring the necessity for early intervention strategies to provide timely support.

The paper discusses the potential of natural language processing (NLP) techniques to identify suicidal ideation in various text sources, including social media and personal communications. These computational methods are presented as cost-effective alternatives to traditional interventions that can detect at-risk individuals rapidly. Previous studies have reported high predictive performance for NLP models, including an area under the receiver operating characteristic curve (AUROC) of 98.6% for identifying suicidal ideation. The authors propose a systematic review to evaluate existing computational techniques for extracting suicidal ideation from natural-language text, aiming to identify performance metrics and opportunities for improving suicide prevention technologies. The introduction closes by outlining the structure of the article: methodology, results, discussion, and conclusions.
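
As a rough illustration of what an AUROC figure measures, the sketch below computes it as the fraction of positive-negative pairs in which the positive example receives the higher score, counting ties as half. The function name `auroc`, the labels, and the scores are hypothetical and are not taken from the cited studies; production code would more likely rely on an established library.

```python
def auroc(y_true, scores):
    """AUROC as the probability that a random positive example
    is scored above a random negative example (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    # Compare every positive score with every negative score.
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores for five texts.
y_true = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.1]
value = auroc(y_true, scores)
```

Because the measure depends only on how examples are ranked, it does not change when the decision threshold moves, which is why it is often reported alongside threshold-dependent metrics such as precision and recall.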

Methods

In this section, the authors describe how they selected studies against defined variables and inclusion criteria and how they synthesized the results. Results were organized systematically and presented in graphs and tables, following the PRISMA 2020 guidelines (Page et al., 2021). Key metrics, such as accuracy, were extracted, and a subgroup analysis explored differences between computational methods. However, the included studies were methodologically heterogeneous, varying in data sources, annotation schemes, definitions of suicidal ideation, computational models, and evaluation metrics. The authors therefore judged a quantitative meta-analysis inappropriate, because pooling such results would compromise their validity.

Following the Cochrane Handbook recommendations (Higgins et al., 2024), the authors opted for a narrative synthesis, as suggested by the SWiM reporting guideline (Campbell et al., 2020). This approach made it possible to compare patterns, identify methodological gaps, and highlight common trends without statistically pooling heterogeneous, non-comparable metrics. Overall, the methodology allowed a comprehensive synthesis of the available evidence while preserving the integrity of the findings.

Results

The results section analyzes the computational approaches used to detect suicidal ideation in natural-language text. It summarizes the contribution of each study, which allows comparison across methodologies, performance metrics, and technologies. The review finds that although deep learning techniques, large language models (LLMs), ensemble methods, and advanced architectures such as graph neural networks (GNNs) and generative adversarial networks (GANs) are all prevalent, no single model consistently outperforms the others across all contexts. Instead, model performance depends on factors such as dataset size, linguistic characteristics, and how explicitly suicidal ideation is expressed.

The analysis indicates that transformer-based architectures excel on larger, heterogeneous datasets, while traditional deep learning models, such as long short-term memory (LSTM) networks and convolutional neural networks (CNNs), perform well on smaller, controlled datasets. Attention mechanisms are reported to improve both accuracy and interpretability, particularly for explicit suicidal ideation. LLMs, despite their superior contextual modeling, face challenges in data requirements and interpretability that may hinder their use in real-time clinical settings. Ensemble methods achieve high classification metrics but often rely on limited datasets and lack rigorous validation, which raises concerns about their robustness. Advanced approaches, including GNNs and multimodal models, show promise but require significant computational resources, suggesting a trade-off between performance and practical feasibility in real-world applications. Beyond identifying methodological gaps, the review also integrates recent advances in artificial intelligence, offering a nuanced perspective on computational approaches to suicide prevention.
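
The interpretability claim for attention can be made concrete with a minimal sketch of scaled dot-product attention for a single query, written in plain Python. The token names and two-dimensional vectors are invented for illustration, and the reviewed models use learned, much higher-dimensional representations; the point is only that the softmax weights sum to one and can be inspected per token.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for one query vector.

    Returns the per-token weights and the weighted sum of the values.
    """
    d = len(query)
    # Similarity of the query to each key, scaled by sqrt(dimension).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(dim)]
    return weights, context


# Hypothetical token vectors; names and numbers are placeholders.
tokens = ["token_a", "token_b", "token_c"]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = keys
query = [1.0, 1.0]
weights, context = attention(query, keys, values)
```

Here `token_c` receives the largest weight because its key is most aligned with the query. Inspecting such weights is one common, though debated, way of asking which parts of an input a model attended to.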

Discussion

The review adheres to the PRISMA guidelines, which provide a structured approach to evaluating computational techniques for detecting suicidal ideation. It applied stringent eligibility criteria, restricting inclusion to high-quality, indexed research published in English or Spanish since 2018. A total of 25 studies were ultimately selected. They show a geographical concentration in Asia, particularly India, and a predominant reliance on social media data, notably from Reddit and the social network X. Reported performance varies considerably between models, with transformer-based architectures, especially BERT, reaching accuracy of up to 97.6%. Hybrid models combining convolutional neural networks (CNNs) and long short-term memory (LSTM) networks demonstrated enhanced performance, suggesting that future developments should prioritize flexible architectures over single-model solutions.

The implications of these findings are substantial for stakeholders in suicide prevention, including technology developers and mental health professionals. The effectiveness of the identified computational approaches highlights the potential for integrating these models into early warning systems, extending screening capabilities beyond traditional clinical settings. However, the review also underscores critical gaps, such as the limited representation of clinical data and the predominance of English-language datasets, which may restrict the applicability of these models across diverse cultural contexts. The concentration of research in specific regions raises questions about the global relevance of the findings, emphasizing the need for broader data sources and methodologies to enhance the robustness and inclusivity of suicide prevention tools.

Limitations

The systematic review identifies limitations at the level of both the included studies and the review process itself. A primary concern among the reviewed studies is the lack of linguistic diversity, as most models were trained on English-language datasets. This restricts the applicability of the findings across cultural and linguistic contexts: models may overfit to English-language data, and reported performance may overestimate how well they work on more diverse populations. Additionally, reliance on specific platforms such as Reddit or the social network X introduces biases in who is represented and how they express themselves, further compromising the validity of the results. Methodological flaws, such as small sample sizes, inadequate cross-validation, and unclear model structures, also hinder the comparability of approaches and the generalizability of findings.
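
One common remedy for the cross-validation weakness noted above is stratified k-fold splitting, which keeps the proportion of each class roughly constant across folds. The sketch below is a minimal pure-Python version under assumed inputs (a list of binary labels with a 20% positive rate); the function name `stratified_k_fold` is hypothetical, and the reviewed studies may have used different procedures.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs whose folds keep
    class proportions close to those of the full dataset."""
    # Group example indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    # Deal each class's shuffled indices round-robin into the folds.
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(idx
                           for j, fold in enumerate(folds) if j != i
                           for idx in fold)
        yield train_idx, test_idx


# Assumed toy dataset: 10 positive and 40 negative examples.
labels = [1] * 10 + [0] * 40
splits = list(stratified_k_fold(labels, k=5))
```

With these inputs every test fold holds 10 examples, 2 of them positive, so each fold mirrors the overall 20% positive rate. A plain random split on a small, imbalanced dataset can leave a fold with very few positives, which makes per-fold recall unstable.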

In terms of the review process, challenges arose in developing effective search strategies, necessitating multiple adjustments to capture relevant studies that did not utilize common keywords. The overall scarcity of studies focused on the automatic detection of suicidal ideation using natural language processing (NLP) compared to other research areas further complicated the review. Moreover, inconsistencies in how results were reported across studies posed difficulties in data extraction, ultimately limiting the potential for a quantitative meta-analysis of the findings.