Operationalization of Artificial Intelligence Applications in the Intensive Care Unit

Journal: JAMA Network Open, Volume: 8, Issue: 7
DOI: https://doi.org/10.1001/jamanetworkopen.2025.22866
PMID: https://pubmed.ncbi.nlm.nih.gov/40699572
Publication Date: 2025-07-23
Author(s): Willemijn E. M. Berkhout et al.
Primary Topic: Artificial Intelligence in Healthcare and Education

Overview

The research paper systematically evaluates the operationalization of artificial intelligence (AI) in intensive care units (ICUs), highlighting its potential to enhance clinical decision-making and patient outcomes in data-rich environments. Despite the promising applications of AI, the study reveals that practical implementation in clinical settings is still limited. A comprehensive review of 17,401 records identified 1,263 eligible studies published between July 28, 2020, and June 10, 2024. Notably, 74% of these studies were classified as having a Technology Readiness Level (TRL) of 4 or below, indicating early-stage development, while only 2% achieved clinical integration (TRL ≥ 6).
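
To make the proportions above concrete, the following is a minimal sketch that converts the quoted percentages into approximate study counts. It is an illustration only: the percentages are rounded in this summary, so the derived numbers are estimates rather than figures reported by the authors, and the constant and function names are hypothetical.

```python
# Figures quoted in the overview. The percentages are rounded in the source,
# so the derived counts below are approximations, not reported values.
RECORDS_SCREENED = 17_401
ELIGIBLE_STUDIES = 1_263
SHARE_TRL_4_OR_BELOW = 0.74  # early-stage development
SHARE_TRL_6_OR_ABOVE = 0.02  # clinical integration


def approx_count(share: float, total: int) -> int:
    """Return the nearest whole number of studies for a rounded share."""
    return round(share * total)


# Fraction of screened records that met the eligibility criteria.
inclusion_rate = ELIGIBLE_STUDIES / RECORDS_SCREENED

early_stage = approx_count(SHARE_TRL_4_OR_BELOW, ELIGIBLE_STUDIES)
integrated = approx_count(SHARE_TRL_6_OR_ABOVE, ELIGIBLE_STUDIES)

# The remainder is not broken down in this summary.
remaining_share = 1 - SHARE_TRL_4_OR_BELOW - SHARE_TRL_6_OR_ABOVE

print(f"Inclusion rate:           {inclusion_rate:.1%}")
print(f"TRL <= 4 (approx. count): {early_stage}")
print(f"TRL >= 6 (approx. count): {integrated}")
print(f"Remaining share:          {remaining_share:.0%}")
```

Under these assumptions, roughly 935 of the 1,263 eligible studies sit at TRL 4 or below and about 25 reach clinical integration, while about 7% of the screened records were included.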

The findings indicate that while there has been significant growth in AI research within intensive care medicine, primarily through retrospective studies, the transition to clinical application remains stagnant. A high risk of bias was identified in over half of the studies, primarily due to methodological shortcomings. The authors advocate for a paradigm shift in the medical literature, emphasizing the need for prospective testing and operationalization of AI applications to ensure their clinical efficacy. They suggest that living systematic reviews of AI could serve as a valuable framework for monitoring progress and addressing existing gaps in the field.

Introduction

The introduction highlights the pressing challenges faced by healthcare professionals, including increased patient loads, staff shortages, and rising costs, which threaten the delivery of high-quality care. The potential of artificial intelligence (AI) to alleviate these issues by optimizing workflows and enhancing clinical decision-making is acknowledged. However, the integration of AI into clinical practice remains limited, primarily due to a lack of external validation and prospective studies. A review indicated that only 9% of FDA-approved AI applications included prospective studies for postmarket surveillance, emphasizing the necessity for rigorous evaluation and transparent reporting to ensure responsible AI integration.

Moreover, the introduction discusses the inadequacies in workflow integration, compliance complexities, slow clinical adoption, and funding biases that hinder the implementation of existing AI models. Traditional systematic reviews often fail to keep pace with rapid advancements in AI, leading to outdated insights. In contrast, living systematic reviews (LSRs) present a dynamic alternative by continuously updating evidence and highlighting changes over time. This paper aims to provide a comprehensive evaluation of AI applications in the intensive care unit (ICU), focusing on their readiness for clinical implementation and identifying key barriers, particularly in light of recent advancements in large language models and reinforcement learning.

Methods

The review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines and was registered with the International Prospective Register of Systematic Reviews (PROSPERO) before the literature search began. It built on a previous search, using publicly available databases to update the earlier findings with newer publications, which turned it into a living systematic review (LSR) of AI in the ICU.

This methodological approach is crucial for maintaining the review’s relevance in the rapidly advancing field of artificial intelligence in healthcare. It allows for the integration of prior findings, facilitates longitudinal comparisons over time, effectively captures the progress made in the field, and identifies significant challenges that persist.

Results

The search returned 17,401 records, of which 1,263 studies published between July 28, 2020, and June 10, 2024, met the eligibility criteria. Most of this work was retrospective and at an early stage of development: 74% of the studies were classified at a Technology Readiness Level (TRL) of 4 or below, and only 2% reached clinical integration (TRL ≥ 6).

Risk of bias was rated high in more than half of the studies, mainly because of methodological shortcomings. The proportion of studies at high risk of bias has decreased, but the share with an unclear risk rating has risen, and adherence to reporting standards remains poor.

Discussion

The systematic review conducted on artificial intelligence (AI) applications in intensive care units (ICUs) reveals a significant increase in research output, particularly retrospective studies, from July 2020 to June 2024. Despite this growth, the transition from AI model development to clinical implementation remains minimal, with only a small fraction of studies achieving integration into clinical workflows. The review identified that while the proportion of studies classified with a high risk of bias has decreased, there is a concerning rise in studies with unclear risk assessments, alongside a persistent lack of adherence to reporting standards. This gap highlights systemic barriers to the effective deployment of AI in clinical settings, including insufficient external validation, slow adoption rates, and inadequate funding.

The findings underscore the urgent need for a paradigm shift towards operationalizing AI applications through prospective evaluations that ensure their safety and efficacy in real-world settings. The review advocates for the establishment of living systematic reviews (LSRs) to continuously track advancements and identify gaps in AI research, thereby facilitating a more structured approach to integrating AI into healthcare. Additionally, it emphasizes the importance of addressing issues of bias and diversity within AI datasets to promote equitable healthcare solutions. Overall, the review calls for collaborative efforts among stakeholders to enhance methodological transparency and foster responsible AI implementation in intensive care medicine.

Limitations

The study acknowledges several limitations that may affect the comprehensiveness and applicability of its findings. First, the review covers only ICU-specific AI models, so it may overlook broader clinical applications that could offer useful insights into healthcare AI, such as machine learning applied to nursing documentation, which has been reported to reduce the risk of patient deterioration. Despite this limitation, the review offers a structured overview of how AI in healthcare has evolved, using consistent search terms to highlight progress and ongoing challenges.

Second, although the review includes models based on generative AI, reinforcement learning, unsupervised learning, and natural language processing, it did not apply the Prediction Model Risk of Bias Assessment Tool (PROBAST) to those models, because PROBAST was designed for traditional prediction models rather than specifically for AI. The TRIPOD-LLM guideline, which could have provided further assessment criteria, was released after the study was completed. Third, the review does not track individual AI models through successive Technology Readiness Levels (TRLs), which could show how these models mature over time. Finally, although guidelines suggest updating living reviews every two years, this review was conducted after a four-year interval, highlighting the need for more frequent updates to keep pace with the rapid advances and persistent challenges in healthcare AI.
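
The gap between the suggested update cadence and the actual interval can be checked with simple date arithmetic. The sketch below assumes that the publication window quoted in the overview (July 28, 2020, to June 10, 2024) approximates the time between the previous search and this update; that mapping is an assumption, and the variable names are illustrative.

```python
from datetime import date

# Publication window quoted in the overview; treating it as the update
# interval is an assumption made for this illustration.
WINDOW_START = date(2020, 7, 28)
WINDOW_END = date(2024, 6, 10)
SUGGESTED_INTERVAL_YEARS = 2  # update cadence cited in the limitations

elapsed_days = (WINDOW_END - WINDOW_START).days
elapsed_years = elapsed_days / 365.25  # average year length, an approximation

# Number of suggested update points that fell strictly inside the window.
update_points_passed = int(elapsed_years // SUGGESTED_INTERVAL_YEARS)

print(f"Elapsed: {elapsed_days} days (about {elapsed_years:.1f} years)")
print(f"Two-year update points passed: {update_points_passed}")
```

On that assumption the interval is about 3.9 years, so one two-year update point fell inside the window without a published update.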