Data Poisoning Vulnerabilities Across Health Care Artificial Intelligence Architectures: Analytical Security Framework and Defense Strategies

Journal: Journal of Medical Internet Research, Volume: 28
DOI: https://doi.org/10.2196/87969
PMID: https://pubmed.ncbi.nlm.nih.gov/41575020
Publication Date: 2026-01-23
Author(s): Farhad Abtahi et al.
Primary Topic: Adversarial Robustness in Machine Learning

Overview

The paper highlights significant data poisoning vulnerabilities in healthcare artificial intelligence (AI) systems that current defenses and regulatory frameworks fail to address adequately. A comprehensive threat analysis identifies eight attack scenarios across four categories, including vulnerabilities in convolutional neural networks, large language models, and reinforcement learning agents, as well as infrastructure exploitation through federated learning and medical documentation systems. The findings indicate that adversaries can compromise healthcare AI with as few as 100 to 500 poisoned samples and achieve attack success rates exceeding 60%, while detection may take 6 to 12 months or may never occur.

The paper emphasizes that existing legal protections, such as the Health Insurance Portability and Accountability Act (HIPAA) and the General Data Protection Regulation (GDPR), paradoxically provide cover for attackers by hindering detection efforts. Additionally, supply chain vulnerabilities allow a single compromised vendor to affect multiple institutions simultaneously. The authors propose multi-layered defenses, including mandatory adversarial testing, ensemble disagreement detection architectures like MEDLEY, and international coordination on healthcare AI security standards. They also question the appropriateness of current black-box AI architectures for critical clinical decisions, advocating for a shift towards interpretable systems that prioritize verifiable safety over mere performance.
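The idea behind ensemble disagreement detection can be sketched in a few lines. The code below is an illustrative assumption, not MEDLEY's actual implementation: the function names, the label strings, and the 0.2 review threshold are all hypothetical. The intuition assumed here is that independently built models are unlikely to share the same poisoned behavior, so a split vote on an input is a signal to route the case to a human reviewer.

```python
from collections import Counter
from typing import Sequence


def disagreement(labels: Sequence[str]) -> float:
    """Fraction of ensemble members whose label differs from the majority label."""
    if not labels:
        raise ValueError("need at least one prediction")
    # most_common(1) returns [(label, count)] for the most frequent label.
    majority_count = Counter(labels).most_common(1)[0][1]
    return 1.0 - majority_count / len(labels)


def needs_review(labels: Sequence[str], threshold: float = 0.2) -> bool:
    """Flag a case for human review when disagreement exceeds the threshold.

    The threshold is a hypothetical value chosen for illustration only.
    """
    return disagreement(labels) > threshold


if __name__ == "__main__":
    unanimous = ["benign", "benign", "benign", "benign"]
    split = ["malignant", "benign", "benign", "benign"]
    print(needs_review(unanimous), needs_review(split))
```

In practice the threshold would have to be tuned against clinical workload, since every flagged case costs reviewer time.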

Introduction

The introduction outlines a hypothetical yet plausible scenario in which a hospital’s radiology AI fails to detect early-stage lung cancers in patients from specific ethnic backgrounds because approximately 250 poisoned samples were introduced into its training data. The scenario reflects real vulnerabilities in healthcare AI systems and highlights the potential for demographically targeted attacks that go undetected for extended periods, resulting in delayed diagnoses and worse treatment outcomes. Recent studies indicate that the success of such poisoning attacks depends on the absolute number of poisoned samples rather than on their proportion within the dataset, which fundamentally alters the understanding of poisoning threat models.
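A small arithmetic sketch shows why the absolute-count finding matters. Only the figure of roughly 250 poisoned samples comes from the text; the three training-set sizes are hypothetical and chosen purely for illustration. If a fixed count is enough, the poisoned share shrinks as the dataset grows, so enlarging the dataset does not by itself dilute the attack.

```python
# Approximate count taken from the scenario described in the text.
POISONED_SAMPLES = 250


def poison_fraction(poisoned: int, dataset_size: int) -> float:
    """Return the share of the training set that is poisoned."""
    if dataset_size <= 0 or not 0 <= poisoned <= dataset_size:
        raise ValueError("need dataset_size > 0 and 0 <= poisoned <= dataset_size")
    return poisoned / dataset_size


if __name__ == "__main__":
    # Hypothetical dataset sizes: the same 250 samples become an ever
    # smaller fraction of the data as the training set grows.
    for size in (10_000, 100_000, 1_000_000):
        share = poison_fraction(POISONED_SAMPLES, size)
        print(f"{size:>9,} samples -> {share:.4%} poisoned")
```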

Healthcare AI technologies, including large language models (LLMs) for clinical documentation and convolutional neural networks (CNNs) for medical imaging, are being deployed rapidly without adequate security evaluations. These systems are increasingly responsible for critical decisions affecting patient care, yet their susceptibility to adversarial attacks remains underexplored. The article aims to critically analyze the structural vulnerabilities inherent in healthcare AI architectures, particularly how certain deployment methods, like federated learning, may exacerbate these risks. It also critiques existing regulatory frameworks and security testing protocols, ultimately questioning the adequacy of current AI architectures for the high-stakes environment of healthcare.
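One commonly cited reason federated learning can widen the attack surface is that the server aggregates model updates it cannot inspect against the underlying patient data. The sketch below is an assumption-laden illustration, not the article's experiment: it uses an equal-weight, coordinate-wise mean as a stand-in for federated averaging, and the two-parameter updates and the single malicious site are hypothetical values.

```python
from statistics import fmean


def federated_average(updates: list[list[float]]) -> list[float]:
    """Equal-weight, coordinate-wise mean of client model updates."""
    if not updates:
        raise ValueError("need at least one client update")
    width = len(updates[0])
    if any(len(update) != width for update in updates):
        raise ValueError("all updates must have the same length")
    # zip(*updates) groups the i-th parameter of every client together.
    return [fmean(column) for column in zip(*updates)]


if __name__ == "__main__":
    honest = [[0.1, -0.2]] * 4   # four hypothetical honest hospitals
    malicious = [[5.0, 5.0]]     # one hypothetical compromised site
    print(federated_average(honest))
    print(federated_average(honest + malicious))
```

With plain averaging, one out of five participants is enough to move both parameters far from the honest consensus, and the server sees only the update, never the poisoned records behind it.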

Methods

The authors outline their methodology for assessing the potential patient safety impacts of diagnostic AI and clinical decision support LLMs. They employed a scenario-based analysis that integrates empirical attack success rates with clinical outcome data. For diagnostic AI, the analysis focused on the consequences of systematic false negatives, particularly in life-threatening conditions such as early-stage cancers, using published survival statistics based on the timing of diagnosis. For clinical decision support LLMs, the study evaluated the consequences of inappropriate medication recommendations, underdosing in pain management, and unnecessary invasive procedures.

The authors adopted conservative assumptions to favor lower-bound harm estimates, including the notion that backdoor triggers would affect only specific patient groups rather than entire populations, and that attacks would degrade decision quality rather than result in total system failure. They also considered the possibility of clinical near-misses where downstream safeguards might prevent harm and assumed that detection of issues would occur within 12-24 months through epidemiological monitoring. Additionally, the methodology accounted for cascading effects in agentic AI systems, where a single compromised decision could lead to a series of suboptimal actions impacting multiple patients. Scenarios modeled included systematic appointment-scheduling delays, resource allocation failures in intensive care units, and medication management errors, which collectively informed the assessment of relative risk across various healthcare AI deployment contexts.
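The conservative assumptions above can be read as multiplicative discounts on a harm estimate. The following sketch is a hypothetical illustration of that reasoning, not the authors' actual model: the `Scenario` fields, the formula, and every number except the 60% attack success rate (the text reports rates exceeding that figure) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    patients_screened: int       # hypothetical patient volume
    targeted_share: float        # hypothetical share in the targeted subgroup
    attack_success_rate: float   # the text reports rates exceeding 0.60
    safeguard_catch_rate: float  # hypothetical share of near-misses caught downstream


def lower_bound_harm(s: Scenario) -> float:
    """Expected number of patients harmed under the conservative assumptions."""
    for rate in (s.targeted_share, s.attack_success_rate, s.safeguard_catch_rate):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rates must lie in [0, 1]")
    # Only the targeted subgroup is exposed, only successful attacks matter,
    # and downstream safeguards intercept a share of the resulting errors.
    return (s.patients_screened * s.targeted_share
            * s.attack_success_rate * (1.0 - s.safeguard_catch_rate))


if __name__ == "__main__":
    example = Scenario(10_000, 0.10, 0.60, 0.50)
    print(lower_bound_harm(example))
```

Each assumption shrinks the estimate, which is why the result is best read as a floor rather than a forecast.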

Results

In this section, the authors systematically evaluate data poisoning vulnerabilities by analyzing eight distinct attack scenarios categorized into four groups: architecture-specific attacks, infrastructure exploitation attacks, critical resource allocation system attacks, and supply chain attacks. Each scenario is informed by empirical security research and tailored to reflect realistic threats within healthcare systems, highlighting the technical feasibility of various attack methods. The analysis reveals significant security gaps in current healthcare AI deployment practices, as illustrated in Table 1.

The findings underscore the diverse attack surfaces and the complexity of detection challenges posed by different threat actors. The distributed nature of healthcare data infrastructure, depicted in Figure 1, further exacerbates these vulnerabilities by providing multiple attack vectors within the ecosystem. Overall, the research emphasizes the urgent need for enhanced security measures to protect against these multifaceted threats in healthcare AI applications.

Discussion

The discussion section presents a comprehensive analytical framework for assessing the security vulnerabilities of healthcare AI systems. It synthesizes empirical findings from 41 key studies published between 2019 and 2025, focusing on attack feasibility, architectural vulnerabilities, defense mechanisms, and regulatory frameworks. The analysis emphasizes the importance of realistic threat models, particularly those involving insider access and training-time data poisoning, and categorizes healthcare AI systems by their underlying neural architectures and clinical applications. The paper highlights three primary architecture types: transformer-based large language models, convolutional neural networks for medical imaging, and reinforcement learning agents for clinical workflows. The security findings for each architecture are placed in specific clinical deployment scenarios, revealing how architectural characteristics influence both the feasibility of attacks and the complexity of defenses.

The paper further constructs a detailed threat model that characterizes potential attackers’ capabilities and motivations, particularly focusing on insider threats. It identifies various attack scenarios, including data poisoning through compromised data collection systems and manipulation of feedback in reinforcement learning processes. The regulatory framework assessment reveals significant gaps in current guidelines, particularly regarding adversarial robustness and the detection of backdoored AI systems. The evaluation of defense mechanisms against data poisoning attacks identifies five primary categories, assessing their applicability to healthcare contexts and highlighting practical deployment barriers. The discussion concludes with an exploration of various attack categories, including architecture-specific attacks, infrastructure exploitation, critical resource allocation systems, and supply chain vulnerabilities, underscoring the complex interplay between healthcare AI deployment and security risks.