Leveraging FDA Labeling Documents and Large Language Model to Enhance Annotation, Profiling, and Classification of Drug Adverse Events with AskFDALabel

Journal: Drug Safety, Volume: 48, Issue: 6
DOI: https://doi.org/10.1007/s40264-025-01520-1
PMID: https://pubmed.ncbi.nlm.nih.gov/39979771
Publication Date: 2025-02-20
Author(s): Leihong Wu et al.
Primary Topic: Pharmacovigilance and Adverse Drug Reactions

Overview

The paper presents AskFDALabel, an automated framework that extracts adverse event (AE) data from FDA drug labeling documents, addressing the significant public health concern posed by drug-related AEs. Manual extraction is labor-intensive and requires specialized expertise, and frequent updates to labeling documents make manually curated annotations difficult to maintain. The framework uses a retrieval-augmented generation (RAG) approach that augments standard large language model (LLM) inference with a structured workflow of task-specific templates, database querying, and content preparation.
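The three workflow stages named above (task-specific templates, database querying, content preparation) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the in-memory `LABEL_DB`, the `TEMPLATES` wording, and the function names are hypothetical and are not taken from AskFDALabel or FDALabel. The sketch stops at prompt construction and does not call any LLM.

```python
from dataclasses import dataclass
from string import Template

# Hypothetical in-memory stand-in for a labeling database (not the FDALabel schema).
LABEL_DB = {
    "drug_a": {
        "Boxed Warning": "Hepatotoxicity: cases of acute liver failure have been reported.",
        "Adverse Reactions": "Nausea, headache, and elevated ALT were observed.",
    }
}

# Hypothetical task-specific prompt templates, keyed by task name.
TEMPLATES = {
    "dili": Template(
        "Task: classify the drug-induced liver injury concern for $drug.\n"
        "Use only the labeling excerpts below and cite the section you rely on.\n\n"
        "$context\n\nAnswer:"
    )
}


@dataclass
class Excerpt:
    section: str
    text: str


def query_database(drug, sections):
    """Database querying: fetch only the requested sections that exist for the drug."""
    record = LABEL_DB.get(drug, {})
    return [Excerpt(name, record[name]) for name in sections if name in record]


def prepare_content(excerpts):
    """Content preparation: tag each excerpt with its section so answers can cite it."""
    return "\n".join(f"[{e.section}] {e.text}" for e in excerpts)


def build_prompt(task, drug, sections):
    """Fill the task template with the retrieved, prepared content."""
    context = prepare_content(query_database(drug, sections))
    return TEMPLATES[task].substitute(drug=drug, context=context)


prompt = build_prompt("dili", "drug_a", ["Boxed Warning", "Adverse Reactions", "Overdosage"])
print(prompt)
```

Tagging each excerpt with its section name is one simple way to let a model's answer point back to the text it used, which is the kind of cited output the summary describes as helpful for manual verification.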

In benchmark evaluations, AskFDALabel achieved F1-scores of 0.978 for drug-induced liver injury (DILI), 0.931 for drug-induced cardiotoxicity (DICT), and 0.911 for AE term recognition, significantly outperforming traditional methods. Beyond high accuracy and consistency with human annotations, the framework returns cited content and detailed explanations, which facilitates manual verification. The findings suggest that AskFDALabel could transform AE annotation and drug safety research by automating labor-intensive processes, improving data extraction accuracy, and making automated outputs more reliable, ultimately contributing to better patient outcomes in pharmacovigilance.
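For readers unfamiliar with the metric, the F1-score is the harmonic mean of precision and recall. The sketch below shows the standard computation from true-positive, false-positive, and false-negative counts, and applies it to a set-based comparison of AE terms. The term lists are invented for illustration and are not data from the paper, and exact string matching is an assumption; the summary does not say how the authors matched terms.

```python
def f1_from_counts(tp, fp, fn):
    """Standard F1: harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0  # no correct predictions, so precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def term_f1(predicted, gold):
    """F1 for term recognition, treating each side as a set of exact-match terms."""
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)   # terms found by both
    fp = len(predicted - gold)   # terms predicted but not in the reference
    fn = len(gold - predicted)   # reference terms that were missed
    return f1_from_counts(tp, fp, fn)


# Invented example terms, for illustration only.
score = term_f1(
    predicted=["nausea", "headache", "rash"],
    gold=["nausea", "headache", "dizziness"],
)
print(round(score, 3))
```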

Introduction

The introduction highlights the significant public health concern posed by drug adverse events (AEs), which have been linked to over 70,000 deaths annually in the United States since 2020, as reported by the FDA Adverse Event Reporting System (FAERS). Identifying and monitoring AEs through drug safety documents, particularly FDA drug labeling documents, is crucial for regulatory agencies and researchers. These documents contain essential safety information, and tools such as FDALabel have been developed to facilitate access to over 150,000 of them, supporting extensive AE research and classification efforts.

Despite the utility of these labeling documents, manually extracting and annotating AE data is challenging: the work is labor-intensive, it must be repeated whenever labels are updated, and expert interpretations vary. Large language models (LLMs) offer a promising way to automate AE data processing and improve its accuracy. The paper presents AskFDALabel (version 2), an improved framework that runs LLMs in a secure environment and adds features such as term-of-interest recognition and a database-enhanced retrieval-augmented generation (RAG) process. The authors report experiments on drug-induced liver injury (DILI) classification, drug-induced cardiotoxicity (DICT) classification, and drug AE profiling, demonstrating the framework’s effectiveness compared to traditional manual review methods.

Discussion

The discussion highlights the advances made in the AskFDALabel framework, particularly its hybrid information retrieval approach, which combines traditional database queries with large language model (LLM) inference. This approach improves the authenticity and relevance of retrieved information, which is crucial for regulatory tasks, by ensuring that the data corresponds precisely to user queries. The framework’s modular design supports customizable templates tailored to specific tasks, such as drug-induced liver injury (DILI) classification and adverse event profiling, thereby improving the accuracy and efficiency of information retrieval.

Furthermore, the study emphasizes that the AskFDALabel framework can adapt to newer LLMs, such as Llama 3.1-70B, which supports a larger token input size and so makes it feasible to process entire labeling documents. The results show significant improvements in classification accuracy for both DILI and drug-induced cardiotoxicity (DICT), with F1-scores of 0.978 and 0.931, respectively. This performance surpasses previous models, indicating that integrating LLMs with traditional retrieval methods can yield more reliable outcomes in regulatory science. The authors suggest that future work could involve fine-tuning domain-specific LLMs to further enhance the framework’s capabilities, contingent on evolving regulatory policies regarding the use of external LLMs.
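To make the point about token input size concrete, the sketch below decides whether a labeling document can be passed to a model whole or must be split. It is an illustration only, not the authors' method: the four-characters-per-token estimate is a rough heuristic rather than a real tokenizer, the reserved-token budget is arbitrary, and the 131,072-token window used in the example reflects the commonly documented Llama 3.1 context length and should be treated as an assumption here.

```python
def estimate_tokens(text, chars_per_token=4.0):
    """Rough token estimate; a real tokenizer would give different counts."""
    return int(len(text) / chars_per_token) + 1


def plan_input(label_text, context_window, reserved_tokens=1536):
    """Return [label_text] if it fits the budget, else paragraph-aligned chunks.

    reserved_tokens is a hypothetical allowance for the prompt template and the
    model's answer. A single paragraph larger than the budget is emitted as its
    own oversized chunk; this sketch does not split inside paragraphs.
    """
    budget = context_window - reserved_tokens
    if budget <= 0:
        raise ValueError("context window too small for the reserved tokens")
    if estimate_tokens(label_text) <= budget:
        return [label_text]

    chunks, current = [], ""
    for paragraph in label_text.split("\n\n"):
        candidate = paragraph if not current else current + "\n\n" + paragraph
        if current and estimate_tokens(candidate) > budget:
            chunks.append(current)   # close the chunk before it overflows
            current = paragraph
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


# Synthetic stand-in for a labeling document: ten 400-character paragraphs.
label = "\n\n".join(["x" * 400] * 10)
whole = plan_input(label, context_window=131072)   # large window: one piece
pieces = plan_input(label, context_window=1786)    # small window: several pieces
print(len(whole), len(pieces))
```

With a large window the document is returned as one piece, so the model sees every section at once; with a small window the same text has to be divided, and anything that depends on cross-section context needs extra handling.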

Limitations

A key limitation of the current approach is its reliance on fixed, structured SQL queries to retrieve drug labeling documents. These queries ensure stability and data consistency, but they restrict adaptability to specific user inputs such as routes of administration or dosage forms. Future research may explore using large language models (LLMs) to generate SQL queries dynamically from user prompts, although this would require thorough testing to ensure reliability.
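The trade-off described above can be sketched with Python's standard `sqlite3` module. The table name, columns, and rows below are hypothetical and do not reflect the actual FDALabel database. The example keeps the query structure fixed and parameterized, and shows one conservative way to admit extra user inputs (route, dosage form) through an allow-list instead of free-form generated SQL.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE label_sections ("
    "set_id TEXT, drug_name TEXT, route TEXT, dosage_form TEXT, section TEXT, text TEXT)"
)
conn.executemany(
    "INSERT INTO label_sections VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("s1", "drug_a", "oral", "tablet", "Adverse Reactions", "Nausea, headache."),
        ("s2", "drug_a", "intravenous", "injection", "Adverse Reactions", "Injection site pain."),
        ("s3", "drug_b", "oral", "capsule", "Adverse Reactions", "Dizziness."),
    ],
)

# Allow-list of optional filters: only these fixed fragments can be appended.
OPTIONAL_FILTERS = {"route": "route = ?", "dosage_form": "dosage_form = ?"}


def fetch_sections(connection, drug_name, **filters):
    """Run a fixed-shape query; user values are only ever bound as parameters."""
    clauses, params = ["drug_name = ?"], [drug_name]
    for key, value in filters.items():
        if key not in OPTIONAL_FILTERS:
            raise ValueError(f"unsupported filter: {key}")
        clauses.append(OPTIONAL_FILTERS[key])
        params.append(value)
    sql = (
        "SELECT set_id, route, section, text FROM label_sections WHERE "
        + " AND ".join(clauses)
        + " ORDER BY set_id"
    )
    return connection.execute(sql, params).fetchall()


all_rows = fetch_sections(conn, "drug_a")
oral_rows = fetch_sections(conn, "drug_a", route="oral")
print(len(all_rows), len(oral_rows))
```

An allow-list like this sits between the two extremes in the text: it is less flexible than LLM-generated SQL, but every query that can run is known in advance, which keeps the stability that fixed queries provide.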

Moreover, fine-tuning LLMs with domain-specific data could improve their handling of regulatory-specific queries. However, these gains must be balanced against the risk of overfitting to narrow datasets and the associated computational costs. Although the study focused on publicly available drug labeling documents, the AskFDALabel framework is designed to accommodate secure, non-public data, allowing for comparisons across historical and bioequivalent documents. These findings indicate a promising avenue for applying AI to the analysis of confidential regulatory documents, such as safety reports and training materials, thereby advancing the role of AI in regulatory science.