BERT-based language model for accurate drug adverse event extraction from social media: implementation, evaluation, and contributions to pharmacovigilance practices

Journal: Frontiers in Public Health, Volume: 12
DOI: https://doi.org/10.3389/fpubh.2024.1392180
PMID: https://pubmed.ncbi.nlm.nih.gov/38716250
Publication Date: 2024-04-23
Author(s): Fan Dong et al.
Primary Topic: Pharmacovigilance and Adverse Drug Reactions

Overview

The paper addresses the challenge of accurately extracting drug-related adverse events from social media platforms, which are increasingly recognized as valuable sources of health-related information for drug safety surveillance. The authors developed a language model based on Bidirectional Encoder Representations from Transformers (BERT) to identify these adverse events, using labeled data from ADE-Corpus-V2. Key hyperparameters, including the number of training epochs, batch size, and learning rate, were optimized during model construction.

The model underwent ten hold-out evaluations on ADE-Corpus-V2 and achieved average F1 scores of 0.8575, 0.9049, and 0.9813 for words at the beginning of an adverse event (B-AE), words inside an adverse event (I-AE), and words outside any adverse event (O), respectively. External validation on human-labeled tweets from the SMM4H dataset yielded F1 scores of 0.8127, 0.8068, and 0.9790 for the same three classes. The study illustrates the potential of BERT-based models for pharmacovigilance and emphasizes the need for comprehensive evaluation when working with social media data.

Introduction

The introduction emphasizes the dual importance of efficacy and safety in drug development: a drug’s therapeutic effectiveness is critical, but its safety profile matters just as much. Responsibility for monitoring drug safety extends beyond regulatory approval and requires continuous pharmacovigilance to identify and mitigate adverse effects. Traditional methods, such as the FDA’s Adverse Event Reporting System (FAERS), provide structured data for post-market safety surveillance but have limitations, including limited demographic diversity and delayed availability of results.

In contrast, social media platforms have emerged as valuable, real-time sources of patient-generated data that can complement traditional surveillance methods. The paper discusses various studies that utilize social media to identify adverse events, employing advanced computational techniques like natural language processing (NLP) to analyze user-generated content. However, the integration of social media data into pharmacovigilance presents challenges, particularly concerning data quality, analysis, and regulatory compliance. The authors propose a comprehensive model for extracting adverse events from social media, aiming to enhance drug safety monitoring by addressing existing limitations and leveraging the potential of real-time data.

Methods

The “Methods” section describes how the model was built and evaluated. The authors fine-tuned a pretrained, uncased BERT model as a word-level classifier that assigns each word one of three labels: B-AE (beginning of an adverse event), I-AE (inside an adverse event), or O (outside an adverse event). Training data came from ADE-Corpus-V2, whose adverse event annotations are drawn from medical literature. The text was tokenized and encoded before fine-tuning, and the number of training epochs, batch size, and learning rate were tuned.

Evaluation had two parts. Internally, 80% of ADE-Corpus-V2 was used for training and the remaining 20% was held out for evaluation; this split was repeated ten times and the F1 scores were averaged. Externally, the trained model was applied to human-labeled tweets from the SMM4H dataset, and its predictions were compared with the human annotations using per-class F1 scores and a confusion matrix.

Results

The external evaluation of the BERT-based model for adverse event extraction on the SMM4H dataset yielded encouraging results, with F1 scores of 0.8127 for the B-AE class (beginning of an adverse event), 0.8068 for the I-AE class (inside an adverse event), and 0.9790 for the O class (outside an adverse event). These metrics indicate the model’s proficiency in accurately detecting adverse event-related information in tweets. A confusion matrix further elucidates the model’s classification performance, highlighting the number of true positives and instances of misclassification across the different classes.
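To make the three labels concrete, the sketch below tags a whitespace-split sentence given an annotated adverse event phrase. The example sentence, the helper name `bio_tags`, and the whitespace tokenization are illustrative assumptions; the paper's own preprocessing relies on BERT's tokenizer and is not reproduced here.

```python
def bio_tags(tokens, ae_phrases):
    """Label each token B-AE, I-AE, or O given annotated adverse event phrases.

    Hypothetical helper for illustration only; matching is case-insensitive
    and works on whole whitespace-separated tokens.
    """
    tags = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    for phrase in ae_phrases:
        words = phrase.lower().split()
        n = len(words)
        if n == 0:
            continue
        for start in range(len(tokens) - n + 1):
            window_is_free = all(t == "O" for t in tags[start:start + n])
            if lowered[start:start + n] == words and window_is_free:
                tags[start] = "B-AE"              # first word of the event
                for i in range(start + 1, start + n):
                    tags[i] = "I-AE"              # remaining words of the event
    return tags


tokens = "this drug gave me a frontal headache".split()
for token, tag in zip(tokens, bio_tags(tokens, ["frontal headache"])):
    print(f"{token}\t{tag}")
```

Here "frontal" receives B-AE, "headache" receives I-AE, and every other word receives O, which is why the O class dominates the counts in any such dataset.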

In detail, the model correctly identified 2,888 instances of the O class but misclassified 9 as B-AE and 16 as I-AE. For the B-AE class, it achieved 102 true positives while misclassifying 25 as O and 8 as I-AE. For the I-AE class, it achieved 215 true positives, with 74 misclassified as O and 5 as B-AE. Additionally, a comparative analysis in Table 2 sets the model’s output against the human-labeled data, showing cases of exact recognition, missed detection, and partial recognition of adverse events. For instance, the model accurately identified “frontal headache,” failed to detect “sick,” and recognized only “rapid cycling” out of a longer annotated phrase. It also occasionally identified adverse events that the human annotators had not labeled, such as “going crazy.”
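The reported F1 scores can be recomputed from these counts. The sketch below assumes that each set of counts describes one human-labeled class (rows) and how the model labeled its instances (columns), which is how the text reads; the function and variable names are illustrative.

```python
LABELS = ["O", "B-AE", "I-AE"]

# Rows: human label. Columns: model prediction. Counts as reported in the text.
CONFUSION = {
    "O":    {"O": 2888, "B-AE": 9,   "I-AE": 16},
    "B-AE": {"O": 25,   "B-AE": 102, "I-AE": 8},
    "I-AE": {"O": 74,   "B-AE": 5,   "I-AE": 215},
}


def per_class_scores(confusion, labels):
    """Return {label: (precision, recall, f1)} from a nested-dict confusion matrix."""
    scores = {}
    for label in labels:
        tp = confusion[label][label]
        # False negatives: this class's instances predicted as another class.
        fn = sum(confusion[label][pred] for pred in labels if pred != label)
        # False positives: other classes' instances predicted as this class.
        fp = sum(confusion[true][label] for true in labels if true != label)
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        scores[label] = (precision, recall, f1)
    return scores


for label, (p, r, f1) in per_class_scores(CONFUSION, LABELS).items():
    print(f"{label}: precision={p:.4f} recall={r:.4f} F1={f1:.4f}")
```

Under that reading the F1 values come out as 0.9790 (O), 0.8127 (B-AE), and 0.8068 (I-AE), matching the reported scores. The breakdown also shows that most errors on both adverse event classes are misses (instances labeled O) rather than confusion between B-AE and I-AE.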

Discussion

In this study, a BERT-based language model was developed to extract adverse events from social media data, specifically tweets. The model was trained on the ADE-Corpus-V2 dataset, which contains annotated adverse event terms from medical literature, and was evaluated externally on a curated dataset of tweets from the SMM4H initiative. Training followed a systematic sequence of tokenization, encoding, and fine-tuning, with 80% of the ADE-Corpus-V2 data used for training and 20% for internal evaluation, repeated across ten iterations to ensure reliability. In the internal evaluations, the model achieved average F1 scores of 0.8575 for the beginning of an adverse event (B-AE), 0.9049 for the inside of an adverse event (I-AE), and 0.9813 for words outside adverse events (O).

The study highlights the challenges of processing unstructured data, particularly in pharmacovigilance, where data quality is paramount. The bert-base-uncased model was chosen for its balance between performance and computational efficiency, making it suitable for extracting adverse events from social media. The findings underscore the importance of hyperparameter tuning, including batch size and learning rate, for model performance. The authors acknowledge ongoing challenges in generalizing models across diverse social media platforms and emphasize the need for efficient real-time processing of adverse events, which remains computationally intensive.
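The repeated 80/20 evaluation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `repeated_holdout`, the random shuffling, the seed, and the placeholder of 1,000 items are not taken from the paper, which may have drawn its splits differently.

```python
import random


def repeated_holdout(n_items, n_repeats=10, train_fraction=0.8, seed=0):
    """Return n_repeats (train_indices, eval_indices) pairs from random shuffles.

    Hypothetical helper: each repeat reshuffles all indices and cuts them
    into a training part and a held-out evaluation part.
    """
    rng = random.Random(seed)
    cut = round(n_items * train_fraction)
    splits = []
    for _ in range(n_repeats):
        shuffled = list(range(n_items))
        rng.shuffle(shuffled)
        splits.append((shuffled[:cut], shuffled[cut:]))
    return splits


# 1000 is a placeholder size, not the number of sentences in ADE-Corpus-V2.
splits = repeated_holdout(n_items=1000)
for run, (train_idx, eval_idx) in enumerate(splits, start=1):
    print(f"run {run}: {len(train_idx)} training items, {len(eval_idx)} held-out items")
```

In each run a model would be fine-tuned on the training indices and scored on the held-out indices; averaging the per-class F1 over the ten runs gives figures of the kind reported for the internal evaluation.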