Computational variant predictors for pharmacogenomics: from evaluation of single alleles to assessment of adverse drug reactions to antidepressants

Journal: The Pharmacogenomics Journal, Volume: 26, Issue: 2
DOI: https://doi.org/10.1038/s41397-026-00399-0
PMID: https://pubmed.ncbi.nlm.nih.gov/41803106
Publication Date: 2026-03-09
Author(s): Jacek Hajto et al.
Primary Topic: Pharmacogenetics and Drug Metabolism

Overview

The study evaluates how well computational scoring frameworks perform in pharmacogenetics, particularly for novel variants and complex haplotypes that existing star allele annotations do not adequately cover. It assesses established predictors (CADD, FATHMM-XF, PROVEAN, MutationAssessor, SIFT, PhyloP100, APF, APF2) alongside two new tools (PharmGScore and PharmMLScore) in multiple scenarios, drawing on 541 PharmVar alleles, mutational maps for CYP2C9 and CYP2C19, and 200,642 UK Biobank exomes linked to antidepressant treatment outcomes.

The results indicate that many of the evaluated tools, particularly the ensemble frameworks, matched or exceeded the accuracy of star allele classifications, with ROC-AUC values of up to 0.85 for allele definitions and 0.95 against in vitro data, and a true positive rate (TPR) of up to 0.99 for exomes. The tools also flagged an increased risk of severe adverse events during antidepressant treatment in carriers of deleterious CYP2C19 variants, with odds ratios ranging from 1.20 to 1.35. The findings suggest that computational predictors can match the accuracy of star allele classifications while addressing their limitations, potentially improving clinical decision-making by providing continuous scores and covering previously unrecognized variants.
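To make the odds ratios quoted above concrete, the following minimal sketch computes an odds ratio and a χ² statistic from a 2x2 table of carrier status against adverse drug reaction (ADR) status. The counts are invented for illustration and are not taken from the study, and the function names are hypothetical; the paper's actual analysis pipeline is not described in this summary.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table: carriers (a with ADR, b without) vs non-carriers (c, d)."""
    return (a * d) / (b * c)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom, no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_1df(chi2):
    """Upper-tail p-value of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical counts, invented for illustration only.
a, b = 60, 940     # carriers of a high-scoring CYP2C19 variant: with / without ADR
c, d = 480, 9520   # non-carriers: with / without ADR

print("odds ratio:", round(odds_ratio(a, b, c, d), 3))
print("chi-square:", round(chi_square_2x2(a, b, c, d), 3))
print("p-value:", round(p_value_1df(chi_square_2x2(a, b, c, d)), 3))
```

With these toy counts the odds ratio falls inside the reported 1.20 to 1.35 range, but the sample is far too small for the association to reach conventional significance. Detecting effects of this size is one reason a cohort on the scale of the UK Biobank is needed.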

Introduction

The introduction highlights the ongoing challenge of linking genomic variation to drug response in precision medicine, particularly in psychiatry, where a significant proportion of patients (20-60%) do not respond to first-line therapies. Although hundreds of pharmacogenetic variants have been incorporated into clinical guidelines that improve patient outcomes and reduce adverse drug reactions (ADRs), many pharmacogenetic alleles remain rare and inadequately characterized. The traditional star allele classification system, which sorts variants into four functional levels, does not account for novel and rare variants and does not fully capture the functional diversity of proteins involved in drug metabolism.

To address these limitations, the authors evaluate computational predictors designed specifically for pharmacogenomics and compare them with the established star allele framework using a multilevel evaluation strategy. Their findings indicate that PharmGScore outperforms the other predictors in recovering star allele function when multiple SNPs are aggregated per haplotype. The study also shows that pharmacogene-dedicated scores are highly sensitive to novel variants and predict adverse drug reactions in real-world clinical data, closely mirroring the associations seen with star alleles. The work underscores the need for scalable, automated computational approaches in pharmacogenetics and offers an empirical framework for future method development.

Methods

The authors developed two frameworks, PharmGScore and PharmMLScore, to evaluate the functional impact of genetic variants. PharmGScore integrates four predictors (CADD, FATHMM-XF, MutationAssessor, and PROVEAN) by annotating single-nucleotide variants (SNVs) within 60 base pairs of exons, normalizing their scores to a 0-150 scale, and averaging them to produce a single score per variant. PharmMLScore instead uses 91 four-score ensembles, each including CADD and FATHMM-XF along with two additional predictors from dbNSFP. The ensemble scores are phred-scaled, capped, and combined with variant counts to form the input features of a feed-forward neural network, which was trained with gene-wise leave-one-out cross-validation to distinguish normal/increased function alleles from decreased/no function alleles.
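The normalize-and-average step described for PharmGScore can be sketched as follows. This is an illustration only: the summary does not say which normalization the authors used, so min-max scaling is an assumption, as are the raw score bounds in `PREDICTORS` and the function names.

```python
from statistics import mean

# Hypothetical raw score ranges per predictor and whether higher means more damaging.
# These bounds are illustrative assumptions, not values taken from the paper.
PREDICTORS = {
    "CADD": (0.0, 50.0, True),
    "FATHMM-XF": (0.0, 1.0, True),
    "MutationAssessor": (-5.0, 6.0, True),
    "PROVEAN": (-14.0, 14.0, False),  # more negative PROVEAN scores are more damaging
}

SCALE_MAX = 150.0  # upper end of the 0-150 scale described in the text


def normalize(name, raw):
    """Min-max scale one raw score to 0-150 so that higher always means more damaging."""
    low, high, higher_is_worse = PREDICTORS[name]
    clipped = min(max(raw, low), high)
    fraction = (clipped - low) / (high - low)
    if not higher_is_worse:
        fraction = 1.0 - fraction
    return fraction * SCALE_MAX


def combined_score(raw_scores):
    """Average the normalized scores of whichever predictors are available."""
    return mean(normalize(name, raw) for name, raw in raw_scores.items())


# A made-up variant sitting at the midpoint of every assumed range.
variant = {"CADD": 25.0, "FATHMM-XF": 0.5, "MutationAssessor": 0.5, "PROVEAN": 0.0}
print(combined_score(variant))  # 75.0
```

Putting every predictor on a common scale and direction before averaging keeps a predictor with a wide raw range, such as PROVEAN, from dominating the combined score.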

The performance of ten selected scoring systems was assessed using receiver operating characteristic (ROC) curves, with the area under the curve (AUC) calculated for each and confidence intervals derived by bootstrapping. The analysis contrasted reduced-activity alleles with functional ones, determined optimal decision thresholds using Youden's index, and made pairwise comparisons using one-way ANOVA and t-tests. In addition, deep mutational scanning data for CYP2C9 and CYP2C19 were categorized into damaging, medium-impact, and benign groups, and ROC curves were generated for their classification. In the whole exome sequencing (WES) analysis, UK Biobank participants were stratified by star allele-based enzyme activity, and true and false positive rates were calculated for each predictor. Clinical outcomes were analyzed by classifying participants according to their CYP2C19 score values and testing for associations with adverse drug reactions (ADRs) and pharmacological therapies using χ² tests.
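The evaluation metrics named above can be sketched in plain Python. The snippet computes the ROC-AUC, the threshold that maximizes Youden's index (J = TPR - FPR), and a percentile bootstrap confidence interval for the AUC. The toy labels and scores are made up, the function names are hypothetical, and the percentile method is an assumption, since the summary does not specify which bootstrap variant the authors used.

```python
import random


def roc_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_threshold(labels, scores):
    """Return (threshold, J) maximizing J = TPR - FPR, calling score >= threshold positive."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = (None, -1.0)
    for t in sorted(set(scores)):
        tpr = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t) / n_pos
        fpr = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t) / n_neg
        if tpr - fpr > best[1]:
            best = (t, tpr - fpr)
    return best


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Made-up toy data: 1 = decreased/no function allele, 0 = normal function allele.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [90, 80, 70, 40, 60, 30, 20, 10]
print(roc_auc(labels, scores))           # 0.9375
print(youden_threshold(labels, scores))  # (40, 0.75)
print(bootstrap_ci(labels, scores))
```

When several thresholds tie on J, as 40 and 70 do here, this sketch keeps the lowest one, which favors sensitivity over specificity. A real analysis would make that tie-breaking choice deliberately.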

Results

The authors assessed a range of computational tools for scoring pharmacogenomic variants, focusing on their performance across different levels of data complexity. The evaluation included analysis of clinical data from the UK Biobank (UKB), as illustrated in Figure 1. The findings indicate that the effectiveness of these tools varies considerably with the complexity of the data, which makes the choice of methodology important for pharmacogenomic analysis.

Discussion

The authors evaluated the functional impact of pharmacogenetic variants across eight key pharmacogenes (CYP2B6, CYP2C19, CYP2C9, CYP2D6, CYP3A5, DPYD, NUDT15, and SLCO1B1) using a dataset of single-nucleotide variants (SNVs) and a range of computational tools. They focused on how well ten variant scoring tools, particularly PharmGScore and PharmMLScore, predict the phenotypic consequences of star alleles, which are critical for understanding drug metabolism and response. PharmGScore consistently outperformed the other tools in distinguishing normal function alleles from decreased/no function alleles, achieving an area under the curve (AUC) of 0.849, and also performed robustly in predicting novel damaging variants from deep mutational scanning data.

The authors highlighted the limitations of existing pharmacogenetic testing methods, noting that even a single unannotated variant can significantly alter a drug metabolism profile. They proposed that integrating population-based biobanks with machine learning approaches could improve the interpretation of next-generation sequencing data. Their findings suggest that, while traditional star allele classifications remain useful, computational tools such as PharmGScore and PharmMLScore may provide more nuanced insight into variant effects, particularly for rare and novel variants. The study underscores the need for continued development of computational predictors to improve their clinical applicability, especially in psychiatric pharmacotherapy, where genetic variability plays a crucial role in treatment outcomes.