Data-driven consideration of genetic disorders for global genomic newborn screening programs

Journal: Genetics in Medicine, Volume: 27, Issue: 7
DOI: https://doi.org/10.1016/j.gim.2025.101443
PMID: https://pubmed.ncbi.nlm.nih.gov/40357684
Publication Date: 2025-05-09
Author(s): Thomas Minten et al.
Primary Topic: Genomics and Rare Diseases

Overview

This paper addresses the more than 30 international studies exploring newborn sequencing (NBSeq) as a way to broaden the range of genetic disorders covered by newborn screening. The authors highlight substantial variability in gene selection among NBSeq programs, which underscores the need for a systematic method of prioritizing genes for inclusion. To address this, they compiled a dataset of 25 characteristics for each of the 4,390 genes found across 27 NBSeq programs and used regression analysis to identify predictors of gene inclusion.

The findings reveal that the number of genes analyzed in these programs varies widely, ranging from 134 to 4,299, with only 74 genes (1.7%) being included by more than 80% of the programs. Key predictors for gene inclusion were identified, notably the presence of genes on the US Recommended Uniform Screening Panel (increasing inclusion by 74.7%, CI: 71.0%-78.4%), robust evidence regarding the natural history of the associated genetic disorders (29.5%, CI: 24.6%-34.4%), and treatment efficacy (17.0%, CI: 12.3%-21.7%). Furthermore, a boosted trees machine learning model utilizing 13 predictors demonstrated high accuracy in forecasting gene inclusion, achieving an area under the curve of 0.915 and an R² of 84%. The authors conclude that their machine learning model offers a dynamic ranked list of genes, adaptable to new evidence and regional requirements, thereby facilitating more consistent and informed gene selection in NBSeq initiatives.
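The inclusion figures above amount to counting, for each gene, the fraction of programs that list it. The sketch below illustrates that calculation under stated assumptions: the program names and gene lists are toy values invented for illustration, not data from the paper, and the helper names `inclusion_share` and `widely_included` are hypothetical. The last line simply re-derives the reported 1.7% from the reported counts.

```python
from collections import Counter


def inclusion_share(program_gene_lists):
    """Return {gene: fraction of programs whose list contains the gene}."""
    n_programs = len(program_gene_lists)
    counts = Counter()
    for genes in program_gene_lists.values():
        counts.update(set(genes))  # count each gene at most once per program
    return {gene: count / n_programs for gene, count in counts.items()}


def widely_included(shares, threshold=0.8):
    """Genes included by strictly more than `threshold` of programs."""
    return sorted(gene for gene, share in shares.items() if share > threshold)


# Hypothetical toy gene lists; illustrative only, not taken from the paper.
toy_programs = {
    "program_A": ["PAH", "CFTR", "GAA", "SMN1"],
    "program_B": ["PAH", "CFTR", "GAA"],
    "program_C": ["PAH", "CFTR"],
    "program_D": ["PAH", "CFTR", "SMN1"],
    "program_E": ["PAH"],
}

shares = inclusion_share(toy_programs)
consensus = widely_included(shares)

# Re-deriving the reported share: 74 of 4,390 genes.
reported_share = 74 / 4390
print(consensus, f"{reported_share:.1%}")
```

In this toy example only one gene clears the "more than 80%" bar, because a gene listed by exactly 4 of 5 programs sits at the threshold rather than above it.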

Introduction

The introduction traces the development and significance of newborn and childhood sequencing (NBSeq), which was initiated by the BabySeq Project and aims to identify genetic disorder risks in healthy infants. With more than 700 genetic disorders now having targeted treatments or management guidelines, support is growing among stakeholders (parents, healthcare professionals, and the public) for genomic newborn screening of selected disorders. The paper highlights the ongoing efforts of at least 30 international programs exploring NBSeq, facilitated by the International Consortium on Newborn Sequencing.

The authors note the historical framework established by Wilson and Jungner for selecting disorders for public screening, emphasizing the need for early diagnosis and intervention in treatable childhood-onset disorders. However, the complexity of genomic data and the vast number of treatable disorders pose challenges in selecting appropriate genes for NBSeq. Previous studies have revealed inconsistencies in the genes analyzed across commercial and research NBSeq programs, yet the underlying reasons for these discrepancies remain unclear. To address this, the authors conducted a comparative analysis of genes selected by 27 NBSeq programs, compiling a dataset of 25 characteristics for each gene. They employed multivariate regression analysis to identify characteristics associated with gene inclusion and utilized a boosted trees model to prioritize genetic disorders for population-wide NBSeq, aiming to enhance the empirical basis for public health initiatives in this domain.
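To make the regression step more concrete, here is a minimal single-predictor sketch. It is a simplification and not the authors' multivariate model. It regresses each gene's inclusion share on one binary characteristic, in which case the ordinary least squares slope equals the difference in mean inclusion between genes with and without that characteristic. The data and the names `on_panel` and `share` are hypothetical.

```python
def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x (one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical toy data, one entry per gene (illustrative only).
# on_panel: 1 if the gene is on some reference screening panel, else 0.
# share:    fraction of programs that include the gene.
on_panel = [1, 1, 1, 0, 0, 0, 0, 0]
share = [0.9, 1.0, 0.8, 0.2, 0.1, 0.3, 0.1, 0.3]

slope, intercept = ols_slope_intercept(on_panel, share)
print(f"intercept={intercept:.2f}, slope={slope:.2f}")
```

Here the intercept is the mean inclusion share of off-panel genes and the slope is the extra inclusion associated with being on the panel. A multivariate model would estimate such an effect while holding the other characteristics fixed.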

Methods

The study used a cross-sectional design. The authors identified the gene lists of 27 NBSeq programs and assembled a dataset of 25 characteristics for each of the 4,390 genes appearing on those lists. They applied multivariate regression analysis to determine which characteristics are associated with a gene's inclusion across programs, and they built a boosted trees machine learning model on 13 of those characteristics to predict inclusion and rank genes for population-wide NBSeq.

Results

The number of genes analyzed per program ranged from 134 to 4,299, and only 74 genes (1.7%) were included by more than 80% of programs. Most of the 4,390 genes were linked to inherited metabolic, neurologic, and immunologic disorders. The strongest predictors of inclusion were presence on the US Recommended Uniform Screening Panel (74.7%, CI: 71.0%-78.4%), robust evidence on the natural history of the associated disorder (29.5%, CI: 24.6%-34.4%), and treatment efficacy (17.0%, CI: 12.3%-21.7%).

The boosted trees model with 13 predictors achieved an area under the curve of 0.915 and an R² of 84%. The number of genes a program screened was also positively correlated with its percentage of positive results.

Discussion

The discussion highlights the complexities of, and advances in, newborn sequencing (NBSeq) programs and emphasizes the need for careful gene selection to support early diagnosis of genetic disorders. The study used a cross-sectional design, identifying and analyzing gene lists from 27 NBSeq programs, and found substantial variability in gene inclusion. In total, 4,390 genes were identified across these programs, most of them linked to inherited metabolic, neurologic, and immunologic disorders. The findings indicate a positive correlation between the number of genes screened and the percentage of positive results, suggesting that broader screening may improve detection rates for treatable conditions.

A machine learning model was developed to predict gene inclusion based on 13 selected characteristics, demonstrating high accuracy with an area under the curve of 0.915. This model not only ranks genes for potential inclusion in public health screening but also highlights genes that, despite being overlooked, possess favorable characteristics for inclusion. The study underscores the importance of integrating expert recommendations and evolving knowledge about gene-disease relationships to refine gene selection processes. Limitations such as dynamic gene lists and missing data were acknowledged, with future research aimed at enhancing the model’s adaptability and incorporating broader perspectives from stakeholders. Overall, this work contributes to the ongoing discourse on optimizing NBSeq for population-wide implementation.
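The sketch below illustrates how a scored gene list can be evaluated and used, assuming model scores are already available. It does not train a boosted trees model. It computes the area under the ROC curve as the probability that a randomly chosen included gene outscores a randomly chosen non-included one, ranks genes by score, and flags high-scoring genes that are not widely included. The gene labels, scores, outcomes, and the 0.5 cutoff are all hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count as 0.5."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rank_genes(score_by_gene):
    """Genes ordered by descending score, with ties broken by name."""
    return sorted(score_by_gene, key=lambda g: (-score_by_gene[g], g))


# Hypothetical scores and outcomes (1 = widely included); illustrative only.
score_by_gene = {"GENE_A": 0.95, "GENE_B": 0.80, "GENE_C": 0.60,
                 "GENE_D": 0.40, "GENE_E": 0.30, "GENE_F": 0.10}
included = {"GENE_A": 1, "GENE_B": 1, "GENE_C": 0,
            "GENE_D": 1, "GENE_E": 0, "GENE_F": 0}

genes = list(score_by_gene)
auc = roc_auc([included[g] for g in genes], [score_by_gene[g] for g in genes])
ranking = rank_genes(score_by_gene)

# High-scoring genes that are not widely included: candidates for review.
overlooked = [g for g in ranking if included[g] == 0 and score_by_gene[g] >= 0.5]
print(f"AUC={auc:.3f}", ranking, overlooked)
```

The pairwise AUC definition is quadratic in the number of genes, which is acceptable for a few thousand genes. A rank-based formula would give the same value more efficiently.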