Transfer learning with graph neural networks for improved molecular property prediction in the multi-fidelity setting

Journal: Nature Communications, Volume: 15, Issue: 1
DOI: https://doi.org/10.1038/s41467-024-45566-8
PMID: https://pubmed.ncbi.nlm.nih.gov/38409255
Publication Date: 2024-02-26
Author(s): David Buterez et al.
Primary Topic: Machine Learning in Materials Science

Methods

In this section, the authors outline their methodological framework, beginning with a review of transfer learning and a formal description of the problem setting. They then introduce the fundamentals of graph neural networks (GNNs), covering both standard and adaptive readouts, and describe their proposed supervised variational graph autoencoder architecture. Finally, they present the transfer learning strategies used in the study, centered on the common two-stage scheme of pre-training a neural network and then fine-tuning it, particularly in non-geometric contexts.
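To make the distinction between readouts concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a standard readout is a fixed sum over node embeddings and stands in for an adaptive readout with a softmax-attention pooling whose `query` vector plays the role of learnable parameters. The function names, the toy embeddings, and the attention form are all illustrative assumptions.

```python
import math

def sum_readout(nodes):
    """Standard readout: sum node embeddings column-wise into one graph vector."""
    return [sum(column) for column in zip(*nodes)]

def attention_readout(nodes, query):
    """Adaptive-style readout (assumed form): softmax-weighted average of nodes.

    `query` is a hypothetical stand-in for parameters that would be learned.
    """
    scores = [sum(x * q for x, q in zip(node, query)) for node in nodes]
    top = max(scores)                      # subtract the max for a stable softmax
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [sum(w * x for w, x in zip(weights, column)) for column in zip(*nodes)]

# Toy molecule: 4 atoms, each with a 3-dimensional embedding (made-up values).
nodes = [[1.0, 0.0, 2.0], [0.5, 1.5, -1.0], [0.0, 2.0, 0.5], [-1.0, 0.5, 1.0]]
query = [0.3, -0.2, 0.1]

graph_sum = sum_readout(nodes)
graph_att = attention_readout(nodes, query)
```

Both functions are invariant to the order of the atoms, which any graph-level readout needs. The difference is that the weights in the second depend on parameters that can be trained, or retrained, on new data.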

The section closes with the design of an empirical study that compares the proposed approaches against established baselines for learning from multi-fidelity data, in order to test whether they offer an advantage for transfer learning with GNNs.

Results

The results section presents a thorough empirical analysis of several transfer learning techniques that augment model inputs with information from low-fidelity simulations. The study evaluates multiple strategies: using explicit low-fidelity labels, using labels predicted by low-fidelity models, and hybrid approaches that combine both during training and inference. It also explores latent space embeddings generated by low-fidelity models and two fine-tuning strategies for GNNs. Each method is assessed against baselines trained solely on sparse high-fidelity data, with the goal of improving predictive performance.
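The label-based augmentation strategies can be illustrated with a short sketch. It is a hedged reading of the summary, not the paper's code: the low-fidelity value is appended to the molecule's feature vector, and the hybrid rule shown here (use the measured label when it exists, otherwise a predicted one) is an assumption. `augment_with_low_fidelity` and `toy_lf_model` are hypothetical names.

```python
from typing import Callable, List, Optional, Sequence

def augment_with_low_fidelity(
    features: Sequence[float],
    measured_lf: Optional[float],
    predict_lf: Callable[[Sequence[float]], float],
) -> List[float]:
    """Append one low-fidelity value to a molecular feature vector.

    Assumed hybrid rule: prefer the measured low-fidelity label and fall
    back to a model prediction when no measurement is available.
    """
    lf_value = measured_lf if measured_lf is not None else predict_lf(features)
    return [*features, lf_value]

def toy_lf_model(features: Sequence[float]) -> float:
    """Hypothetical stand-in for a trained low-fidelity predictor."""
    return sum(features) / len(features)

# Molecule with a measured low-fidelity label.
seen = augment_with_low_fidelity([0.2, 0.4, 0.6], measured_lf=1.5, predict_lf=toy_lf_model)
# Molecule without one, so the low-fidelity value has to be predicted.
unseen = augment_with_low_fidelity([0.2, 0.4, 0.6], measured_lf=None, predict_lf=toy_lf_model)
```

A high-fidelity model would then be trained on the augmented vectors. Replacing the single appended value with a low-fidelity model's latent embedding gives the embedding-based variant mentioned above.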

Key findings indicate that adaptive readouts in variational graph autoencoders (VGAEs) can effectively learn structured concepts from low-fidelity data, facilitating knowledge transfer to new tasks through fine-tuning. The study demonstrates that incorporating low-fidelity labels and representations significantly benefits predictive modeling in both transductive and inductive settings. Notably, fine-tuning strategies that retrain only the adaptive readout on high-fidelity data outperform traditional GNN fine-tuning methods. The study also highlights the advantage of integrating multiple lower-fidelity inputs, which can jointly improve downstream model performance. Taken together, the results show that several transfer learning strategies can improve model performance across fidelity levels, and they expose the limitations of standard GNN readout functions in multi-fidelity contexts.
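The idea of retraining only the readout can be sketched as follows. This is a toy illustration under strong assumptions: a fixed linear map stands in for the pre-trained graph layers, a linear layer stands in for the adaptive readout, and plain gradient descent on squared error replaces the actual training procedure. All names and numbers are hypothetical.

```python
def encode(x, encoder_weights):
    """Frozen encoder stand-in: a fixed linear map of the input features."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in encoder_weights]

def predict(x, params):
    hidden = encode(x, params["encoder"])
    return sum(w * h for w, h in zip(params["readout"], hidden))

def mse(data, params):
    return sum((predict(x, params) - y) ** 2 for x, y in data) / len(data)

def finetune_readout(data, params, lr=0.1, steps=500):
    """Gradient descent on the readout weights only; the encoder is never updated."""
    readout = list(params["readout"])      # copy so the pre-trained weights survive
    for _ in range(steps):
        grads = [0.0] * len(readout)
        for x, y in data:
            hidden = encode(x, params["encoder"])
            error = sum(w * h for w, h in zip(readout, hidden)) - y
            for i, h in enumerate(hidden):
                grads[i] += 2 * error * h / len(data)
        readout = [w - lr * g for w, g in zip(readout, grads)]
    return {"encoder": params["encoder"], "readout": readout}

# "Pre-trained" parameters and a tiny high-fidelity set (made-up values).
pretrained = {"encoder": [[1.0, 0.0], [0.5, 0.5]], "readout": [0.0, 0.0]}
high_fidelity = [([1.0, 2.0], 3.0), ([2.0, 1.0], 3.0), ([0.0, 1.0], 1.0)]

finetuned = finetune_readout(high_fidelity, pretrained)
```

Because only the readout weights change, the representation learned from abundant low-fidelity data is preserved while the output mapping adapts to the scarce high-fidelity labels.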

Discussion

In the discussion section, the paper highlights the significance of multi-fidelity approaches in molecular design, particularly in drug discovery and quantum mechanics. It emphasizes the funnel strategy, in which initial low-fidelity screens, such as high-throughput screening (HTS), filter large compound libraries to identify candidates for subsequent high-fidelity assays. HTS is reported to account for about one-third of new drug candidates, yet the integration of multi-fidelity data remains underexplored. The authors argue that leveraging diverse fidelity levels could improve predictions, reduce costs, and optimize experimental workflows, whereas current methodologies rely predominantly on single-fidelity paradigms.

The section also reviews existing literature on multi-fidelity learning, contrasting various approaches, including Gaussian processes and composite neural networks. Notably, the paper introduces a novel multi-fidelity state embedding (MFSE) algorithm that incorporates fidelity indicators into graph neural networks, although it acknowledges challenges such as the potential loss of high-fidelity information when training on imbalanced datasets. The empirical results demonstrate that adaptive readouts in graph neural networks significantly outperform standard readouts, particularly in drug discovery tasks, yielding better predictive performance and better-structured latent spaces. The discussion concludes that integrating multi-fidelity data is needed to improve molecular predictions and to inform experimental design in both drug discovery and quantum simulations.
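One way fidelity indicators might enter a model is sketched below. The summary does not give MFSE's details, so this is an assumption: an integer fidelity index selects a row of a learned embedding table, and that row is concatenated with the graph-level vector before prediction. The table values and the name `with_fidelity_state` are hypothetical.

```python
from typing import Dict, List, Sequence

# Hypothetical embedding table: one vector per fidelity level (0 = low, 1 = high).
# In a trained model these values would be learned alongside the network weights.
FIDELITY_EMBEDDINGS: Dict[int, List[float]] = {
    0: [0.1, -0.3],
    1: [0.7, 0.2],
}

def with_fidelity_state(graph_vector: Sequence[float], fidelity: int) -> List[float]:
    """Concatenate a graph-level vector with the embedding of its fidelity level."""
    return [*graph_vector, *FIDELITY_EMBEDDINGS[fidelity]]

# The same molecule representation, tagged once as low and once as high fidelity.
molecule = [0.5, 1.2, -0.4]
low_input = with_fidelity_state(molecule, fidelity=0)
high_input = with_fidelity_state(molecule, fidelity=1)
```

A single network trained on such inputs sees every data point from every fidelity level. This is also where the imbalance concern arises: if low-fidelity examples vastly outnumber high-fidelity ones, the shared weights may be dominated by the low-fidelity signal.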