Knowledge-based question answering using graph neural networks and contextual language representations

Journal: Scientific Reports, Volume: 16, Issue: 1
DOI: https://doi.org/10.1038/s41598-025-33854-2
PMID: https://pubmed.ncbi.nlm.nih.gov/41559152
Publication Date: 2026-01-20
Author(s): Mohamed Samir et al.
Primary Topic: Topic Modeling

Overview

This study presents a question answering (QA) framework that combines commonsense knowledge from ConceptNet with deep contextual embeddings from BERT, using a Graph Attention Network v2 (GATv2) for structured reasoning. For each question-answer pair, the framework constructs a relevant subgraph from ConceptNet and processes it to capture semantic relationships among concepts. In parallel, BERT encodes the same question-answer pair to produce a contextual language representation. The two representations are then fused into a joint embedding used for inference, so that structured knowledge and unstructured text understanding inform the same prediction.
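The fusion step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the summary does not say how the joint embedding is formed, so concatenation followed by a linear scoring head is an assumption, and `fuse`, `score`, `predict`, the toy vectors, and the weights are all hypothetical names and values.

```python
import math

def fuse(graph_vec, text_vec):
    """Joint embedding by concatenation (an assumed fusion operator)."""
    return list(graph_vec) + list(text_vec)

def score(joint_vec, weights, bias=0.0):
    """Linear scoring head: one scalar per question-answer pair."""
    return sum(w * x for w, x in zip(weights, joint_vec)) + bias

def softmax(scores):
    """Numerically stable softmax over the candidate answers."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(candidates, weights):
    """candidates: one (graph_vec, text_vec) pair per answer choice."""
    scores = [score(fuse(g, t), weights) for g, t in candidates]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs

# Toy vectors standing in for a pooled graph representation and a
# sentence-level text representation (values are made up).
candidates = [
    ([0.1, 0.2], [0.3, 0.1]),
    ([0.9, 0.4], [0.8, 0.7]),
    ([0.2, 0.1], [0.1, 0.0]),
]
weights = [1.0, 1.0, 1.0, 1.0]  # hypothetical, untrained weights
best, probs = predict(candidates, weights)
```

In a trained system the weights would be learned jointly with the graph and text encoders; here they are fixed only so the example runs on its own.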

Empirical evaluations on the CommonsenseQA and OpenBookQA datasets yield accuracies of 82.3% and 86.21%, respectively, which the authors report as outperforming existing state-of-the-art methods. These findings underscore the potential of combining knowledge graphs with language models for complex QA tasks that require commonsense reasoning. The study concludes by emphasizing the effectiveness of this hybrid reasoning framework and suggests future research on enhancing QA systems through the integration of structured and unstructured knowledge.

Methods

The methodology section describes a system that combines structured commonsense knowledge from ConceptNet with contextual information from the pre-trained language model BERT, organized within a graph neural network (GNN) architecture. The architecture consists of three main components, of which subgraph construction is a key one.
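As a rough illustration of subgraph construction, the sketch below keeps the concepts that lie within a few hops of both the question concepts and the answer concepts. The extraction rule, the hop limit `k`, the helper names, and the toy triples are all assumptions made for this example; the summary does not specify the authors' procedure, and the triples are not taken from ConceptNet.

```python
from collections import deque

# Toy triples in (head, relation, tail) form; illustrative only.
TRIPLES = [
    ("bird", "CapableOf", "fly"),
    ("bird", "HasA", "wing"),
    ("wing", "UsedFor", "fly"),
    ("fly", "RelatedTo", "sky"),
    ("fish", "CapableOf", "swim"),
    ("swim", "RelatedTo", "water"),
]

def build_adjacency(triples):
    """Undirected adjacency map over concept names."""
    adj = {}
    for head, _rel, tail in triples:
        adj.setdefault(head, set()).add(tail)
        adj.setdefault(tail, set()).add(head)
    return adj

def k_hop(adj, seeds, k):
    """Nodes reachable from any seed in at most k hops (breadth-first)."""
    dist = {s: 0 for s in seeds if s in adj}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        if dist[node] == k:
            continue  # do not expand beyond the hop limit
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return set(dist)

def extract_subgraph(triples, question_concepts, answer_concepts, k=2):
    """Keep nodes near both the question and the answer, plus the seeds."""
    adj = build_adjacency(triples)
    nodes = k_hop(adj, question_concepts, k) & k_hop(adj, answer_concepts, k)
    seeds = list(question_concepts) + list(answer_concepts)
    nodes |= {c for c in seeds if c in adj}
    edges = [t for t in triples if t[0] in nodes and t[2] in nodes]
    return nodes, edges

nodes, edges = extract_subgraph(TRIPLES, ["bird"], ["sky"], k=2)
```

With these toy triples, the unrelated `fish`/`swim`/`water` cluster is excluded because it is not reachable from either seed.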

The experimental setup covers the datasets used for evaluation, the implementation details of the proposed model, the training procedures, and the baseline models used for comparison. The experimental results section reports the system's performance on the CommonsenseQA and OpenBookQA datasets and benchmarks it against several state-of-the-art baselines. The analysis aims to clarify the contribution of each component of the architecture and to show the effectiveness of the integrated approach.

Results

The results section evaluates the proposed model on the CommonsenseQA and OpenBookQA datasets, as summarized in Table 4. The model outperforms the baseline approaches, achieving the highest accuracy with improvements of up to 6.3% on CommonsenseQA and 9.2% on OpenBookQA over the strongest knowledge graph (KG)-based models. The authors attribute this to three design elements: (1) GATv2, whose dynamic, context-aware attention mechanism captures nuanced semantic relationships within subgraphs; (2) a relevance-driven pruning strategy that sharpens each subgraph by filtering out less informative nodes; and (3) the integration of graph-based reasoning with contextual embeddings from BERT, which combines structured knowledge with rich language representations.
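The relevance-driven pruning in element (2) could look roughly like the sketch below. The scoring function is an assumption: the summary does not say how relevance is measured, so cosine similarity between each node vector and a question-answer context vector is used here as a stand-in, and `prune_nodes`, `prune_edges`, `top_k`, and all vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def prune_nodes(node_vecs, context_vec, seeds, top_k):
    """Keep the seed concepts plus the top_k most relevant other nodes."""
    scored = sorted(
        ((cosine(vec, context_vec), name)
         for name, vec in node_vecs.items() if name not in seeds),
        reverse=True,
    )
    return set(seeds) | {name for _score, name in scored[:top_k]}

def prune_edges(edges, kept):
    """Drop every edge that touches a removed node."""
    return [(h, r, t) for h, r, t in edges if h in kept and t in kept]

# Made-up 2-d vectors; a real system would use learned embeddings.
node_vecs = {
    "bird": [1.0, 0.0],
    "fly": [0.9, 0.1],
    "wing": [0.7, 0.3],
    "sky": [0.8, 0.2],
    "cage": [0.0, 1.0],
}
edges = [
    ("bird", "CapableOf", "fly"),
    ("bird", "HasA", "wing"),
    ("wing", "UsedFor", "fly"),
    ("fly", "RelatedTo", "sky"),
    ("bird", "AtLocation", "cage"),
]
context_vec = [1.0, 0.0]
kept = prune_nodes(node_vecs, context_vec, seeds={"bird", "sky"}, top_k=1)
pruned = prune_edges(edges, kept)
```

Keeping the seed concepts unconditionally is a design choice of this sketch: it guarantees that the question and answer concepts survive pruning even when their own relevance scores are low.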

The findings underscore the model’s effectiveness in merging structured commonsense knowledge with contextual language understanding, particularly in multi-hop reasoning tasks. They also point to robustness in low-resource scenarios, suggesting that external commonsense knowledge can improve generalization. Compared with the large language model GPT-3.5 in a zero-shot setting, the proposed model shows competitive performance while using a smaller architecture and a more interpretable reasoning pipeline. This comparison highlights the value of explicit knowledge integration for transparent, resource-efficient commonsense reasoning systems. Overall, the results support the model’s core design choices (dynamic graph reasoning, relevance-based subgraph filtering, and the fusion of symbolic and contextual embeddings), which together contribute to improved accuracy and robustness in commonsense reasoning tasks.

Discussion

In this section, the authors discuss advances in Question Answering (QA) within Natural Language Processing (NLP), focusing on the integration of large-scale pre-trained language models (PLMs) such as BERT, RoBERTa, and T5 with external knowledge sources such as Knowledge Graphs (KGs). While these PLMs excel at factoid questions, they struggle with commonsense reasoning because they rely on unstructured data. To strengthen reasoning, researchers have explored Graph Neural Networks (GNNs), and Graph Attention Networks (GATs) in particular, for structured reasoning over KGs. The authors highlight the limitations of existing approaches, such as static attention mechanisms and rigid graph structures, which limit flexibility across diverse reasoning tasks.
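For readers unfamiliar with the distinction between static and dynamic attention, the standard scoring functions of GAT and GATv2 are written out below. The notation ($h_i$ for node features, $W$ for the weight matrix, $a$ for the attention vector, $\Vert$ for concatenation, $\mathcal{N}(i)$ for the neighbors of node $i$) follows the usual formulation of these layers and is not taken from this summary.

```latex
% Static scoring (original GAT): the nonlinearity is applied after the attention vector a
e_{\mathrm{GAT}}(h_i, h_j) = \mathrm{LeakyReLU}\!\left(a^{\top}\,[\,W h_i \,\Vert\, W h_j\,]\right)

% Dynamic scoring (GATv2): the nonlinearity is applied before a
e_{\mathrm{GATv2}}(h_i, h_j) = a^{\top}\,\mathrm{LeakyReLU}\!\left(W\,[\,h_i \,\Vert\, h_j\,]\right)

% Both variants normalize the scores over the neighborhood of node i
\alpha_{ij} = \frac{\exp\big(e(h_i, h_j)\big)}{\sum_{k \in \mathcal{N}(i)} \exp\big(e(h_i, h_k)\big)}
```

In the static form, the ranking of neighbors is the same for every query node $i$; moving the nonlinearity inside lets the ranking depend on the query node.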

To address these challenges, the authors propose using Graph Attention Network v2 (GATv2), whose dynamic attention mechanism adapts to the context of neighboring nodes and so improves the model’s ability to perform commonsense reasoning over subgraphs derived from ConceptNet. They describe a multi-stage extraction strategy for constructing a relevant subgraph for each question-answer pair, emphasizing semantic relevance when filtering nodes. Combining GATv2 with contextualized language embeddings from BERT allows a more nuanced treatment of both structured knowledge and unstructured text, which improves performance on commonsense reasoning tasks. The proposed method is evaluated on two established benchmarks, CommonsenseQA and OpenBookQA, where it demonstrates the benefit of fusing graph-based and language-based representations in QA systems.
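To make the dynamic attention mechanism concrete, here is a dependency-free sketch of a single GATv2 attention head updating one node. It follows the standard GATv2 scoring rule rather than the authors' code; the function name, the identity weight matrices, and the toy features are assumptions chosen so the example runs without a deep learning library.

```python
import math

def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def leaky_relu(v, slope=0.2):
    return [x if x > 0 else slope * x for x in v]

def gatv2_node_update(h_i, neighbors, W_l, W_r, a):
    """One GATv2 attention head for a single target node.

    Score: e_ij = a . LeakyReLU(W_l h_i + W_r h_j), which equals
    a . LeakyReLU(W [h_i || h_j]) with W = [W_l | W_r].
    """
    left = matvec(W_l, h_i)
    projected = [matvec(W_r, h_j) for h_j in neighbors]
    scores = []
    for p in projected:
        hidden = leaky_relu([l + r for l, r in zip(left, p)])
        scores.append(sum(ai * x for ai, x in zip(a, hidden)))
    # Softmax over the neighborhood gives the attention weights.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # New node feature: attention-weighted sum of projected neighbors.
    dim = len(projected[0])
    out = [sum(al * p[d] for al, p in zip(alphas, projected)) for d in range(dim)]
    return alphas, out

identity = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical, untrained weights
alphas, out = gatv2_node_update(
    h_i=[1.0, 0.0],
    neighbors=[[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]],
    W_l=identity,
    W_r=identity,
    a=[1.0, 1.0],
)
```

In practice this layer would come from a graph learning library and operate on whole batches of subgraphs; the loop form here is only meant to show where the query node enters the score.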