Multi-scale EEG feature decoding with Swin Transformers for subject independent motor imagery BCIs

Journal: Scientific Reports, Volume: 16, Issue: 1
DOI: https://doi.org/10.1038/s41598-025-32207-3
PMID: https://pubmed.ncbi.nlm.nih.gov/41559106
Publication Date: 2026-01-20
Author(s): Wasi Ur Rehman Qamar et al.
Primary Topic: EEG and Brain-Computer Interfaces

Overview

Subject-independent Brain-Computer Interfaces (BCIs) are hard to build because EEG signals vary strongly between subjects and are non-stationary, and both properties hinder model generalization. Differences in neural activity patterns, electrode placement, and external noise make it difficult to develop reliable BCIs without extensive recalibration. To address these challenges, the study introduces the Compact Convolutional Swin Transformer (CCST), which combines hierarchical window-based self-attention with convolutional feature extraction to capture both local electrode interactions and global temporal dependencies.

CCST’s multi-scale feature representation improves generalization across subjects, which is crucial for real-world BCI applications. Evaluated with Leave-One-Subject-Out (LOSO) cross-validation, the model reaches classification accuracies of 68.27% on BCI Competition IV 2a, 76.61% on BCI Competition IV 2b, and 71.70% on PhysioNet MI, which the authors report as state of the art. A Wilcoxon signed-rank test with Bonferroni correction shows that the improvements over benchmark models are statistically significant. CCST also uses fewer parameters and fewer FLOPs than full self-attention models, which makes it more efficient for real-time BCI applications. These findings position CCST as a scalable and effective framework for adaptive subject-independent BCIs, with potential applications in neurorehabilitation, assistive technology, and cognitive training.
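The statistical comparison mentioned above can be illustrated with a minimal sketch. It assumes paired per-subject accuracies with no zero or tied differences, and it computes an exact two-sided Wilcoxon signed-rank p-value followed by a Bonferroni adjustment. The accuracy values, the two baselines, and the function names are hypothetical; the paper's actual procedure and numbers may differ.

```python
from typing import List, Sequence


def wilcoxon_signed_rank_exact(x: Sequence[float], y: Sequence[float]) -> float:
    """Exact two-sided p-value for paired samples (no zero or tied differences)."""
    diffs = [a - b for a, b in zip(x, y)]
    abs_diffs = [abs(d) for d in diffs]
    if 0 in abs_diffs or len(set(abs_diffs)) != len(abs_diffs):
        raise ValueError("this sketch does not handle zero or tied differences")
    n = len(diffs)
    # Rank the absolute differences from 1 (smallest) to n (largest).
    order = sorted(range(n), key=lambda i: abs_diffs[i])
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    # Test statistic: sum of the ranks that belong to positive differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    # Null distribution: counts[s] = number of subsets of {1..n} with rank sum s.
    total = n * (n + 1) // 2
    counts = [1] + [0] * total
    for r in range(1, n + 1):
        for s in range(total, r - 1, -1):
            counts[s] += counts[s - r]
    lower = sum(counts[: w_plus + 1]) / 2 ** n
    upper = sum(counts[w_plus:]) / 2 ** n
    return min(1.0, 2 * min(lower, upper))


def bonferroni(p_values: Sequence[float]) -> List[float]:
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical per-subject accuracies (percent) for nine subjects.
model = [70, 65, 80, 60, 72, 68, 75, 66, 71]
baseline_a = [69, 63, 77, 56, 67, 62, 68, 58, 62]
baseline_b = [69, 67, 77, 56, 67, 62, 68, 58, 62]

p_a = wilcoxon_signed_rank_exact(model, baseline_a)
p_b = wilcoxon_signed_rank_exact(model, baseline_b)
adjusted = bonferroni([p_a, p_b])
```

The Bonferroni step is deliberately conservative: with more baselines in the comparison, each individual p-value must be smaller to remain significant after adjustment.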

Methods

The methodology section describes the experimental setup and the procedures followed. As reported elsewhere in the paper, the experiments use three public motor imagery datasets (BCI Competition IV 2a and 2b, and PhysioNet MI), evaluate the model with Leave-One-Subject-Out (LOSO) cross-validation, and compare it against benchmark models using the Wilcoxon signed-rank test with Bonferroni correction. The setup is designed to make the results reproducible and reliable.

The section also describes the sequence of operations and the protocols followed during the experiments. This systematic approach supports accurate data collection and analysis, and therefore the validity of the findings presented in the subsequent sections of the paper.

Results

The authors report the subject-independent classification performance of their model under leave-one-subject-out cross-validation. In each fold, the model is trained on data from all subjects except one, and the held-out subject is used only for testing. The reported performance metrics are aggregated across all subjects, so they reflect how well the model performs on subjects it has never seen.
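The fold construction described above can be sketched as follows. This is a minimal illustration that assumes trials are grouped in a dictionary keyed by subject ID; the data layout, the function names, and the toy trial labels are hypothetical and not taken from the paper.

```python
from typing import Dict, Iterator, List, Tuple


def loso_splits(
    trials_by_subject: Dict[str, List[str]]
) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Yield (held_out_subject, train_trials, test_trials), one fold per subject."""
    subjects = sorted(trials_by_subject)
    for held_out in subjects:
        # Training data: every trial from every subject except the held-out one.
        train = [t for s in subjects if s != held_out for t in trials_by_subject[s]]
        # Test data: only the held-out subject's trials.
        test = list(trials_by_subject[held_out])
        yield held_out, train, test


def mean_accuracy(per_subject_accuracy: Dict[str, float]) -> float:
    """Aggregate per-fold accuracies into one subject-independent score."""
    return sum(per_subject_accuracy.values()) / len(per_subject_accuracy)


# Toy example with three hypothetical subjects.
toy = {"S1": ["a1", "a2"], "S2": ["b1"], "S3": ["c1", "c2", "c3"]}
folds = list(loso_splits(toy))
```

Because no trial from the test subject ever appears in the training set, each fold measures generalization to an unseen person rather than to unseen trials of a known person.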

The authors also use t-distributed Stochastic Neighbor Embedding (t-SNE) to visualize the feature representations learned by the model across the complete dataset, which includes trials from all subjects. With the high-dimensional feature space reduced to two dimensions, the t-SNE plots show distinct clusters corresponding to the different classes, illustrating both the model’s discriminative capability and the inter-subject variability inherent in the EEG data. Figures 3a-c show these visualizations for the BCI IV 2a, BCI IV 2b, and PhysioNet MI datasets.

Discussion

The discussion traces the evolution of deep learning techniques for EEG signal processing, from single-scale convolutional neural networks (CNNs) to multi-scale CNNs (MSCNNs) and hybrid models that integrate CNNs with Transformers. Single-scale CNNs have shown competitive performance, but their fixed convolution scales limit how well they adapt to the multi-resolution nature of EEG signals. MSCNNs, such as HS-CNN, improve performance by using parallel branches with different kernel sizes to capture diverse temporal features, and achieve notable accuracies in motor imagery (MI) classification tasks. More recent Transformer-based models use self-attention to model global dependencies, although they often struggle with local feature extraction.
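The idea of parallel branches with different kernel sizes can be illustrated with a small sketch. It is not the HS-CNN or CCST architecture: the averaging kernels stand in for learned filters, and the kernel sizes, pooling choice, and function names are assumptions made for illustration.

```python
from typing import List, Sequence


def conv1d_valid(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """1D convolution (cross-correlation) without padding."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def multi_scale_features(
    signal: Sequence[float], kernel_sizes: Sequence[int] = (3, 5, 9)
) -> List[float]:
    """Run one branch per kernel size and concatenate the pooled responses."""
    features = []
    for k in kernel_sizes:
        # Placeholder averaging kernel; a real network would learn these weights.
        kernel = [1.0 / k] * k
        response = conv1d_valid(signal, kernel)
        # Global max pooling reduces each branch to a single feature.
        features.append(max(response))
    return features


# A brief transient: short kernels respond strongly, long kernels smooth it out.
spike = [0.0, 0.0, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
features = multi_scale_features(spike)
```

Each branch sees the same input at a different temporal scale, so the concatenated vector carries information that no single fixed kernel size would capture on its own.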

The proposed Compact Convolutional Swin Transformer (CCST) addresses these limitations by combining the strength of CNNs in local feature extraction with the global attention capabilities of the Swin Transformer. Its hierarchical architecture captures multi-scale spatiotemporal dependencies, and its window-based self-attention restricts attention to local windows rather than to all token pairs, which keeps the computational cost low. This lower cost makes the model suitable for real-time applications. Empirical evaluations on benchmark datasets, including BCI Competition IV and PhysioNet, show that CCST achieves better subject-independent performance than existing models, which underlines its potential to generalize across diverse individuals in EEG-based brain-computer interface applications.
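The efficiency argument for window-based self-attention can be made concrete by counting attention scores. The sketch below assumes non-overlapping windows that divide the sequence evenly; the token count and window size are hypothetical and do not come from the paper.

```python
def full_attention_pairs(num_tokens: int) -> int:
    """Every token attends to every token: N^2 attention scores."""
    return num_tokens ** 2


def window_attention_pairs(num_tokens: int, window_size: int) -> int:
    """Tokens attend only within their own window: (N / w) * w^2 = N * w scores."""
    if num_tokens % window_size != 0:
        raise ValueError("window_size must divide num_tokens in this sketch")
    num_windows = num_tokens // window_size
    return num_windows * window_size ** 2


# Hypothetical sizes: 256 tokens split into windows of 8 tokens each.
full = full_attention_pairs(256)
windowed = window_attention_pairs(256, 8)
reduction = full // windowed
```

For a fixed window size, the windowed count grows linearly with the number of tokens while the full count grows quadratically, which is the source of the FLOP savings relative to full self-attention.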