Epileptic seizure detection from electroencephalogram signals based on 1D CNN-LSTM deep learning model using discrete wavelet transform

Journal: Scientific Reports, Volume: 15, Issue: 1
DOI: https://doi.org/10.1038/s41598-025-18479-9
PMID: https://pubmed.ncbi.nlm.nih.gov/40998985
Publication Date: 2025-09-25
Author(s): Homa Kashefi Amiri et al.
Primary Topic: EEG and Brain-Computer Interfaces

Overview

The research investigates the automatic detection of epileptic seizures, which are characterized by excessive electrical activity in the brain, from EEG signals. The methodology extracts EEG bands using the Discrete Wavelet Transform (DWT) and concatenates them into a feature vector. This vector is processed by a one-dimensional Convolutional Neural Network (CNN) to capture spatial information, followed by a Long Short-Term Memory (LSTM) layer that extracts temporal features. The model performs strongly on three datasets: TUSZ (94.32% accuracy), Bonn (97.24% accuracy), and CHB-MIT (96.94% accuracy). It outperforms traditional machine learning classifiers, with reported computational complexities of $3.07 \times 10^7$, $1.67 \times 10^6$, and $1.67 \times 10^6$, respectively.

The study concludes that the integration of CNN and LSTM networks effectively enhances the model’s ability to extract both spatial and temporal features from EEG signals, with minimal preprocessing required. Future research directions include exploring data augmentation techniques to generate more informative EEG data and evaluating other advanced deep learning architectures originally designed for different domains, such as image and audio processing, for their applicability in EEG signal analysis.

Methods

In this study, 10-fold cross-validation was used to balance training bias and test variance while optimizing a 1D CNN-LSTM model for EEG signal classification. The architecture was tuned by varying the number of convolutional and LSTM layers, the optimizer, the learning rate, the batch size, and the number of epochs. The optimal configuration consisted of a single LSTM layer with 200 neurons, the Adam optimizer with a learning rate of 0.0001, a batch size of 60, and 300 epochs, with a validation split of 0.15. In each fold, 10% of the data is held out for testing and 15% of the remaining 90% is used for validation, so 76.5% of the data goes to training, 13.5% to validation, and 10% to testing. This setup supports a robust evaluation of model performance, mitigating overfitting and ensuring a diverse validation set.
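The split percentages follow directly from the fold count and the validation split. The short sketch below reproduces that arithmetic; the function name `split_fractions` is illustrative and not taken from the authors' code.

```python
def split_fractions(n_folds: int, validation_split: float) -> dict:
    """Share of the full dataset used for train/validation/test in each fold."""
    test = 1.0 / n_folds                  # one fold is held out for testing
    remaining = 1.0 - test                # the rest is available for fitting
    validation = remaining * validation_split
    train = remaining - validation
    return {"train": train, "validation": validation, "test": test}


fractions = split_fractions(n_folds=10, validation_split=0.15)
# Print the shares as percentages, rounded to one decimal place.
print({name: round(100 * share, 1) for name, share in fractions.items()})
```

With 10 folds and a validation split of 0.15, the three shares sum to 1 and match the 76.5% / 13.5% / 10% allocation described above.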

The study also included a comprehensive pre-processing phase that compared wavelet functions such as db, haar, and coif; the db1 wavelet with three decomposition levels yielded the best results. The model's performance was benchmarked against traditional CNNs and basic machine learning classifiers, using both time-domain and frequency-domain analyses. Time-domain features included statistical measures such as the mean and standard deviation, while frequency-domain analysis used Power Spectral Density (PSD) to capture dominant frequency content. The results, which show the advantages of the proposed deep learning approach over conventional methods, are detailed in the accompanying tables. The complete code and environment specifications for replication are publicly available on GitHub.
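To make the wavelet step concrete, here is a minimal sketch of a three-level db1 (Haar) decomposition whose sub-bands are concatenated into one feature vector. It is a hand-rolled illustration in plain Python, not the authors' implementation: the function names, the padding rule for odd-length inputs, the choice to concatenate every sub-band, and the synthetic test segment are all assumptions.

```python
import math


def haar_step(signal):
    """One level of the Haar (db1) DWT: returns (approximation, detail)."""
    if len(signal) % 2:
        # Assumption: pad odd-length input by repeating the last sample.
        signal = signal + [signal[-1]]
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_dwt(signal, levels=3):
    """Multi-level decomposition; returns [cA_n, cD_n, ..., cD_1]."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return [approx] + details[::-1]


def feature_vector(signal, levels=3):
    """Concatenate all sub-bands into a single flat feature vector."""
    return [coeff for band in haar_dwt(signal, levels) for coeff in band]


# Synthetic 64-sample segment standing in for one EEG window (hypothetical data).
segment = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
bands = haar_dwt(segment, levels=3)
features = feature_vector(segment, levels=3)
print([len(band) for band in bands], len(features))
```

Because the Haar transform is orthonormal, the concatenated coefficients carry the same energy as the input segment, which is a convenient sanity check. In practice a wavelet library would normally replace the hand-written loop.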

Results

The authors present experiments evaluating their proposed model for detecting epileptic EEG signals on three datasets: Bonn, CHB-MIT, and TUSZ. The model was assessed in binary and multi-class classification settings using specificity (SPF), sensitivity (SEN), accuracy (ACC), positive predictive value (PPV), negative predictive value (NPV), F1 score, and the Matthews correlation coefficient (MCC). The results indicate that the proposed model consistently outperformed traditional machine learning classifiers, including SVC, KNN, GNB, DT, and MLP, across all datasets, demonstrating its robustness and effectiveness in classifying different types of epileptic seizures.
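All of the listed metrics can be derived from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the function name and the example counts are hypothetical and are not results from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (both classes present and predicted).
    """
    sen = tp / (tp + fn)                      # sensitivity (recall)
    spf = tn / (tn + fp)                      # specificity
    acc = (tp + tn) / (tp + fp + tn + fn)     # accuracy
    ppv = tp / (tp + fp)                      # positive predictive value (precision)
    npv = tn / (tn + fn)                      # negative predictive value
    f1 = 2 * ppv * sen / (ppv + sen)          # harmonic mean of PPV and SEN
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {"SEN": sen, "SPF": spf, "ACC": acc, "PPV": ppv,
            "NPV": npv, "F1": f1, "MCC": mcc}


# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For multi-class evaluation, the same function can be applied per class in a one-vs-rest fashion, treating the class of interest as positive and all others as negative.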

The findings show overall accuracy above 91% and F1 scores ranging from 77.53% to 86.54% for the various seizure types. Notably, the model maintained high specificity and NPV. High specificity in particular is crucial for minimizing false positives in clinical settings, where even a low false-positive rate could lead to unnecessary interventions. The authors employed stratified k-fold cross-validation and class-weighted loss functions to address class imbalance, particularly in the CHB-MIT dataset, and reported high performance metrics for both seizure and non-seizure classes. Additionally, the model's reliance on the discrete wavelet transform (DWT) for feature extraction allowed it to capture temporal patterns effectively while maintaining computational efficiency, making it suitable for real-time applications. Overall, the results support the model's potential for clinical use in epilepsy diagnosis and monitoring.
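As an illustration of class weighting, the sketch below computes inverse-frequency weights and applies them in a weighted binary cross-entropy. The exact weighting scheme used by the authors is not given here, so the common "balanced" heuristic (`n_samples / (n_classes * class_count)`), the function names, and the 95:5 label ratio are all assumptions.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def weighted_bce(y_true, p_pred, weights, eps=1e-12):
    """Mean binary cross-entropy with a per-class weight on each sample."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)       # clip to avoid log(0)
        loss = -math.log(p) if y == 1 else -math.log(1.0 - p)
        total += weights[y] * loss
    return total / len(y_true)


# Hypothetical imbalanced labels: 95 non-seizure (0) and 5 seizure (1) windows.
labels = [0] * 95 + [1] * 5
weights = balanced_class_weights(labels)
loss = weighted_bce(labels, [0.5] * len(labels), weights)
print(weights, round(loss, 4))
```

With these weights each class contributes equally to the total loss, so errors on the rare seizure windows are not drowned out by the majority class.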

Discussion

Three publicly accessible EEG datasets were used to develop a robust model for epileptic seizure detection: the Bonn EEG dataset, the CHB-MIT dataset, and the TUH EEG Seizure Corpus (TUSZ). The Bonn dataset comprises single-channel EEG recordings from both healthy subjects and epilepsy patients, categorized into five subsets based on the type of EEG signal. The CHB-MIT dataset includes over 980 hours of EEG recordings from children with epilepsy, while TUSZ is the largest seizure corpus and features various seizure types annotated by experts. The proposed model employs a 1D CNN-LSTM architecture, with the Discrete Wavelet Transform (DWT) used for feature extraction, and demonstrates superior performance across all datasets.

The findings indicate that the 1D CNN-LSTM model, which combines convolutional and recurrent neural network capabilities, significantly outperforms traditional machine learning classifiers. Notably, the model's efficacy varies with dataset complexity. For instance, the CHB-MIT dataset, characterized by real-world noise and variability, requires the combined use of the DWT, CNN, and LSTM components for optimal performance. The study underscores the importance of robust noise reduction techniques when processing real-time EEG data, especially for individuals with epilepsy. Future research should focus on refining the model through real-time evaluations and exploring its adaptability to new datasets, addressing the challenges posed by individual variability and dataset characteristics.