Physiological signal-based mental stress detection using hybrid deep learning models

Journal: Discover Artificial Intelligence, Volume: 5, Issue: 1
DOI: https://doi.org/10.1007/s44163-025-00412-8
Publication Date: 2025-07-23
Author(s): Nandini Modi et al.
Primary Topic: Emotion and Mood Recognition

Overview

The paper addresses the significant impact of mental stress on individual performance and health, and identifies a range of internal and external stimuli as potential stressors. It argues that mitigating these effects requires effective classification of mental stress states.

The study demonstrates that deep learning techniques can classify different mental stress states from EEG signals. It evaluates eight models: Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU), Multi-Layer Perceptrons (MLP), Autoencoders, Transformers, and a hybrid CNN-MLP model. The hybrid CNN-MLP model achieved the highest prediction accuracy, which the authors attribute to its ability to capture both temporal and spatial features of EEG signals. The models are compared on accuracy, loss, F1-score, recall, precision, and AUC-ROC, which brings out their respective strengths and limitations. The findings point to future work on more sophisticated models and larger datasets to make these techniques more practical in real-world scenarios.

Methods

The methodology for predicting mental stress using EEG brainwave signals and deep learning models involves several systematic steps. Initially, EEG data is collected and preprocessed to address feature scaling, label encoding, and missing values. Exploratory data analysis (EDA) is then conducted to identify the distributions of three mental states based on the EEG features. Following EDA, principal component analysis (PCA) is employed for feature engineering, reducing the original 988 features to 475, which are subsequently visualized through various techniques such as correlation matrices and histograms.
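The preprocessing steps named above (missing values, feature scaling, label encoding) can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors' pipeline: the helper names, the choice of mean imputation and z-score scaling, and the toy data are all assumptions, and PCA is left out.

```python
from statistics import mean, pstdev

def impute_missing(rows):
    """Replace None entries with the mean of the observed values in that column."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        col_means.append(mean(observed) if observed else 0.0)
    return [[col_means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]

def standardize(rows):
    """Scale each column to zero mean and unit variance (z-score)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(r, mus, sigmas)] for r in rows]

def encode_labels(labels):
    """Map string labels to integer codes, assigned in sorted order."""
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    return [index[name] for name in labels], classes

# Toy data (hypothetical): two features per sample, None marks a missing value.
raw = [[1.0, 10.0], [None, 14.0], [3.0, None], [5.0, 12.0]]
labels = ["normal", "highly stressed", "mildly stressed", "normal"]

features = standardize(impute_missing(raw))
codes, classes = encode_labels(labels)
```

In a real pipeline the imputation means and scaling statistics would normally be computed on the training split only and then reused on the validation split, so that no validation information leaks into training.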

The preprocessed dataset is divided into training and validation sets, with the training data utilized to develop and optimize multiple deep learning models, including CNN, LSTM, RNN, GRU, Transformer models, Autoencoders, MLP, and a hybrid CNN-MLP model. The trained models are tested to classify stress states into three categories: normal, mildly stressed, or highly stressed. Performance evaluation is conducted using metrics such as accuracy, precision, recall, F1-score, loss, and AUC-ROC, ensuring a thorough assessment of the models’ effectiveness in stress detection and classification from EEG data.
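A training/validation split of this kind might look like the sketch below. The summary reports an 80/20 ratio in the Results section but does not say whether the split was stratified by class; stratification, the function name, and the toy class counts are assumptions made here for illustration.

```python
import random
from collections import defaultdict

def stratified_split(labels, val_fraction=0.2, seed=0):
    """Return (train_idx, val_idx) so each class is split in the same proportion."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train_idx, val_idx = [], []
    for y in sorted(by_class):
        idx = by_class[y][:]
        rng.shuffle(idx)
        n_val = round(len(idx) * val_fraction)
        val_idx.extend(idx[:n_val])
        train_idx.extend(idx[n_val:])
    return sorted(train_idx), sorted(val_idx)

# Toy labels (hypothetical counts): 0 = normal, 1 = mildly stressed, 2 = highly stressed.
labels = [0] * 50 + [1] * 30 + [2] * 20
train_idx, val_idx = stratified_split(labels)
```

Splitting per class keeps the proportion of each stress level the same in both sets, which matters when one class is much rarer than the others.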

Results

In this study, eight deep learning (DL) models (CNN, MLP, RNN, LSTM, GRU, Autoencoder, Transformer, and a hybrid CNN-MLP) were evaluated for their effectiveness in classifying mental stress, using an 80/20 training-validation split of the dataset. The models were assessed on several performance metrics, including accuracy, loss, recall, precision, and F1-score, with confusion matrices used to analyze false positives (FP) and false negatives (FN). The findings underscore why misclassifications matter in mental health contexts: false negatives can delay necessary interventions, while false positives may cause users unnecessary anxiety.
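To make the FP/FN analysis concrete, the sketch below builds a three-class confusion matrix and derives one-vs-rest false positives, false negatives, precision, recall, and F1 for each class. The labels and predictions are invented toy values, not results from the paper, and the function names are illustrative.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """matrix[t][p] counts samples whose true class is t and predicted class is p."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

def per_class_report(matrix):
    """Derive one-vs-rest TP/FP/FN, precision, recall and F1 for every class."""
    n = len(matrix)
    report = []
    for c in range(n):
        tp = matrix[c][c]
        fp = sum(matrix[r][c] for r in range(n)) - tp  # predicted c, truly another class
        fn = sum(matrix[c]) - tp                       # truly c, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report.append({"tp": tp, "fp": fp, "fn": fn,
                       "precision": precision, "recall": recall, "f1": f1})
    return report

# Toy values: 0 = normal, 1 = mildly stressed, 2 = highly stressed.
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 2, 2, 2, 2, 1]

cm = confusion_matrix(y_true, y_pred, 3)
report = per_class_report(cm)
accuracy = sum(cm[c][c] for c in range(3)) / len(y_true)
```

Reading `report[2]` gives the counts for the "highly stressed" class: its `fn` entry is the number of highly stressed samples the model missed, and its `fp` entry is the number of other samples wrongly flagged as highly stressed.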

The results indicated that the hybrid CNN-MLP model outperformed the CNN model in terms of validation accuracy (99.40% vs. 99.19%) and AUC-ROC (0.9999 vs. 0.9995), despite a slightly higher number of misclassifications. The hybrid model demonstrated a more balanced error distribution across stress levels, particularly reducing false positives in the “Highly Stressed” category. While LSTM and Transformer models showed high training accuracy, they suffered from overfitting, resulting in lower validation performance. Overall, the CNN and hybrid CNN-MLP models exhibited superior performance, making them highly effective for mental stress classification tasks, as evidenced by their high AUC values and robust classification capabilities across different stress levels.
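The summary does not state how AUC-ROC was computed for the three-class problem. One common convention is a macro-averaged one-vs-rest AUC, sketched below using the pairwise (Mann-Whitney) definition of AUC. That convention, the function names, and the toy probabilities are assumptions for illustration, not the paper's procedure or numbers.

```python
def binary_auc(scores, is_positive):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, flag in zip(scores, is_positive) if flag]
    neg = [s for s, flag in zip(scores, is_positive) if not flag]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def macro_ovr_auc(probabilities, y_true):
    """Average the one-vs-rest AUC over all classes."""
    n_classes = len(probabilities[0])
    aucs = []
    for c in range(n_classes):
        scores = [row[c] for row in probabilities]
        aucs.append(binary_auc(scores, [y == c for y in y_true]))
    return sum(aucs) / n_classes

# Toy predicted class probabilities (hypothetical), one row per sample.
probs = [
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.4, 0.3],
    [0.1, 0.2, 0.7],
    [0.3, 0.5, 0.2],  # a class-2 sample that the model scores as class 1
]
y_true = [0, 0, 1, 1, 2, 2]
auc = macro_ovr_auc(probs, y_true)
```

Because AUC depends only on how samples are ranked by score, while accuracy depends on the single predicted class per sample, two models can be ordered differently by the two metrics.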

Discussion

The discussion section reviews studies that apply artificial intelligence (AI) techniques to classify mental stress from physiological signals. These signals include electroencephalogram (EEG) data, which capture dynamic patterns of brain activity related to stress. The highlighted studies demonstrate the effectiveness of different machine learning (ML) and deep learning (DL) algorithms in predicting stress levels. Notable findings include a classification accuracy of 98.2% using a random forest model on wearable sensor data (Siam et al.) and an accuracy of 86.13% achieved by converting multi-channel EEG signals into multi-spectral topology images (Ozdemir et al.). The paper emphasizes the potential of hybrid models, particularly the combination of convolutional neural networks (CNN) and multi-layer perceptrons (MLP), which draws on the strengths of both architectures for improved feature extraction and classification.

The section also discusses the challenges faced in stress detection research, including limited generalizability and difficulties in real-time application. The proposed hybrid CNN-MLP model aims to address these challenges by integrating spatial feature extraction capabilities of CNNs with the classification strengths of MLPs. This approach facilitates automated end-to-end learning, reducing the need for manual feature engineering and enhancing model robustness across diverse datasets. The paper concludes that while high accuracies are achievable, further advancements are necessary to ensure the models’ applicability in real-world scenarios, as indicated by the comparative overview of recent research summarized in Table 1.
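The division of labor described above, with convolutional layers extracting features and an MLP classifying them, can be illustrated with a tiny forward pass in plain Python. The summary does not give the paper's layer configuration, so everything here is a hypothetical sketch: one 1-D convolution with four kernels, global max pooling, one hidden layer of eight units, and random untrained weights.

```python
import math
import random

def conv1d_relu(signal, kernels):
    """Valid 1-D cross-correlation followed by ReLU; one output row per kernel."""
    out = []
    for kernel in kernels:
        k = len(kernel)
        row = [max(0.0, sum(signal[i + j] * kernel[j] for j in range(k)))
               for i in range(len(signal) - k + 1)]
        out.append(row)
    return out

def dense(inputs, weights, biases, relu=True):
    """Fully connected layer; weights[o] holds the input weights of output unit o."""
    outputs = [sum(w * x for w, x in zip(row, inputs)) + b
               for row, b in zip(weights, biases)]
    return [max(0.0, v) for v in outputs] if relu else outputs

def softmax(logits):
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def hybrid_forward(signal, params):
    """CNN stage (conv + global max pool) feeding an MLP stage (hidden + output)."""
    feature_maps = conv1d_relu(signal, params["kernels"])
    pooled = [max(row) for row in feature_maps]
    hidden = dense(pooled, params["w_hidden"], params["b_hidden"])
    logits = dense(hidden, params["w_out"], params["b_out"], relu=False)
    return softmax(logits)

def random_params(n_kernels=4, kernel_size=3, n_hidden=8, n_classes=3, seed=0):
    """Untrained placeholder weights; the sizes are illustrative assumptions."""
    rng = random.Random(seed)
    def rand(n):
        return [rng.uniform(-1, 1) for _ in range(n)]
    return {
        "kernels": [rand(kernel_size) for _ in range(n_kernels)],
        "w_hidden": [rand(n_kernels) for _ in range(n_hidden)],
        "b_hidden": [0.0] * n_hidden,
        "w_out": [rand(n_hidden) for _ in range(n_classes)],
        "b_out": [0.0] * n_classes,
    }

signal = [0.1, -0.4, 0.9, 0.3, -0.2, 0.5, 0.0, -0.7]  # toy input vector
probs = hybrid_forward(signal, random_params())
```

With untrained weights the three output probabilities carry no meaning; the sketch only shows how data flows through the two stages. A practical implementation would use a deep learning framework and learn all weights jointly by backpropagation.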

Limitations

The study acknowledges several limitations that may affect the findings and their applicability. First, the sample comprises only four participants and lacks demographic diversity, which could introduce bias and limit the generalizability of the model to a wider population. Second, the reliance on precomputed features instead of raw EEG signals raises concerns about biases introduced during feature extraction, which is influenced by parameters such as filter settings and segmentation length.

To address these limitations, future research should aim to expand the participant pool to include a more diverse demographic and explore a wider array of feature extraction parameters. This would enhance the robustness and generalizability of the model. Moreover, while the hybrid CNN-MLP model showed promising results in feature extraction and classification, it exhibited a tendency to overfit due to the limited dataset size. Systematic hyperparameter optimization, focusing on values such as batch size, learning rate, and segmentation length, is recommended for future studies to improve model performance and generalization across different datasets.
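A systematic search over the hyperparameters named above could be organized as an exhaustive grid search, sketched below. The candidate values are hypothetical, and `toy_evaluate` is a synthetic stand-in for "train the model with this configuration and return its validation score"; it does not reproduce any result from the paper.

```python
from itertools import product

def grid_search(search_space, evaluate):
    """Score every combination in search_space; return (best_config, best_score, results)."""
    names = list(search_space)
    results = []
    for values in product(*(search_space[name] for name in names)):
        config = dict(zip(names, values))
        results.append((config, evaluate(config)))
    best_config, best_score = max(results, key=lambda item: item[1])
    return best_config, best_score, results

# Hypothetical candidate values for the three hyperparameters mentioned in the text.
search_space = {
    "batch_size": [16, 32, 64],
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "segment_length": [128, 256],
}

def toy_evaluate(config):
    """Synthetic objective, NOT a real validation score: it peaks at an arbitrary point."""
    return (-abs(config["batch_size"] - 32)
            - 1000 * abs(config["learning_rate"] - 1e-3)
            - abs(config["segment_length"] - 256) / 128)

best_config, best_score, results = grid_search(search_space, toy_evaluate)
```

In practice `evaluate` would retrain the model for each configuration, and the selected configuration should be confirmed on data that played no part in the search, since with a small dataset the search itself can overfit to the validation split.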