Deep learning for network security: an Attention-CNN-LSTM model for accurate intrusion detection

Journal: Scientific Reports, Volume: 15, Issue: 1
DOI: https://doi.org/10.1038/s41598-025-07706-y
PMID: https://pubmed.ncbi.nlm.nih.gov/40593224
Publication Date: 2025-07-01
Author(s): Abdullah Mujawib Alashjaee
Primary Topic: Network Security and Intrusion Detection

Overview

The paper presents a hybrid Attention-CNN-LSTM model for network intrusion detection and evaluates it on network traffic data. The model uses Convolutional Neural Networks (CNNs) to extract features from raw data and Long Short-Term Memory (LSTM) networks to analyze temporal sequences of events. It reached an accuracy of 97.5% on the Bot-IoT dataset and 94.8% on the NSL-KDD dataset, with low false positive rates, which indicates its potential as a robust approach for identifying various network intrusions.

Despite these promising results, the study acknowledges limitations in the datasets used, particularly their outdated traffic patterns and absence of encrypted traffic. Future work will involve testing the model on more contemporary datasets, such as CIC-IDS2017 and TON_IoT, to enhance its applicability to modern attack scenarios. Additionally, the research highlights the need for improvements in scalability to handle larger datasets and the integration of real-time detection capabilities, which are crucial for timely threat mitigation. The exploration of adversarial training is also suggested to bolster the model’s resilience against sophisticated evasion tactics.

Methods

The proposed methodology integrates attention mechanisms, Convolutional Neural Networks (CNNs), and Long Short-Term Memory (LSTM) networks to develop a hybrid deep learning-based intrusion detection (ID) strategy. This model leverages both spatial and temporal features extracted from network traffic data to identify various cyberattack types. The initial phase involves preprocessing raw data from the NSL-KDD and Bot-IoT datasets, which includes feature extraction and normalization. The LSTM layers effectively capture temporal relationships, while CNN layers focus on identifying spatial features and patterns within the traffic data. An attention mechanism is incorporated to enhance detection accuracy by emphasizing critical features.
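The preprocessing step can be illustrated with a minimal sketch, assuming the Z-score normalization and 80/20 train/test split named in the Results section. The function names, the toy feature rows, the fixed shuffle seed, and the choice to compute the statistics on the training portion only are assumptions made for illustration; the summary does not confirm how the paper implements these steps.

```python
import random
import statistics


def train_test_split(rows, train_fraction=0.8, seed=0):
    """Shuffle a copy of the rows and split them into train and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_zscore(rows):
    """Return one (mean, std) pair per feature column."""
    stats = []
    for column in zip(*rows):
        mean = statistics.fmean(column)
        std = statistics.pstdev(column)
        # Guard against constant features, which would divide by zero.
        stats.append((mean, std if std > 0 else 1.0))
    return stats


def apply_zscore(rows, stats):
    """Standardize each feature value: z = (x - mean) / std."""
    return [
        [(x - mean) / std for x, (mean, std) in zip(row, stats)]
        for row in rows
    ]


# Toy records with two hypothetical numeric features (not real traffic data).
records = [[float(i), float(100 * i)] for i in range(10)]
train, test = train_test_split(records)
stats = fit_zscore(train)          # statistics come from the training part only
train_z = apply_zscore(train, stats)
test_z = apply_zscore(test, stats)  # test rows reuse the training statistics
```

Fitting the mean and standard deviation on the training rows and reusing them for the test rows keeps information about the held-out data from influencing the scaling, which is one common way to set this up.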

A fully connected layer classifies the model’s final output, and performance is evaluated with the Matthews Correlation Coefficient (MCC), recall, accuracy, F1-score, and precision. The results indicate that the proposed model detects network intrusions effectively, and its performance is compared against other methods, including CNN, LSTM, and Deep Neural Networks (DNN). The complete structure of the proposed pipeline is illustrated in Figure 1 of the paper.
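For reference, the evaluation metrics listed above can be computed from confusion-matrix counts. The sketch below assumes a binary attack/normal setting, which is a simplification (the summary does not say whether the paper scores binary or multi-class labels). The function name and the example counts are hypothetical and are not results from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute common detection metrics from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    # False positive rate: share of benign samples flagged as attacks.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
        "fpr": fpr,
    }


# Hypothetical counts for illustration only.
scores = binary_metrics(tp=90, fp=5, tn=95, fn=10)
```

MCC uses all four cells of the confusion matrix, so it stays informative when attack and normal classes are imbalanced, a situation where accuracy alone can look misleadingly high.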

Results

The results demonstrate the efficacy of the proposed hybrid Attention-CNN-LSTM model for intrusion detection, evaluated on two benchmark datasets: Bot-IoT and NSL-KDD. The model achieved high classification accuracy and robust detection of various attack types, outperforming baseline models such as CNN and LSTM. Recall, Matthews correlation coefficient (MCC), accuracy, precision, and F1-score were used to assess the model’s effectiveness.

Key hyperparameters and architectural configurations are detailed in Table 2 of the paper. The model uses the Adam optimizer with a learning rate of 0.001 for stable convergence, a batch size of 128, and 10 training epochs to balance accuracy against training time. A dropout rate of 0.3 mitigates overfitting. The convolutional layers use 64 and 128 filters with kernel sizes of 3 and 5, respectively, to capture local patterns in the data, while max pooling with a window size of 2 reduces the spatial dimensions. The LSTM layer has 64 units to learn temporal dependencies, and a scaled dot-product attention mechanism lets the model prioritize significant features during training. Z-score normalization standardizes the features, and the dataset is split 80% for training and 20% for testing so that performance is measured on data the model has not seen during training.
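The scaled dot-product attention mentioned above can be sketched in plain Python as softmax(QK^T / sqrt(d_k))V. This is a generic illustration of the technique, not the paper's implementation: the summary does not specify how queries, keys, and values are derived, so the self-attention setup over three hypothetical 2-dimensional hidden states is an assumption, as are the function names.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights) for softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value vectors, one output vector per query.
        outputs.append([
            sum(wj * v[c] for wj, v in zip(w, values))
            for c in range(len(values[0]))
        ])
    return outputs, weights


# Three hypothetical time steps with 2-dimensional hidden states,
# used as queries, keys, and values at once (self-attention).
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = scaled_dot_product_attention(states, states, states)
```

Each row of `weights` sums to 1 and shows how strongly one time step attends to every other, which is the sense in which attention lets a model emphasize the most relevant features.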

Discussion

The discussion section emphasizes the need for advanced cybersecurity systems capable of real-time detection and monitoring of increasingly complex cyber threats. The study explores the integration of various artificial intelligence (AI) models, particularly machine learning (ML) and deep learning (DL) techniques, to improve the speed and accuracy of cyber threat assessments. Notable contributions include the IoT-enabled Cyber Attack Detection System (IoT-E-CADS), which employs ML for real-time anomaly detection in smart grid ecosystems, and a Trustworthy-Based Authentication Model (TBAM), which uses deep learning for intrusion detection in IoT networks. The paper also highlights the limitations of traditional security measures in IoT environments and proposes methods such as a zero-trust approach and a hybrid Attention-CNN-LSTM model to address these vulnerabilities.

The hybrid Attention-CNN-LSTM model demonstrates superior performance in detecting network intrusions by effectively capturing both spatial and temporal patterns in network traffic. Evaluation metrics indicate that this model achieves a high accuracy of 97.5% on the Bot-IoT dataset, outperforming traditional models like CNN and LSTM. The model’s precision, recall, and F1-score further underscore its capability to minimize false positives while accurately identifying genuine attacks. The study concludes that the proposed model’s integration of attention mechanisms and batch normalization significantly enhances its robustness and effectiveness in intrusion detection, making it a promising advancement in the field of cybersecurity.

Limitations

The limitations section highlights several constraints of the cybersecurity techniques and models discussed in the literature. For instance, Singhal et al. note that their AI models do not incorporate ethical considerations or resilience in cybersecurity practices, a significant gap given the increasing sophistication of cyber threats. Similarly, while SMH et al.’s IoT-E-CADS effectively analyzes supply strategies under fault scenarios, it requires a high level of cybersecurity to function properly, a dependency that could hinder its practical application.

Moreover, Slimane et al. point out that intrusion detection systems (IDS) are essential for protecting small IoT networks, which are increasingly targeted due to their vulnerabilities. However, the focus remains limited to smaller networks, potentially overlooking larger, more complex systems. Lastly, Prince et al. discuss deep learning techniques for securing IoT devices, which are particularly vulnerable to cyber risks, but they do not address the integration of these techniques into existing systems or the promotion of a cybersecurity culture, which are vital for comprehensive security strategies. Overall, these limitations suggest a need for more holistic approaches that incorporate ethical considerations, scalability, and cultural integration in cybersecurity practices.