Distributed denial-of-service (DDOS) attack detection using supervised machine learning algorithms

Journal: Scientific Reports, Volume: 15, Issue: 1
DOI: https://doi.org/10.1038/s41598-024-84879-y
PMID: https://pubmed.ncbi.nlm.nih.gov/40240403
Publication Date: 2025-04-16
Author(s): S. Abiramasundari et al.
Primary Topic: Network Security and Intrusion Detection

Overview

The paper addresses Distributed Denial-of-Service (DDoS) attacks, which significantly disrupt online services, particularly in e-commerce and finance. To counter these threats, the authors propose a PCA-based Enhanced Distributed DDoS Attack Detection (EDAD) framework that employs five supervised machine learning algorithms: Support Vector Machine (SVM), Logistic Regression (LR), Random Forest (RF), K-Nearest Neighbors (KNN), and Decision Tree (DT). The models are evaluated on three datasets: CICIDS2018, CICIDS2017, and CICDDoS2019. RF achieved the highest accuracy, 98.9%, on CICIDS2017, while SVM reached 98.7% on CICIDS2018 and KNN reached 98.7% on CICDDoS2019.

The study concludes that the PCA-based EDAD framework effectively distinguishes between DDoS attacks and normal traffic, highlighting the superior performance of the Random Forest algorithm. Future research directions include exploring ensemble techniques to enhance model performance, improving feature engineering for better interpretability, and utilizing deep learning methods such as Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) to identify complex patterns in network breaches. The paper underscores the importance of robust detection mechanisms in safeguarding against DDoS attacks, which remain a significant threat in the digital landscape.

Methods

The paper presents a methodology for mitigating the risks posed by Distributed Denial-of-Service (DDoS) attacks through a framework termed PCA-based EDAD, which uses supervised machine learning algorithms to detect such attacks. The architecture of the proposed framework, illustrated in Figure 1, comprises several phases: dataset collection, data preprocessing, feature selection, and division of the data into training and testing sets.
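The summary does not spell out how PCA is applied within the framework, so the block below gives only the standard textbook formulation, not the authors' exact procedure. The symbols X (an n-by-d feature matrix), Z, Σ, W_k, and the number of retained components k are notation introduced here for illustration.

```latex
% Standard PCA on an n-by-d feature matrix X (notation assumed, not taken from the paper)
\begin{align}
  z_{ij} &= \frac{x_{ij} - \mu_j}{\sigma_j}
    && \text{standardize feature } j \\
  \Sigma &= \frac{1}{n-1}\, Z^{\top} Z
    && \text{covariance of the standardized data} \\
  \Sigma\, w_i &= \lambda_i w_i, \quad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_d
    && \text{eigendecomposition} \\
  Y &= Z\, W_k, \quad W_k = [\, w_1 \; \cdots \; w_k \,]
    && \text{project onto the top } k \text{ components}
\end{align}
```

The reduced matrix Y, with k columns instead of d, is what a downstream classifier would be trained on.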

To evaluate the framework's performance, the study employs five machine learning models: Support Vector Machine (SVM), Logistic Regression, Random Forest, K-Nearest Neighbors (KNN), and Decision Tree. Each is applied to distinguish DDoS attacks from benign traffic, which allows the detection methods implemented within the PCA-based EDAD framework to be compared directly.
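A minimal sketch of such a comparison follows, assuming scikit-learn and a synthetic dataset in place of the CIC data. The function name `compare_classifiers`, the choice of 10 principal components, and all hyperparameters are illustrative assumptions; the summary does not report the authors' settings.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(X, y, n_components=10, seed=0):
    """Fit scaling + PCA + each classifier; return test accuracy per model name."""
    # 70/30 stratified split, mirroring the ratio reported in the paper.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.30, stratify=y, random_state=seed
    )
    # Hyperparameters below are placeholders, not the authors' values.
    models = {
        "SVM": SVC(),
        "LR": LogisticRegression(max_iter=1000),
        "RF": RandomForestClassifier(n_estimators=100, random_state=seed),
        "KNN": KNeighborsClassifier(n_neighbors=5),
        "DT": DecisionTreeClassifier(random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        # Scaler and PCA are fit on the training split only, to avoid leakage.
        pipeline = make_pipeline(
            StandardScaler(), PCA(n_components=n_components), model
        )
        pipeline.fit(X_train, y_train)
        scores[name] = pipeline.score(X_test, y_test)
    return scores


# Synthetic stand-in for flow records: 61 features, binary benign/attack label.
X, y = make_classification(
    n_samples=600, n_features=61, n_informative=12, random_state=0
)
scores = compare_classifiers(X, y)
```

Wrapping the scaler, PCA, and classifier in one pipeline keeps the test split untouched until scoring, which is the main design point of the sketch. The accuracies it produces on synthetic data say nothing about the figures reported in the paper.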

Results

The authors propose a supervised machine learning approach to differentiate DDoS attacks from regular network traffic. Using three datasets (CICIDS2018, CICIDS2017, and CICDDoS2019) comprising 79 network-traffic features, they performed data preprocessing that reduced the feature set to 61. The models were trained on 70% of the data, with the remaining 30% reserved for evaluation, and class imbalance was addressed to improve performance and interpretability.
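The preprocessing and split can be sketched in plain Python under two assumptions that the summary does not confirm: that feature reduction includes dropping columns that never vary, and that the 70/30 split is stratified by class. The helper names `drop_constant_columns` and `stratified_split` are hypothetical.

```python
import random
from collections import defaultdict


def drop_constant_columns(rows):
    """Remove columns whose value never changes (an assumed cleaning step)."""
    n_cols = len(rows[0])
    keep = [j for j in range(n_cols) if len({row[j] for row in rows}) > 1]
    return [[row[j] for j in keep] for row in rows], keep


def stratified_split(rows, labels, test_fraction=0.30, seed=0):
    """Split so each class keeps roughly the same share in train and test."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])

    train = ([rows[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([rows[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test


# Toy data: 3 columns (the middle one is constant), 90 benign and 10 attack rows.
rows = [[i, 0, i % 7] for i in range(100)]
labels = ["benign"] * 90 + ["attack"] * 10

cleaned, kept = drop_constant_columns(rows)
(train_X, train_y), (test_X, test_y) = stratified_split(cleaned, labels)
```

Stratification only preserves the class ratio in both splits; it does not rebalance the classes. Any resampling or class weighting the authors applied would be a separate step that this sketch does not attempt.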

Five machine learning algorithms (Random Forest, Logistic Regression, K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Decision Tree) were used to detect DDoS attacks. The models were assessed using confusion matrices and metrics derived from them, such as precision and recall. Random Forest achieved the highest accuracy, 98.9%, on the CICIDS2017 dataset, while SVM and KNN performed best on the CICIDS2018 dataset with accuracies of 98.7% and 98.6%, respectively. On the CICDDoS2019 dataset, both Random Forest and KNN reached an accuracy of 98.7%. The performance metrics (accuracy, precision, recall, and F1 score) were tabulated and visualized to compare the models across the datasets.
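For reference, the four reported metrics follow directly from the cells of a binary confusion matrix. The sketch below uses the standard definitions, with "attack" treated as the positive class; the function names and the toy labels are illustrative and are not data from the paper.

```python
def confusion_counts(y_true, y_pred, positive="attack"):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(tp, fp, fn, tn):
    """Derive accuracy, precision, recall, and F1 from confusion-matrix cells."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels: 4 attack and 6 benign flows, with one miss and one false alarm.
y_true = ["attack"] * 4 + ["benign"] * 6
y_pred = ["attack", "attack", "attack", "benign", "attack"] + ["benign"] * 5

metrics = classification_metrics(*confusion_counts(y_true, y_pred))
```

On imbalanced traffic, accuracy alone can look high even when most attacks are missed, which is why precision, recall, and F1 are usually reported alongside it.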

Discussion

The discussion emphasizes the need for robust Intrusion Detection Systems (IDS) as cyber threats increase, particularly Distributed Denial-of-Service (DDoS) attacks. It highlights the integration of Artificial Intelligence (AI) techniques, especially Decision Trees, to enhance IDS capabilities and improve detection accuracy. The paper discusses the importance of feature selection and dimensionality reduction for optimizing IDS performance, as well as the evaluation metrics (accuracy, precision, recall, and F-score) used to assess model efficacy. The proposed enhanced intelligent intrusion detection model (E2IDS) demonstrates superior performance compared to existing models, achieving an overall accuracy of 98% on the UNSW-NB15 dataset.

Future research directions are outlined, focusing on refining feature selection methodologies, developing predictive models for various cyber-attack types, and enhancing DDoS defense mechanisms, particularly for small and medium-sized enterprises. The paper also underscores the necessity for real-time detection systems to swiftly identify and mitigate DDoS attacks, ensuring uninterrupted online operations. The integration of deep learning models, such as Deep Neural Networks (DNN), Convolutional Neural Networks (CNN), and Long Short-Term Memory (LSTM) networks, is highlighted as a promising avenue for improving detection capabilities in dynamic network environments. Overall, the ongoing research efforts are deemed essential for advancing cybersecurity measures and effectively addressing the evolving landscape of cyber threats.