Hybrid Android Malware Detection and Classification Using Deep Neural Networks

Journal: International Journal of Computational Intelligence Systems, Volume: 18, Issue: 1
DOI: https://doi.org/10.1007/s44196-025-00783-x
Publication Date: 2025-03-10
Author(s): Zhenyun Du et al.
Primary Topic: Advanced Malware Detection Techniques

Overview

This research paper introduces a deep learning framework for Android malware detection that targets two weaknesses of existing methods: obfuscation and scalability amid rapid app development. The proposed system jointly analyzes Android permissions, intents, and API calls, which keeps feature extraction robust even when faced with reverse engineering. Experimental evaluations report an accuracy of 98.2%, surpassing the previous state-of-the-art method, DeepAMD, by 7.5%. The architecture improves explainability by correlating detection results with specific behavioral patterns, and benchmarking across five public datasets, including Drebin, AndroZoo, and VirusShare, supports generalization and mitigates dataset bias.
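The multi-dimensional analysis described above can be pictured as encoding each app into a fixed-length binary vector over permissions, intents, and API calls. The following is a minimal sketch under stated assumptions: the summary does not list the paper's actual feature set or encoding, so the `VOCABULARY` list and the `encode_app` function are hypothetical and chosen only for illustration.

```python
from typing import Iterable, List

# Hypothetical feature vocabulary (an assumption; the paper's real
# feature list is not given in this summary). Each entry is prefixed
# with its feature type so the three dimensions cannot collide.
VOCABULARY = [
    "permission:android.permission.SEND_SMS",
    "permission:android.permission.READ_CONTACTS",
    "intent:android.intent.action.BOOT_COMPLETED",
    "api:android.telephony.SmsManager.sendTextMessage",
]


def encode_app(
    permissions: Iterable[str],
    intents: Iterable[str],
    api_calls: Iterable[str],
    vocabulary: List[str] = VOCABULARY,
) -> List[int]:
    """Return a binary vector: 1 if the app uses the feature, else 0."""
    present = {f"permission:{p}" for p in permissions}
    present |= {f"intent:{i}" for i in intents}
    present |= {f"api:{a}" for a in api_calls}
    return [1 if feature in present else 0 for feature in vocabulary]


# Example: an app that requests SEND_SMS and listens for boot completion.
example_vector = encode_app(
    permissions=["android.permission.SEND_SMS"],
    intents=["android.intent.action.BOOT_COMPLETED"],
    api_calls=[],
)
```

A vector of this kind could then be fed to a neural network or a traditional classifier; how the paper actually represents features may differ.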

In the conclusions, the paper highlights the pressing security concerns posed by the continuous emergence of new Android malware variants. On the Static layer, the proposed system reaches 93% accuracy in detecting malware and 92% accuracy in identifying malware families. On the Dynamic layer, accuracy reaches 86.21% for malware category classification and 68.97% for family classification. These findings, obtained on the CICInvesAndMal2019 dataset, indicate that the system can identify and classify Android malware across both layers. Future work aims to develop an automated dataset-building approach that compiles diverse and comprehensive datasets from various sources, improving the quality and scale of malware data available for research.

Introduction

The introduction of the paper addresses the significant threat posed by malicious software (malware) to digital systems, highlighting various types such as ransomware, trojans, adware, and spyware, which are disseminated through phishing emails, malicious downloads, and compromised websites. Malware can lead to severe consequences including data theft, network disruption, and financial losses. The paper emphasizes the increasing prevalence of malware attacks and the necessity for individuals and organizations to adopt proactive measures, such as maintaining system security and educating users about potential threats.

The authors identify a critical gap in current malware detection methodologies, particularly in the context of Android applications, where existing solutions often rely solely on static or dynamic analysis without a cohesive framework that integrates both approaches. This lack of coherence results in inadequate detection capabilities, especially against sophisticated malware and obfuscated applications. To address these challenges, the paper proposes a hybrid model for Android malware detection that encompasses a wider range of malware families and employs both deep learning and traditional machine learning classifiers. The subsequent sections outline the technical framework, dataset analysis, proposed methodology, and evaluation of the model’s performance, ultimately aiming to enhance the accuracy and effectiveness of malware detection on Android platforms.

Methods

In this section, the authors detail how they evaluated the proposed machine-learning model for malware classification. The dataset was split into 80% for training and 20% for testing. Four performance metrics (accuracy, precision, recall, and F-score) were calculated using their standard definitions in terms of True Positives (TP), False Positives (FP), True Negatives (TN), and False Negatives (FN), defined in the context of anomaly detection. The proposed method achieved an accuracy of 93%, surpassing traditional models such as J48 (90.5%), Naive Bayes (NB, 62.0%), SMO (91.8%), and MLP (90.5%); NB's lower performance is attributed to its need for larger datasets.
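For reference, the 80/20 split and the standard metric definitions mentioned above can be sketched as follows. This is an illustrative sketch, not the authors' evaluation code: the function names, the fixed seed, and the TP/FP/TN/FN counts in the example are all assumptions and are not taken from the paper.

```python
import random
from typing import Dict, List, Sequence, Tuple


def split_80_20(samples: Sequence, seed: int = 0) -> Tuple[List, List]:
    """Shuffle with a fixed seed, then keep 80% for training, 20% for testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = (len(shuffled) * 4) // 5  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard definitions of accuracy, precision, recall, and F-score (F1)."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


# Hypothetical counts, used only to exercise the formulas.
train, test = split_80_20(range(10))
metrics = classification_metrics(tp=90, fp=10, tn=85, fn=15)
```

With these made-up counts, accuracy is (90 + 85) / 200 and precision is 90 / 100; the values say nothing about the paper's results.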

The results indicate that the proposed technique not only matches the accuracy of DeepAMD (93.4%) but also demonstrates superior performance across various metrics. The training accuracy improved from 56.47% to 97.63% over 14 epochs, while test accuracy rose from 68% to 93%, with a slight dip around the 50th epoch. The loss metrics also showed significant improvement, with training loss decreasing from 25.26% to 0.36%. The confusion matrix revealed that 30 normal instances were misclassified as malicious, and 12 malicious instances were incorrectly labeled as normal, highlighting areas for further refinement in the model’s predictive capabilities.

Results

The evaluation of the framework comprises two primary tasks: detection, which determines whether a given program qualifies as malware, and attribution, which identifies the specific family of the detected malware. The dataset employed for this evaluation, CICInvesAndMal2019, is organized into a multi-tier classification scheme.

At the first level, the dataset distinguishes between two broad categories: malware and benign samples. The second level classifies malware by type, including adware, ransomware, scareware, and SMS malware. At the third level, the dataset is further refined into 38 distinct malware families, allowing a detailed analysis of malware characteristics and behaviors. This structured approach facilitates a thorough assessment of DeepAMD's efficacy in malware detection and classification.
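The three-level scheme can be represented as a single hierarchical label per sample, from which the target for each task is derived. The sketch below is an assumption about how such labels might be modeled. The `SampleLabel` class and the family name `example_family` are hypothetical placeholders, while the four category names come from the description above.

```python
from dataclasses import dataclass
from typing import Optional

# Second-level categories named in the summary.
CATEGORIES = {"adware", "ransomware", "scareware", "sms_malware"}


@dataclass(frozen=True)
class SampleLabel:
    """Hierarchical label: binary verdict, then category, then family."""

    is_malware: bool
    category: Optional[str] = None  # level 2, malware only
    family: Optional[str] = None    # level 3, malware only

    def __post_init__(self) -> None:
        if not self.is_malware and (self.category or self.family):
            raise ValueError("benign samples have no category or family")
        if self.category is not None and self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")
        if self.family is not None and self.category is None:
            raise ValueError("a family requires a category")

    def at_level(self, level: int) -> str:
        """Return the classification target for level 1, 2, or 3."""
        if level not in (1, 2, 3):
            raise ValueError("level must be 1, 2, or 3")
        if not self.is_malware:
            return "benign"
        if level == 1:
            return "malware"
        if level == 2:
            return self.category or "unknown"
        return self.family or "unknown"


# "example_family" is a placeholder, not a real family from the dataset.
sample = SampleLabel(is_malware=True, category="ransomware", family="example_family")
targets = [sample.at_level(level) for level in (1, 2, 3)]
```

Keeping one label object per sample lets the detection task read level 1 and the attribution task read level 3 without maintaining separate label files.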

Discussion

The discussion section of the research paper highlights the growing concern about Android malware, particularly given the anticipated increase in smartphone users. Various detection techniques have been explored, including deep learning models that combine static and dynamic features. Notably, a model integrating permissions, services, and opcode sequences achieved over 96% accuracy, albeit with high computational demands. Other approaches, such as those employing machine learning algorithms like Naive Bayes and decision trees, demonstrated varying effectiveness but often overlooked dynamic behaviors. The hybrid method proposed in this study improves upon existing techniques by incorporating API calls, permissions, and intents, resulting in a 6.9% increase in accuracy and a 5.7% improvement in recall compared to leading methods like DeepAMD.

Furthermore, the paper discusses the limitations of existing methodologies, such as the reliance on static analysis and the challenges posed by API obfuscation. Techniques like SeGDroid and SigPID have shown promise in extracting semantic knowledge and identifying critical permissions, respectively, but still face issues with scalability and real-time detection. The proposed hybrid approach aims to address these challenges by combining static and dynamic analyses, thereby enhancing the classification of malware into distinct families and categories. The use of the CICInvesAndMal2019 dataset ensures a robust representation of samples, ultimately establishing a new standard for Android malware detection that emphasizes generalization, adaptability, and explainability.
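One simple way to combine static and dynamic analyses is early fusion, where the two feature sets are merged into a single vector before classification. The summary does not say how the paper combines them, so the sketch below is only an assumption; the function names and the feature names `uses_send_sms` and `network_flows` are hypothetical.

```python
from typing import Dict, List, Mapping, Sequence


def fuse_features(
    static: Mapping[str, float], dynamic: Mapping[str, float]
) -> Dict[str, float]:
    """Merge static and dynamic features, prefixing names to avoid clashes."""
    fused = {f"static:{name}": float(value) for name, value in static.items()}
    fused.update({f"dynamic:{name}": float(value) for name, value in dynamic.items()})
    return fused


def to_vector(fused: Mapping[str, float], feature_order: Sequence[str]) -> List[float]:
    """Lay features out in a fixed order; features absent from the app become 0.0."""
    return [fused.get(name, 0.0) for name in feature_order]


# Hypothetical features: one from manifest inspection, one from a runtime trace.
fused = fuse_features({"uses_send_sms": 1}, {"network_flows": 12})
feature_order = sorted(fused)
vector = to_vector(fused, feature_order)
```

Fixing `feature_order` once, for example from the training set, keeps vectors comparable across apps. A layered design that runs static and dynamic classifiers separately is an equally plausible reading of the summary.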