A survey on resource scheduling approaches in multi-access edge computing environment: a deep reinforcement learning study

Journal: Cluster Computing, Volume: 28, Issue: 3
DOI: https://doi.org/10.1007/s10586-024-04893-7
Publication Date: 2025-01-21
Author(s): Ahmed A. Ismail et al.
Primary Topic: IoT and Edge/Fog Computing

Overview

This paper gives an overview of multi-access edge computing (MEC) and its role in improving the quality of experience (QoE) of resource-constrained devices, which can offload compute-intensive tasks to nearby MEC servers. Offloading reduces application execution time and energy consumption, but it also introduces complex resource scheduling challenges. The paper surveys the current state of research on resource scheduling in MEC, with a particular focus on deep reinforcement learning (DRL) solutions, and the authors present it as the first survey dedicated to this intersection. It organizes the literature around three aspects: content caching, computation offloading, and resource management. It also compares 95 articles published between 2019 and 2023 on criteria such as application use cases, network architectures, and evaluation metrics.

In the conclusion, the authors emphasize the potential of MEC to support numerous IoT and real-time applications that require intensive processing. However, the increasing number of connected devices complicates resource scheduling. The survey highlights the advantages of DRL over traditional methods and classifies the reviewed articles based on centralized and distributed approaches. It also addresses the impact of user mobility on resource scheduling models and outlines several unresolved challenges for future research, including security and privacy, mobility awareness, load balancing, QoS concerns, scalability, and application partitioning. The paper suggests potential strategies to tackle these challenges, providing a comprehensive foundation for further exploration in the field.

Introduction

The introduction discusses the impact of the Internet of Things (IoT) and the advent of 5G networks on applications such as face recognition, wearable devices, and autonomous vehicles. With an anticipated 75 billion IoT devices by 2025, efficient resource management becomes critical, particularly because many of these devices are resource-constrained and cannot support delay-sensitive applications on their own. Two paradigms have been proposed to address this: mobile cloud computing (MCC) and multi-access edge computing (MEC). MCC lets mobile applications run on cloud resources, but it suffers from increased latency because of the physical distance to the cloud. MEC, in contrast, brings computation closer to the user, which reduces latency and improves performance for real-time applications.
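The latency argument above can be made concrete with a small Python sketch. The cost model (upload time plus network round trip plus remote compute time) and every number below are illustrative assumptions, not figures from the survey.

```python
from dataclasses import dataclass


@dataclass
class Task:
    data_bits: float   # input data to upload, in bits
    cpu_cycles: float  # CPU cycles needed to complete the task


def local_latency(task: Task, device_hz: float) -> float:
    """Seconds needed to run the task on the device itself."""
    return task.cpu_cycles / device_hz


def offload_latency(task: Task, uplink_bps: float, server_hz: float, rtt_s: float) -> float:
    """Seconds for upload + network round trip + remote execution."""
    return task.data_bits / uplink_bps + rtt_s + task.cpu_cycles / server_hz


# Hypothetical values chosen only to illustrate the trade-off.
task = Task(data_bits=8e6, cpu_cycles=1e9)
local = local_latency(task, device_hz=1e9)
edge = offload_latency(task, uplink_bps=40e6, server_hz=10e9, rtt_s=0.005)
cloud = offload_latency(task, uplink_bps=40e6, server_hz=20e9, rtt_s=0.100)

for name, seconds in [("local", local), ("edge", edge), ("cloud", cloud)]:
    print(f"{name}: {seconds:.3f} s")
```

With these made-up numbers the nearby edge server wins even though the cloud server is assumed to be twice as fast, because the longer round trip to the cloud outweighs its compute advantage. Different task sizes or link rates can reverse the ordering, which is why the offloading decision is treated as a scheduling problem.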

The paper emphasizes the need to optimize resources within MEC: edge servers are more powerful than user devices, but they can still become congested when many users compete for them. The integration of artificial intelligence (AI), particularly deep reinforcement learning (DRL), is highlighted as a promising approach to improving resource scheduling in MEC environments. DRL can handle complex, high-dimensional environments and learns through trial and error, which makes it a valuable tool for real-time decision-making. The main contributions of the survey are a comprehensive analysis of current DRL-based solutions for MEC resource scheduling, a treatment of mobility issues, and the identification of future research challenges.

Methods

In this section, the authors outline their research methodology, which focuses on resource scheduling within multi-access edge computing (MEC) systems. They pose several research questions (RQs) to guide the investigation: which types of MEC architectures have been considered, how resource scheduling problems are classified, which reinforcement learning (RL) based solutions have been employed, and which evaluation methods previous authors used. They also address the impact of user mobility on MEC systems and identify future research directions.

The paper also discusses RL methodologies, grouping them into value-based, policy-based, and actor-critic methods. Value-based methods, such as Q-learning and its derivatives (e.g., DQN, double DQN, and dueling DQN), estimate the expected future reward of each action and use those estimates to choose actions. Policy gradient methods, including Proximal Policy Optimization (PPO) and Trust Region Policy Optimization (TRPO), optimize the policy directly to maximize reward. Actor-critic methods combine both ideas: an actor selects actions and a critic evaluates the expected reward. Notable examples are Advantage Actor-Critic (A2C) and Deep Deterministic Policy Gradient (DDPG), the latter designed for continuous action spaces.
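As a minimal illustration of the value-based family described above, the following Python sketch runs tabular Q-learning with epsilon-greedy exploration on a toy offloading problem. The environment (a five-level edge queue, fixed local cost, queue-dependent offloading cost) and all hyperparameters are hypothetical and are not taken from the survey. DQN and its variants replace the table with a neural network but keep the same update target.

```python
import random

N_STATES = 5           # edge-server queue length 0..4 (toy assumption)
LOCAL, OFFLOAD = 0, 1  # the two possible actions


def step(queue_len, action):
    """Hypothetical toy dynamics: returns (next_queue_len, reward)."""
    if action == LOCAL:
        # Local execution costs a fixed 3 time units; the edge queue drains by one.
        return max(queue_len - 1, 0), -3.0
    # Offloading costs 1 time unit plus waiting behind the current queue,
    # and adds one task to the queue.
    return min(queue_len + 1, N_STATES - 1), -(1.0 + queue_len)


def train(episodes=20_000, steps=10, alpha=0.1, gamma=0.9, epsilon=0.3, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = rng.randrange(N_STATES)  # random start so every state is visited
        for _ in range(steps):
            if rng.random() < epsilon:   # explore with a random action
                action = rng.randrange(2)
            else:                        # exploit the current estimates
                action = max((LOCAL, OFFLOAD), key=lambda a: q[state][a])
            nxt, reward = step(state, action)
            # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
            target = reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q


def greedy_policy(q):
    """Best-known action for each queue length."""
    return [max((LOCAL, OFFLOAD), key=lambda a: q[s][a]) for s in range(N_STATES)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

Under these toy costs the optimal policy is to offload only when the edge queue is empty, since waiting behind even one queued task makes offloading less attractive than the fixed local cost once future queue growth is discounted in.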

Discussion

The discussion section outlines the organization of the survey and its key findings on multi-access edge computing (MEC) and recent advances in the field. It begins by describing the structure of the survey, which covers the research questions, the methodology, and a review of the existing MEC literature. The paper emphasizes that MEC extends cloud computing by deploying edge servers closer to users, which enables efficient content caching, computation offloading, and resource management. The integration of Software-Defined Networking (SDN) with MEC and the Internet of Things (IoT) is highlighted as a promising approach to optimizing network management, under a new paradigm termed SDIoT-Edge.

The section then examines resource scheduling challenges within MEC, identifying key issues such as contention among devices, coordination between heterogeneous edge devices and the cloud, and privacy concerns. It discusses several optimization approaches, including cooperative games and Markov Decision Processes (MDPs), and notes that many of the underlying problems are NP-hard. The paper advocates Deep Reinforcement Learning (DRL) methods as an effective way to address these challenges. It also reviews mobile edge caching strategies, the properties of computation offloading, and service orchestration, underscoring the need for new solutions that improve resource utilization and minimize latency in edge environments. The discussion concludes by summarizing the contributions of related surveys and the need for further research on resource management and scheduling within MEC frameworks.
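One common way to cast an offloading decision as an MDP, of the kind mentioned above, is sketched below. The notation is illustrative and assumed for this example; it is not taken from the survey, and individual papers choose different state variables and reward terms.

```latex
% Illustrative notation (assumed, not from the survey):
%   q_t: edge queue length, h_t: channel gain, d_t: task size,
%   T_t: task latency, E_t: device energy, w_T, w_E >= 0: weights.
\begin{align}
  s_t &= (q_t,\, h_t,\, d_t), \qquad
  a_t \in \{0 \text{ (local)},\ 1 \text{ (offload)}\}, \\
  r_t &= -\left( w_T\, T_t + w_E\, E_t \right), \\
  \pi^{*} &= \arg\max_{\pi}\ \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t} r_t \right],
  \qquad 0 \le \gamma < 1, \\
  Q^{*}(s, a) &= \mathbb{E}\!\left[ r_t + \gamma \max_{a'} Q^{*}(s_{t+1}, a')
  \,\middle|\, s_t = s,\ a_t = a \right].
\end{align}
```

The last line is the Bellman optimality equation. Value-based DRL methods approximate $Q^{*}$ with a neural network when the state space is too large to enumerate, which is the usual situation once channel conditions and task sizes are continuous.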