GC-Fed: Gradient centralized federated learning with partial client participation

Journal: Information Fusion, Volume: 131
DOI: https://doi.org/10.1016/j.inffus.2026.104148
Publication Date: 2026-01-14
Author(s): Zhenyun Du et al.
Primary Topic: Privacy-Preserving Technologies in Data

Overview

The paper presents Gradient Centralized Federated Learning (GC-Fed), a framework designed to improve Federated Learning (FL) under high data heterogeneity and partial client participation. Traditional drift-mitigation strategies often rely on historical references, such as past gradients or global models, which can destabilize training when only a subset of clients participates. In contrast, GC-Fed uses a hyperplane as a history-independent reference to guide local training, thereby improving inter-client alignment. The framework has two components: Local GC, which centralizes gradients during local training to harmonize client contributions, and Global GC, which centralizes updates during server aggregation to stabilize performance across training rounds.
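
The hyperplane reference can be made concrete with the standard Gradient Centralization operator from the general deep-learning literature. The formulation below is a sketch under that assumption: the symbols ($\mathbf{g}$, $M$, $\mathbf{e}$, $\mathbf{P}$) are illustrative, and the paper's exact notation and layer-wise choices may differ.

```latex
% g: gradient of one weight vector with M entries (assumed notation)
\Phi_{\mathrm{GC}}(\mathbf{g})
  = \mathbf{g} - \frac{1}{M}\Bigl(\sum_{j=1}^{M} g_j\Bigr)\mathbf{1}
  = \mathbf{P}\,\mathbf{g},
\qquad
\mathbf{P} = \mathbf{I} - \mathbf{e}\mathbf{e}^{\top},
\qquad
\mathbf{e} = \tfrac{1}{\sqrt{M}}\,\mathbf{1}

% P is an orthogonal projection, so the centralized gradient has zero mean:
\mathbf{e}^{\top}\,\Phi_{\mathrm{GC}}(\mathbf{g}) = 0
```

Because the hyperplane $\{\mathbf{v} : \mathbf{1}^{\top}\mathbf{v} = 0\}$ is fixed and identical for every client, projecting onto it needs no stored gradients or past global models, which is consistent with the history-independent property described above.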

The findings indicate that GC-Fed mitigates client drift and improves accuracy by up to 20% under heterogeneous, partial-participation conditions. Because the proposed gradient correction is reference-free, it can be applied selectively, layer by layer, which addresses a limitation of existing methods that depend on reference information and struggle under severe partial participation. Overall, GC-Fed shows robust performance in challenging FL environments.

Introduction

The introduction presents Federated Learning (FL) as a privacy-preserving framework for distributed training in which clients independently derive model updates from their local datasets. These updates, which reflect diverse data sources, are sent to a central server for aggregation, typically through averaging or consensus strategies. While FL improves model robustness by drawing on multi-source information, it is hampered by the inherent heterogeneity of client data. When client data are not independent and identically distributed (non-i.i.d.), local training trajectories diverge, a phenomenon known as client drift. This drift complicates optimization and undermines the stability of the global model.
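
The round structure described above (sample some clients, collect their updates, average on the server) can be sketched as follows. This is a minimal illustration assuming plain weighted averaging. The function names `sample_clients` and `weighted_average`, the participation fraction, and the toy update vectors are all hypothetical and not taken from the paper.

```python
import numpy as np

def sample_clients(num_clients, fraction, rng):
    """Pick a random subset of client indices for one round (partial participation)."""
    k = max(1, int(round(fraction * num_clients)))
    return rng.choice(num_clients, size=k, replace=False)

def weighted_average(updates, num_samples):
    """Average client updates, weighting each by its local dataset size."""
    weights = np.asarray(num_samples, dtype=float)
    weights = weights / weights.sum()
    return sum(w * u for w, u in zip(weights, updates))

rng = np.random.default_rng(0)
selected = sample_clients(num_clients=10, fraction=0.3, rng=rng)

# Toy updates: two clients pulling in different directions (a stand-in for drift).
updates = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
aggregated = weighted_average(updates, num_samples=[1, 3])
```

With only a few clients sampled per round, the average can swing toward whichever local distributions happen to be selected, which is one way to see why drift is more harmful under partial participation.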

To mitigate client drift, the paper proposes GC-Fed, which integrates Gradient Centralization (GC) into both the local and the global optimization stages. GC serves as a stable reference for aligning client updates while preserving their core directions. The authors explore two variants of GC, Local GC and Global GC, each with distinct performance and stability characteristics. Their hybrid approach, GC-Fed, combines the benefits of both variants and achieves superior performance and stability even with partial client participation. Extensive experiments support these findings and show that GC-Fed outperforms existing state-of-the-art methods in various settings. The contributions of this work include reducing client variance through shared projections and validating GC-Fed's efficacy across multiple datasets.
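
The two variants can be sketched in a few lines, assuming the common form of GC that subtracts the per-output-unit mean from a weight gradient. The helper names (`centralize`, `local_step`, `server_aggregate`) are hypothetical, and which layers receive which variant in the actual method is not specified in this summary, so the code applies GC to any multi-dimensional parameter purely for illustration.

```python
import numpy as np

def centralize(g):
    """Subtract the mean over all non-leading axes, so each output unit's slice sums to zero."""
    if g.ndim < 2:
        # Assumption: 1-D parameters such as biases are left untouched.
        return g
    axes = tuple(range(1, g.ndim))
    return g - g.mean(axis=axes, keepdims=True)

def local_step(w, grad, lr, use_local_gc=True):
    """One local SGD step; Local GC centralizes the gradient before it is applied."""
    g = centralize(grad) if use_local_gc else grad
    return w - lr * g

def server_aggregate(w_global, client_weights, use_global_gc=True):
    """Average the client deltas; Global GC centralizes the aggregated update."""
    delta = np.mean([w_c - w_global for w_c in client_weights], axis=0)
    if use_global_gc:
        delta = centralize(delta)
    return w_global + delta

# Toy example with a single 2x3 weight matrix.
w0 = np.zeros((2, 3))
grad = np.array([[1.0, 2.0, 3.0], [4.0, 6.0, 8.0]])
w_client = local_step(w0, grad, lr=0.1)
w_next = server_aggregate(w0, [w_client, w_client])
```

Both operators project onto the same fixed zero-mean hyperplane, so neither needs information carried over from earlier rounds.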

Methods

In this section, the authors discuss the challenges and strategies associated with variance reduction in Federated Learning (FL). Moving from full-batch gradient descent to Stochastic Gradient Descent (SGD) and mini-batch SGD introduces gradient variance, which can impede convergence. Techniques such as momentum, adaptive learning rates, and stochastic variance reduction methods (e.g., SAG, SVRG, SAGA) were developed to mitigate it. In FL, however, variance arises not only from local training data but also from aggregating models across diverse clients, which makes variance reduction during global aggregation harder. The authors point out that existing FL-specific variance reduction techniques such as SCAFFOLD, which uses control variates built from local updates and the global model, may introduce inefficiencies because they rely on historical reference points, leading to higher communication costs and storage demands.
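
For context, the SCAFFOLD local step can be written as below. The notation follows the original SCAFFOLD paper rather than the summary above, and the control-variate refresh shown is only one of the options described there.

```latex
% x: server model, y_i: local model of client i,
% c: server control variate, c_i: client control variate,
% g_i: local stochastic gradient, \eta_l: local step size, K: number of local steps.
y_i \leftarrow y_i - \eta_l \bigl( g_i(y_i) - c_i + c \bigr)

% One option for refreshing the client control variate after K local steps:
c_i^{+} \leftarrow c_i - c + \frac{1}{K\,\eta_l}\,(x - y_i)
```

Both $c$ and $c_i$ are carried over from earlier rounds. A client that was last sampled many rounds ago therefore corrects its gradient with a stale $c_i$, and the control variates must be stored and transmitted alongside the model, which is the source of the extra communication and storage cost noted above.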

To address these inefficiencies, the authors propose a reference-free approach based on Gradient Centralization (GC). They detail their experimental setup, which includes evaluations on EMNIST, CIFAR-10, CIFAR-100, and TinyImageNet, using model architectures that range from simple Multi-Layer Perceptrons (MLPs) and Convolutional Neural Networks (CNNs) to more complex networks such as VGG11 and ResNet18. The approach is benchmarked against ten state-of-the-art baselines published up to 2024, and Table 3 gives a comprehensive summary of the experimental resources.

Discussion

The discussion section reviews advances in Federated Learning (FL), focusing on the challenges posed by client data that are not independent and identically distributed (non-i.i.d.). It covers several approaches that aim to mitigate the resulting performance degradation, such as FedProx, which adds a proximal term to the local loss functions, and SCAFFOLD, which corrects local gradients to address client drift. Other methods take different routes: FedOpt applies adaptive optimizers on the server, while FedSAM uses sharpness-aware minimization in local training to improve generalization. The paper also identifies the classifier layer as a significant source of variance in FL models and proposes a targeted application of Gradient Centralization (GC) to improve inter-client alignment during global aggregation.
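
As a point of comparison, the FedProx proximal term mentioned above penalizes the distance between the local model and the current global model. The sketch below assumes the usual form, a penalty of (mu/2) times the squared distance. The function names and the value of `mu` are illustrative only.

```python
import numpy as np

def fedprox_loss(local_loss, w, w_global, mu):
    """Local loss plus the proximal penalty (mu / 2) * ||w - w_global||^2."""
    return local_loss + 0.5 * mu * np.sum((w - w_global) ** 2)

def fedprox_grad(local_grad, w, w_global, mu):
    """Gradient of the penalized loss: the penalty adds mu * (w - w_global)."""
    return local_grad + mu * (w - w_global)

w_global = np.array([0.0, 0.0])
w = np.array([1.0, 2.0])
penalized_loss = fedprox_loss(local_loss=0.0, w=w, w_global=w_global, mu=0.1)
penalized_grad = fedprox_grad(local_grad=np.zeros(2), w=w, w_global=w_global, mu=0.1)
```

Unlike a projection onto a fixed hyperplane, this penalty depends on the latest global model, so it is an example of the reference-based corrections that GC-Fed is designed to avoid.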

The proposed method, GC-Fed, combines local and global GC strategies to stabilize training and accelerate convergence without additional communication overhead. Empirical evaluations show that GC-Fed consistently outperforms baseline algorithms, including FedAvg, across various datasets and model architectures, with significant improvements in accuracy and convergence speed. The results indicate that GC-Fed not only improves model performance but also reduces fluctuations during training, making it a robust alternative for federated environments with heterogeneous data distributions. The theoretical analysis further supports GC-Fed by showing that it can reduce the gap to the optimal solution more effectively than traditional methods.