Adaptive knowledge transferring with switching dual-student framework for semi-supervised medical image segmentation

Journal: Pattern Recognition, Volume: 175
DOI: https://doi.org/10.1016/j.patcog.2026.113115
Publication Date: 2026-01-17
Author(s): Zhenyun Du et al.
Primary Topic: Advanced Neural Network Applications

Overview

The authors address two limitations of teacher-student frameworks for semi-supervised medical image segmentation: the strong correlation between the teacher and student networks, and unreliable knowledge transfer between them. They propose a switching Dual-Student architecture that selects the more reliable of the two students at each iteration, which strengthens collaboration between the students and keeps errors from being reinforced. They also introduce a Loss-Aware Exponential Moving Average (LA-EMA) strategy that lets the teacher network absorb meaningful information from the students, which improves the quality of the pseudo-labels.
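The per-iteration selection step can be pictured with a short sketch. The summary does not state the reliability criterion, so the code assumes, purely for illustration, that the student with the lower loss on the current labeled batch is the more reliable one. The function `select_student` and its arguments are hypothetical names, not the authors' implementation.

```python
def select_student(loss_a: float, loss_b: float) -> str:
    """Return the name of the student assumed to be more reliable.

    Assumption (not from the paper): a lower labeled-batch loss means a
    more reliable student. Ties go to student "A".
    """
    return "A" if loss_a <= loss_b else "B"


# Hypothetical (student A, student B) losses for three iterations.
losses_per_iteration = [(0.42, 0.55), (0.40, 0.31), (0.29, 0.30)]

# The selected student can switch from one iteration to the next.
chosen = [select_student(a, b) for a, b in losses_per_iteration]
print(chosen)
```

Because the choice is re-evaluated at every iteration, neither student is permanently privileged, which is the "switching" behaviour the overview describes.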

The proposed method integrates the LA-EMA and Student Selection modules into the Dual-Student architecture and is evaluated extensively on 3D medical image segmentation datasets. The results show significant improvements over state-of-the-art semi-supervised methods, indicating that the framework improves segmentation accuracy under limited supervision. The approach improves student learning and also lets the teacher acquire more relevant knowledge, which leads to more accurate pseudo-labels and mitigates confirmation bias.

Introduction

The introduction highlights the critical role of segmentation in medical imaging, particularly for computed tomography (CT) and magnetic resonance imaging (MRI), and the labor-intensive pixel-wise annotation that specialists must perform. Because existing methods rely heavily on annotated datasets, the authors focus on Semi-Supervised Medical Image Segmentation (SSMIS), which combines unlabeled images with a limited number of labeled ones. A prominent SSMIS approach is the Mean-Teacher (MT) framework, in which a teacher model generates pseudo-labels for a student model. The authors identify two significant limitations in current MT implementations: the strong correlation between the teacher and student networks, which can propagate biases and errors, and the static Exponential Moving Average (EMA) weights, which do not adapt to the student’s evolving learning dynamics.
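In the standard Mean-Teacher scheme, the teacher's weights are an exponential moving average of the student's weights: teacher <- alpha * teacher + (1 - alpha) * student, with a fixed decay alpha. A minimal sketch follows, with plain Python lists standing in for network parameters and alpha = 0.99 chosen only as an illustrative value.

```python
def ema_update(teacher, student, alpha=0.99):
    """Fixed-decay EMA: teacher <- alpha * teacher + (1 - alpha) * student."""
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


teacher = [0.0, 0.0]   # toy teacher parameters
student = [1.0, -1.0]  # toy student parameters, held constant here

# Three updates move the teacher a small, fixed fraction towards the student.
for _ in range(3):
    teacher = ema_update(teacher, student)
print(teacher)
```

Because alpha is the same at every step, the teacher follows the student at a constant rate regardless of how well the student is currently learning. This is the static behaviour the authors identify as a limitation.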

To overcome these limitations, the authors extend the Mean-Teacher architecture with a Dual-Student framework that promotes knowledge diversity and a Student Selection module that improves the reliability of the teacher’s guidance. They also introduce a Loss-Aware Exponential Moving Average (LA-EMA) strategy that dynamically adjusts the teacher model’s weights based on the student’s loss behavior. Together, these changes improve the quality of the pseudo-labels and make knowledge transfer between the teacher and student networks more effective. The authors validate their contributions through extensive experiments on medical benchmarks, reporting state-of-the-art performance and applicability to general image segmentation tasks.
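The summary does not give the LA-EMA formula, so the following is only a sketch of the general idea of a decay that reacts to the student's loss. It assumes a hypothetical rule in which a falling loss lowers the decay (the teacher absorbs more) and a rising loss raises it (the teacher absorbs less). The names `loss_aware_alpha` and `la_ema_update`, and the values `alpha_min`, `alpha_max` and `k`, are illustrative assumptions.

```python
import math


def loss_aware_alpha(prev_loss, curr_loss, alpha_min=0.9, alpha_max=0.999, k=10.0):
    """Hypothetical loss-aware decay (NOT the paper's formula).

    A sigmoid gate on the loss change maps a falling loss towards
    alpha_min and a rising loss towards alpha_max.
    """
    gate = 1.0 / (1.0 + math.exp(-k * (curr_loss - prev_loss)))  # in (0, 1)
    return alpha_min + (alpha_max - alpha_min) * gate


def la_ema_update(teacher, student, prev_loss, curr_loss):
    """EMA update whose decay depends on the student's recent loss change."""
    alpha = loss_aware_alpha(prev_loss, curr_loss)
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


improving = loss_aware_alpha(prev_loss=0.8, curr_loss=0.5)
worsening = loss_aware_alpha(prev_loss=0.5, curr_loss=0.8)
print(improving, worsening)
```

Under this assumed rule the teacher moves further towards a student whose loss is dropping and stays closer to its previous weights when the student's loss rises.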

Methods

The authors evaluate the proposed Loss-Aware EMA algorithm on a combination of medical and general image datasets. The experiments use two public medical datasets. LA consists of 100 gadolinium-enhanced magnetic resonance imaging (GE-MRI) scans, split into 80 for training and 20 for validation. ACDC comprises 200 annotated short-axis cardiac cine-MR images from 100 patients; the 100 patients are divided into 70 for training, 10 for validation, and 20 for testing.

To further assess robustness, the authors extend the evaluation to general image datasets, specifically CIFAR-10 and Pascal VOC. CIFAR-10 contains 60,000 images across 10 classes; the semi-supervised setup labels 2% of the 50,000 training images (1,000 images) and leaves the remaining 98% (49,000 images) unlabeled. Pascal VOC offers diverse natural scenes with 20 object categories; the setup uses 366 labeled and 1,464 unlabeled images, so 20% of the 1,830 images are labeled. Performance is measured with Intersection over Union (IoU) for both the student (S) and teacher (T) networks, which allows comparison with state-of-the-art methods.
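The labeled ratios quoted above follow directly from the image counts. A small check, where `labeled_fraction` is a helper name introduced here for illustration:

```python
def labeled_fraction(n_labeled: int, n_unlabeled: int) -> float:
    """Share of labeled images in a pool of labeled + unlabeled images."""
    return n_labeled / (n_labeled + n_unlabeled)


cifar_ratio = labeled_fraction(1_000, 49_000)  # CIFAR-10 setup
voc_ratio = labeled_fraction(366, 1_464)       # Pascal VOC setup
print(f"{cifar_ratio:.0%} {voc_ratio:.0%}")
```

Both ratios are taken over the pool of images actually used for training, not over the full dataset.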

Results

The authors compare their method against a range of existing semi-supervised medical segmentation approaches, with the results reported in Tables 1 and 2. Their method demonstrates superior performance in both the 5% and 10% labeled-data scenarios on the LA and ACDC datasets. Notably, in the 10% labeled-data setting it achieves a Dice similarity score of 91.31% and a Jaccard score of 84.07%, surpassing the state-of-the-art method AD-MT by 1.48 and 2.45 percentage points, respectively.
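For reference, the Dice and Jaccard scores compare a predicted mask with a ground-truth mask; Jaccard is the same quantity as IoU. The sketch below uses flat lists of 0/1 values as toy binary masks. Treating two empty masks as a perfect match is a convention assumed here.

```python
def dice_and_jaccard(pred, target):
    """Dice and Jaccard for two equal-length binary masks (flat 0/1 lists)."""
    intersection = sum(p & t for p, t in zip(pred, target))
    pred_size, target_size = sum(pred), sum(target)
    union = pred_size + target_size - intersection
    if union == 0:  # both masks empty: treated as a perfect match (assumption)
        return 1.0, 1.0
    dice = 2.0 * intersection / (pred_size + target_size)
    jaccard = intersection / union
    return dice, jaccard


pred = [1, 1, 1, 0, 0, 0]
target = [0, 1, 1, 1, 0, 0]
dice, jaccard = dice_and_jaccard(pred, target)
print(dice, jaccard)
```

For a single pair of masks the two scores are tied by J = D / (2 - D). Scores averaged over many cases need not satisfy that identity exactly.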

The authors note that although the baseline method BCP (2023) is competitive on the LA dataset, their approach outperforms all prior methods on most metrics, particularly under the 10% labeled-data condition. In the 5% labeled-data setting it falls short of AD-MT on the Average Surface Distance (ASD) metric but achieves the best result on every other evaluated metric, underscoring how effectively it uses both labeled and unlabeled data for medical segmentation.

Discussion

The discussion reviews advances in semi-supervised medical image segmentation, focusing on Consistency Regularization, Pseudo-Labeling, and Co-training. The authors reference architectures such as MC-Net and SS-Net, which use dual-decoder frameworks and adversarial noise to improve prediction consistency and feature generalization. Despite these innovations, confirmation bias persists: incorrect pseudo-labels can be reinforced during training and reduce its effectiveness. The Mean-Teacher framework has emerged as a leading approach in this domain, generating pseudo-labels and achieving high performance through mechanisms such as exponential moving averages (EMA). The authors propose an adaptive EMA formula that dynamically adjusts knowledge transfer from the student to the teacher, addressing the limitations of fixed-weight approaches.
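Pseudo-labeling, and the way it can feed confirmation bias, is easier to see in code. The sketch below turns a teacher's per-pixel class probabilities into hard labels and masks out low-confidence pixels. Confidence thresholding is a common practice assumed here for illustration, not a step the summary attributes to the authors, and `threshold` and `ignore_index` are hypothetical parameters.

```python
def pseudo_labels(probs, threshold=0.8, ignore_index=-1):
    """Turn per-pixel class probabilities into hard pseudo-labels.

    probs: list of per-pixel probability lists.
    Pixels whose top probability is below `threshold` receive
    `ignore_index` so they can be excluded from the unsupervised loss.
    """
    labels = []
    for p in probs:
        best = max(range(len(p)), key=p.__getitem__)  # argmax over classes
        labels.append(best if p[best] >= threshold else ignore_index)
    return labels


teacher_probs = [[0.95, 0.05], [0.55, 0.45], [0.10, 0.90]]
labels = pseudo_labels(teacher_probs)
print(labels)
```

Any confidently wrong pixel still passes the threshold and is then treated as ground truth by the student, which is how incorrect pseudo-labels end up reinforced.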

The authors then present their method, which combines a dual-student architecture with Loss-Aware EMA (LA-EMA) to improve pseudo-label quality and segmentation accuracy. They describe a Cross-Sample CutMix augmentation strategy that diversifies the training samples and a Student Selection module that ensures only the most reliable student contributes to the teacher’s learning process. Their theoretical analysis supports the effectiveness of the approach in reducing noise variance and improving generalization bounds. Overall, the framework delivers significant improvements in segmentation performance across multiple benchmarks while addressing practical limitations of semi-supervised learning in medical imaging.
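The summary does not spell out the Cross-Sample CutMix procedure, so the sketch below shows only the generic CutMix operation it builds on: pasting a rectangular patch from one sample into another. It uses 2D lists as toy images, and the box coordinates are assumed to lie inside the image. The function name `cutmix` and its parameters are illustrative.

```python
def cutmix(base, donor, top, left, height, width):
    """Paste a rectangular patch of `donor` into a copy of `base`.

    Both inputs are 2D lists of the same shape, and the box must fit
    inside them. Applying the same box to an image and to its label map
    keeps the two aligned.
    """
    mixed = [row[:] for row in base]  # copy so the inputs stay untouched
    for r in range(top, top + height):
        mixed[r][left:left + width] = donor[r][left:left + width]
    return mixed


image_a = [[0] * 4 for _ in range(4)]
image_b = [[1] * 4 for _ in range(4)]
mixed = cutmix(image_a, image_b, top=1, left=1, height=2, width=2)
for row in mixed:
    print(row)
```

In a segmentation setting the same box would be applied to the corresponding labels or pseudo-labels so that the mixed image and its target stay consistent.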