A survey on deep learning for polyp segmentation: techniques, challenges and future trends

Journal: Visual Intelligence, Volume: 3, Issue: 1
DOI: https://doi.org/10.1007/s44267-024-00071-w
Publication Date: 2025-01-03
Author(s): Jiaxin Mei et al.
Primary Topic: Radiomics and Machine Learning in Medical Imaging

Overview

This section outlines the critical role of polyp segmentation in the early detection and treatment of colorectal cancer (CRC). Traditional methods relied on manually extracted features such as color, texture, and shape; they struggled to capture global context and lacked robustness in complex scenarios. The emergence of deep learning has substantially advanced medical image segmentation, which motivates a comprehensive review of both traditional and deep learning-based polyp segmentation methods. The paper also discusses the field's benchmark datasets and evaluates recent deep learning models by polyp size, accounting for differences in research focus and network structure.

In the conclusion, the authors state that this work is the first comprehensive review of the development and evaluation of polyp segmentation. They categorize existing models into traditional and deep learning approaches, describe the popular datasets in detail, and evaluate 26 representative deep learning models by polyp size. The paper also identifies open challenges and suggests future research directions, with the aim of stimulating interest in and understanding of polyp segmentation.
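
A size-stratified evaluation of this kind can be sketched as follows. This is a minimal illustration, not the authors' protocol: it assumes Dice and IoU as the overlap metrics, and the function names and the area thresholds that separate small, medium, and large polyps are hypothetical values chosen only for the example.

```python
def dice_and_iou(pred, truth):
    """Return (Dice, IoU) for two binary masks given as nested lists of 0/1."""
    inter = pred_area = truth_area = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += p & t
            pred_area += p
            truth_area += t
    union = pred_area + truth_area - inter
    if union == 0:
        # Both masks are empty: treat this as a perfect match.
        return 1.0, 1.0
    return 2 * inter / (pred_area + truth_area), inter / union


def size_bucket(truth, small_max=0.06, large_min=0.20):
    """Bucket a polyp by the fraction of the frame its ground-truth mask covers.

    The thresholds are hypothetical; the summary does not state the ones
    used in the paper.
    """
    pixels = sum(len(row) for row in truth)
    ratio = sum(sum(row) for row in truth) / pixels
    if ratio < small_max:
        return "small"
    if ratio >= large_min:
        return "large"
    return "medium"


def mean_dice_by_size(pairs):
    """Average the Dice score of (prediction, ground truth) pairs per size bucket."""
    buckets = {}
    for pred, truth in pairs:
        dice, _ = dice_and_iou(pred, truth)
        buckets.setdefault(size_bucket(truth), []).append(dice)
    return {name: sum(scores) / len(scores) for name, scores in buckets.items()}
```

Bucketing by the ground-truth mask rather than the prediction keeps the grouping independent of the model being evaluated, so different models are compared on the same subsets.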

Introduction

The introduction discusses the critical role of polyp segmentation in helping clinicians accurately identify and delineate polyp regions within the colon, which is essential for the early diagnosis and treatment of colorectal cancer (CRC). Segmenting polyps is challenging because they vary in size and shape and adhere strongly to surrounding tissue. Early methods relied on manually extracted features, such as shape and color information, which proved inadequate in complex scenarios. Recent advances in deep learning have significantly improved polyp segmentation, particularly through encoder-decoder architectures and models such as the Segment Anything Model (SAM), which leverage global contextual information.

The paper aims to provide a comprehensive review of both traditional and deep learning-based polyp segmentation methods, categorizing deep learning approaches into boundary-aware, attention-aware, and feature fusion models. It highlights the advantages and performance characteristics of these methods and addresses the challenge of cross-domain segmentation, where models struggle with datasets acquired on different imaging devices. The introduction emphasizes the need for effective cross-domain solutions, such as the mutual-prototype adaptation network proposed by Yang et al., to improve segmentation performance on unseen datasets. It closes by outlining the paper's structure: subsequent sections review existing models, datasets, performance assessments, and future trends in polyp segmentation.

Methods

The methods section discusses approaches to polyp segmentation built on convolutional neural networks (CNNs), Transformers, and hybrid models that combine both architectures. CNN-based methods, such as AC-SNet, EU-Net, MSNet, and PEFNet, extend the traditional U-Net architecture with components such as local context extraction modules, adaptive global context modules, and multi-scale feature extraction. These changes aim to improve the accuracy and generalizability of polyp segmentation by integrating and selecting features with attention-based strategies.
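
To make attention-based feature selection concrete, the sketch below gates each feature channel by a weight derived from its global average. It is a simplified, parameter-free illustration in the spirit of channel attention, not the module of any model named above: real modules normally pass the pooled values through learned layers, which are omitted here, and the function name is hypothetical.

```python
import math


def channel_attention(feature_maps):
    """Rescale each channel by a sigmoid gate computed from its global average.

    feature_maps is a list of channels, each a nested list of floats.
    Returns the gated channels and the gate value used for each channel.
    """
    gated, gates = [], []
    for channel in feature_maps:
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)       # global average pooling
        gate = 1.0 / (1.0 + math.exp(-mean))   # squash to (0, 1); no learned weights
        gates.append(gate)
        gated.append([[v * gate for v in row] for row in channel])
    return gated, gates
```

Channels with a stronger average response receive a gate closer to 1 and pass through almost unchanged, while weakly responding channels are attenuated.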

In contrast, Transformer-based methods address the limited ability of CNNs to capture long-range dependencies. Models such as MSRAformer, DuAT, SSFormer, ColonFormer, and TransNetR use Transformer architectures to extract multi-scale features and improve segmentation performance. They apply strategies such as pyramid structures and dual-aggregation techniques to combine global and local spatial features.
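
The long-range behaviour comes from self-attention, in which every position is compared with every other position regardless of distance. The following is a minimal single-head sketch of scaled dot-product self-attention; as a simplifying assumption it uses the input tokens directly as queries, keys, and values, whereas real Transformer layers apply learned projections first.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention with identity projections.

    tokens is a list of equal-length feature vectors, one per image patch.
    Returns the attended outputs and the attention weight matrix.
    """
    dim = len(tokens[0])
    outputs, weights = [], []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        row = softmax(scores)
        weights.append(row)
        # Each output is a weighted sum over all tokens, near or far.
        outputs.append(
            [sum(w * tok[d] for w, tok in zip(row, tokens)) for d in range(dim)]
        )
    return outputs, weights
```

Because the weights depend only on feature similarity, a distant patch that resembles the query contributes as much as the query's own position, which a small convolution kernel cannot achieve in a single layer.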

Hybrid methods, such as TransFuse, LAPFormer, PPFormer, HSNet, and Fu-TransHNet, combine the advantages of CNNs and Transformers. They employ parallel branches, hierarchical structures, and feature fusion mechanisms to capture both local context and long-range dependencies, which improves segmentation results. Overall, combining these methodologies reflects a trend toward more sophisticated approaches to polyp segmentation.
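
One simple way to fuse a fine-resolution local branch with a coarse global branch is to upsample the coarse map and blend the two. The sketch below assumes nearest-neighbour upsampling and a fixed blending weight; both choices, and the function names, are illustrative assumptions rather than the fusion scheme of any model listed above, which typically learn how to combine the branches.

```python
def upsample_nearest(feature, factor):
    """Enlarge a 2D map by repeating every value factor times along each axis."""
    out = []
    for row in feature:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def fuse(local_map, global_map, local_weight=0.5):
    """Blend a fine local map with a coarser global map.

    Assumes both maps are square-scaled versions of each other, so that the
    local height is an integer multiple of the global height.
    """
    factor = len(local_map) // len(global_map)
    upsampled = upsample_nearest(global_map, factor)
    return [
        [local_weight * l + (1.0 - local_weight) * g for l, g in zip(l_row, g_row)]
        for l_row, g_row in zip(local_map, upsampled)
    ]
```

For example, fusing the local map `[[1, 2], [3, 4]]` with the single-value global map `[[10]]` at the default weight gives `[[5.5, 6.0], [6.5, 7.0]]`: the fine detail is preserved while every pixel is shifted by the shared global context.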

Discussion

The discussion traces the evolution of polyp segmentation models from traditional methods that rely on hand-engineered features to deep learning approaches that extract features automatically. Traditional models, such as those proposed by Yao et al. and Lu et al., used color, texture, and shape information but were limited in generalization and scalability. In contrast, deep learning models, including boundary-aware and attention-aware architectures, have demonstrated superior performance by capturing complex data representations and by using edge information and attention mechanisms to improve segmentation accuracy.

Boundary-aware models, such as FeDNet and BDG-Net, use edge information to improve segmentation precision, while attention-aware models, such as CASCADE and TGANet, prioritize the most relevant features to boost performance. Feature fusion strategies, as seen in models such as CFA-Net and EFB-Seg, improve model efficacy by integrating multi-level features. The discussion also notes a shift toward video-based segmentation methods, which use temporal information to improve segmentation accuracy in clinical settings. Overall, the paper underscores the importance of advanced architectures and methodologies for addressing the challenges of polyp segmentation and for guiding future research in this area of medical imaging.
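
Boundary-aware training generally needs an edge target in addition to the region mask. A common way to obtain one, sketched here as an assumption rather than the procedure of FeDNet or BDG-Net, is to mark every foreground pixel that touches background or the image border in its 4-neighbourhood. The function name is hypothetical.

```python
def boundary_map(mask):
    """Return a binary edge map for a binary mask given as nested lists of 0/1.

    A foreground pixel is an edge pixel if any of its four neighbours is
    background or lies outside the image.
    """
    height, width = len(mask), len(mask[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                outside = not (0 <= ny < height and 0 <= nx < width)
                if outside or not mask[ny][nx]:
                    edges[y][x] = 1
                    break
    return edges
```

Applied to a filled 3x3 square inside a 5x5 frame, this keeps the eight outer pixels of the square and clears its centre, giving a one-pixel-wide contour that can serve as an auxiliary supervision target.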