Multimodal medical image analysis using deep learning registration and LWT-SVD fusion

Journal: Discover Computing, Volume: 29, Issue: 1
DOI: https://doi.org/10.1007/s10791-025-09857-y
Publication Date: 2026-01-04
Author(s): Paluck Arora et al.
Primary Topic: Brain Tumor Detection and Classification

Overview

The paper discusses advances in medical imaging, focusing on computed tomography (CT) and magnetic resonance imaging (MRI), both of which are essential for diagnosing a wide range of conditions. It highlights medical image registration and fusion as tools that help healthcare professionals make informed treatment decisions based on each patient's condition.

The paper introduces a two-tier approach to medical image registration and fusion that pairs deep-learning-based registration with fusion based on the Lifting Wavelet Transform and Singular Value Decomposition (LWT-SVD). The method targets the limitations of traditional techniques such as the Discrete Wavelet Transform (DWT), Stationary Wavelet Transform (SWT), and Discrete Cosine Transform (DCT), which often lose high-frequency detail and handle complex textures poorly. The proposed pipeline applies preprocessing, uses VGG-19 for feature extraction, and uses a thin-plate spline (TPS) model for precise alignment. The LWT decomposes each image into frequency bands, so that low-frequency components can be fused effectively while high-frequency details are preserved. Validation on multimodal datasets shows significant improvements in image quality metrics, supporting the method's reliability and potential clinical utility.
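To make the band-wise fusion idea concrete, here is a minimal sketch of a one-level lifting transform applied to a single row of pixel values. It is not the authors' implementation: the summary does not name the wavelet or the fusion rules, so the Haar lifting steps, the averaging rule for the low-frequency band, and the max-magnitude rule for the high-frequency band are all assumptions, and the SVD stage of LWT-SVD is left out entirely. The function names are hypothetical.

```python
def lwt_forward(signal):
    """One-level Haar lifting (split, predict, update) on an even-length row."""
    even, odd = signal[0::2], signal[1::2]
    # Predict: each odd sample is predicted by its even neighbour;
    # the prediction error is the high-frequency (detail) band.
    detail = [o - e for e, o in zip(even, odd)]
    # Update: correct the even samples so they hold the pair mean;
    # this is the low-frequency (approximation) band.
    approx = [e + d / 2 for e, d in zip(even, detail)]
    return approx, detail


def lwt_inverse(approx, detail):
    """Undo the update and predict steps, then interleave the samples."""
    even = [a - d / 2 for a, d in zip(approx, detail)]
    odd = [d + e for d, e in zip(detail, even)]
    out = []
    for e, o in zip(even, odd):
        out.extend([e, o])
    return out


def fuse_rows(row_a, row_b):
    """Fuse two aligned rows band by band (assumed rules, not the paper's)."""
    approx_a, detail_a = lwt_forward(row_a)
    approx_b, detail_b = lwt_forward(row_b)
    # Low-frequency band: average the two sources.
    approx = [(a + b) / 2 for a, b in zip(approx_a, approx_b)]
    # High-frequency band: keep the coefficient with the larger magnitude.
    detail = [a if abs(a) >= abs(b) else b for a, b in zip(detail_a, detail_b)]
    return lwt_inverse(approx, detail)


ct_row = [2, 4, 6, 10]    # made-up intensities, for illustration only
mri_row = [1, 1, 5, 9]
print(fuse_rows(ct_row, mri_row))
```

A full 2D version would apply the same steps along rows and then columns, which yields one low-frequency sub-band and three high-frequency sub-bands per level.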

Methods

The proposed methodology, termed RegFusion, is a two-tier framework for aligning and fusing multimodal medical images. Tier 1 handles registration: the VGG-19 model extracts deep semantic features from both the source and target images to ensure reliable alignment. Tier 2 handles fusion: it builds on the registration result by integrating enriched diagnostic information from the aligned images, producing a fused image with improved clarity and feature preservation that supports more accurate disease diagnosis and treatment planning.
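The thin-plate spline (TPS) model named in the overview can be illustrated with a small, dependency-free sketch. It assumes that matched control points are already available; the summary does not describe how RegFusion derives correspondences from VGG-19 features, so the points below are hand-picked and purely hypothetical, as are the function names. The sketch fits one spline per output coordinate and then maps arbitrary points through it.

```python
import math


def tps_kernel(r2):
    """TPS radial basis U(r) = r^2 * log(r), evaluated from the squared distance."""
    return 0.0 if r2 == 0.0 else 0.5 * r2 * math.log(r2)


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_tps(src, dst):
    """Fit a 2D TPS mapping src control points onto dst (one spline per axis)."""
    n = len(src)
    a = [[0.0] * (n + 3) for _ in range(n + 3)]
    for i, (xi, yi) in enumerate(src):
        for j, (xj, yj) in enumerate(src):
            a[i][j] = tps_kernel((xi - xj) ** 2 + (yi - yj) ** 2)
        # Affine part [1, x, y] and its transpose (side conditions on the weights).
        a[i][n], a[i][n + 1], a[i][n + 2] = 1.0, xi, yi
        a[n][i], a[n + 1][i], a[n + 2][i] = 1.0, xi, yi
    return [solve(a, [p[axis] for p in dst] + [0.0, 0.0, 0.0]) for axis in (0, 1)]


def warp_point(src, params, x, y):
    """Map (x, y) through the fitted spline."""
    n = len(src)
    out = []
    for p in params:
        value = p[n] + p[n + 1] * x + p[n + 2] * y
        for w, (xi, yi) in zip(p, src):
            value += w * tps_kernel((x - xi) ** 2 + (y - yi) ** 2)
        out.append(value)
    return tuple(out)


# Hypothetical landmarks: four fixed corners and a centre point nudged to the right.
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
dst = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.6, 0.5)]
params = fit_tps(src, dst)
print(warp_point(src, params, 0.5, 0.5))
```

In a registration setting the fitted mapping would be evaluated at every pixel coordinate to resample the moving image; that resampling step is omitted here.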

The experimental analysis evaluates RegFusion on several datasets, including standard benchmark and real-world examples, using Root Mean Square Error (RMSE), Mean Squared Error (MSE), Structural Similarity Index (SSIM), Peak Signal-to-Noise Ratio (PSNR), Cross-Correlation (CC), and Mutual Information (MI) as performance metrics. The approach shows significant improvements in registration accuracy, with lower RMSE and MSE and higher SSIM and PSNR values than state-of-the-art techniques. Notably, the average processing time per image pair ranges from 1.03 to 3.93 seconds, significantly faster than traditional methods (5 to 10 seconds), which makes RegFusion a promising candidate for clinical applications where time efficiency is critical. Validation by a neurosurgical specialist confirmed that the resized images (standardized to 224 × 224 pixels) retain essential diagnostic details, ensuring compatibility with pre-trained convolutional neural networks while optimizing computational resources.
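For reference, most of the metrics listed above follow directly from their textbook definitions. The sketch below computes MSE, RMSE, PSNR, CC, and MI on images flattened to lists of 8-bit intensities. It makes several assumptions that the summary does not confirm: CC is taken to be the zero-mean normalized (Pearson) correlation, MI is estimated from raw intensity counts without binning and reported in bits, and the peak value for PSNR is 255. SSIM is omitted because it needs windowed local statistics. The sample pixel values are invented for illustration.

```python
import math
from collections import Counter


def mse(a, b):
    """Mean squared error between two equally sized pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def rmse(a, b):
    """Root mean square error."""
    return math.sqrt(mse(a, b))


def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(a, b)
    return math.inf if err == 0 else 10.0 * math.log10(peak ** 2 / err)


def cross_correlation(a, b):
    """Zero-mean normalized cross-correlation (assumed meaning of CC)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def mutual_information(a, b):
    """Mutual information in bits, from the joint histogram of intensities."""
    n = len(a)
    joint, count_a, count_b = Counter(zip(a, b)), Counter(a), Counter(b)
    return sum(
        (c / n) * math.log2(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


reference = [0, 0, 255, 255]     # made-up 2 x 2 image, flattened
registered = [0, 0, 255, 251]
print(mse(reference, registered), rmse(reference, registered))
print(psnr(reference, registered))
print(cross_correlation(reference, registered))
print(mutual_information(reference, reference))
```

Lower MSE and RMSE and higher PSNR, CC, and MI all indicate closer agreement between the two images, which matches the direction of improvement reported above.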

Discussion

The discussion critically evaluates existing fusion and registration techniques in multimodal medical imaging, highlighting limitations such as lack of shift invariance, loss of high-frequency details, and inadequate handling of complex textures. These shortcomings make it harder to preserve fine details and structural integrity in fused medical images. To address them, the authors propose RegFusion, which integrates optimized registration and fusion processes using VGG19-TPSI for improved feature extraction and alignment precision. The method shows superior performance on both standard and clinical datasets, with improvements in accuracy, robustness, and computational efficiency.
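The lack of shift invariance mentioned above can be seen in a few lines. The sketch below is an illustration only, using an assumed unnormalized one-level Haar transform with downsampling and a made-up 1D signal. It computes the detail coefficients of a short step pattern and of the same pattern shifted by one sample; the two sets of coefficients differ even though the underlying structure is identical.

```python
def haar_details(signal):
    """Detail coefficients of a one-level decimated Haar transform (unnormalized)."""
    return [signal[i + 1] - signal[i] for i in range(0, len(signal) - 1, 2)]


def energy(coeffs):
    """Sum of squared coefficients."""
    return sum(c * c for c in coeffs)


edge = [0, 0, 1, 1, 0, 0, 0, 0]    # a small bright region
shifted = [0] + edge[:-1]          # the same region moved right by one sample

print(haar_details(edge), energy(haar_details(edge)))
print(haar_details(shifted), energy(haar_details(shifted)))
```

When the pattern lines up with the sample pairs, the detail band is empty; after a one-sample shift the same edges appear in the detail band. A fusion rule that selects coefficients by magnitude would therefore treat the two cases differently, which is one reason residual misalignment between modalities matters for wavelet-based fusion.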

The discussion also reviews significant contributions from previous studies, including methods that combine SIFT with DCT or DWT for robust feature extraction and methods that integrate convolutional neural networks (CNNs) for improved feature localization and outlier detection. Advances in deep learning, such as the use of AlexNet and VGG-16, have further refined feature extraction and matching. However, many existing methods remain limited by computational complexity and by difficulty generalizing across diverse datasets. The proposed RegFusion approach bridges the gap between registration and fusion, leveraging their complementary strengths to enhance the quality and diagnostic value of medical images and addressing the need for integrated methodologies in medical imaging.