Model-based deep learning enables time-resolved computational microscopy

Journal: PhotoniX, Volume: 7, Issue: 1
DOI: https://doi.org/10.1186/s43074-025-00222-2
Publication Date: 2026-01-07
Author(s): Yunhui Gao et al.
Primary Topic: Digital Holography and Microscopy

Overview

This section outlines advances in computational microscopy, focusing on a model-based deep learning framework for time-resolved imaging in multi-shot acquisition. The framework builds on plug-and-play (PnP) optimization theory: it combines low-level spatiotemporal priors learned from large video datasets with a physical model of an optimized measurement scheme. This combination enables accurate reconstruction of dynamic scenes at high temporal resolution.
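The PnP idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the forward operator is a toy per-pixel binary sampling mask rather than a ptychographic model, `toy_video_denoiser` is a simple moving average standing in for a learned video denoiser, and the function names are hypothetical. It shows only the structure of the iteration, in which a gradient step on the data-fidelity term alternates with a spatiotemporal denoising step.

```python
import numpy as np

def smooth_along(video, axis):
    """3-tap moving average along one axis, replicating edge values."""
    pad = [(0, 0)] * video.ndim
    pad[axis] = (1, 1)
    padded = np.pad(video, pad, mode="edge")
    n = video.shape[axis]
    taps = [np.take(padded, np.arange(k, k + n), axis=axis) for k in range(3)]
    return (taps[0] + taps[1] + taps[2]) / 3.0

def toy_video_denoiser(video):
    """Hypothetical stand-in for a learned video denoiser: smooth over t, y, x."""
    for axis in range(video.ndim):
        video = smooth_along(video, axis)
    return video

def pnp_proximal_gradient(y, forward, adjoint, denoise, step=1.0, n_iter=30):
    """PnP proximal gradient: the denoiser replaces the proximal operator."""
    x = adjoint(y)
    for _ in range(n_iter):
        grad = adjoint(forward(x) - y)   # gradient of 0.5 * ||forward(x) - y||^2
        x = denoise(x - step * grad)     # prior step via the plugged-in denoiser
    return x

# Toy dynamic scene: a Gaussian blob drifting one pixel per frame (8 frames, 32x32).
rng = np.random.default_rng(0)
tt, yy, xx = np.meshgrid(np.arange(8), np.arange(32), np.arange(32), indexing="ij")
truth = np.exp(-((xx - 12 - tt) ** 2 + (yy - 16) ** 2) / (2 * 6.0 ** 2))

# Assumed forward model: each frame is observed through its own random binary mask.
masks = rng.integers(0, 2, size=truth.shape).astype(float)
forward = lambda v: masks * v
adjoint = lambda v: masks * v            # a diagonal mask is its own adjoint
measured = masks * (truth + 0.01 * rng.standard_normal(truth.shape))

recon = pnp_proximal_gradient(measured, forward, adjoint, toy_video_denoiser)

rmse = lambda a: float(np.sqrt(np.mean((a - truth) ** 2)))
err_zero_filled = rmse(measured)
err_pnp = rmse(recon)
print(f"zero-filled RMSE: {err_zero_filled:.3f}, PnP RMSE: {err_pnp:.3f}")
```

Because the denoiser acts jointly over time and space, pixels missing in one frame are filled from neighboring frames as well as neighboring pixels. A learned video denoiser would take the place of the moving average in this loop.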

The authors demonstrate the approach with lensless coded ptychographic microscopy, achieving high-speed holographic imaging that captures sample dynamics an order of magnitude faster than traditional methods while maintaining image quality. The framework also supports high-throughput, label-free imaging of biological activity in freely moving organisms, such as paramecia and rotifers, with a sensor-limited space-bandwidth-time product of 227 megapixels per second. The authors present the method as a general route to time-resolved computational microscopy that can be applied to other imaging modalities.

Introduction

The paper's introduction surveys advances in computational microscopy, including improvements in resolution, imaging modalities, depth penetration, and volumetric imaging. Conventional optical systems are limited by information throughput, so they rely on sequential measurements that reduce temporal resolution, particularly when capturing fast dynamics. Parallel acquisition schemes have been proposed to mitigate this, but they add complexity and cost. Alternatively, multiplexing can compress multi-shot measurements into a single capture, at the expense of image quality.

Recent developments in deep learning have significantly improved computational microscopy by addressing the ill-posed nature of inverse problems, allowing effective image reconstruction from fewer measurements. Even so, deep learning methods are approaching information-theoretic limits, which forces trade-offs between reconstruction accuracy and generalizability. The paper introduces a model-based deep learning framework that uses spatiotemporal data priors to achieve time-resolved imaging in multi-shot computational microscopy. The framework integrates a physical forward model with a deep video denoiser trained on large-scale datasets, enabling cross-domain knowledge transfer. The proposed method substantially improves imaging speed, reaching a temporal resolution of 112 frames per second and an information throughput of 227 megapixels per second, validated by imaging a range of biological activities.
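As a rough consistency check on the two figures quoted above, the short calculation below derives the implied pixel count per reconstructed frame and the interval between frames. It assumes that throughput equals frame rate times pixels per frame, and the quoted values are rounded, so the results are approximate.

```python
frame_rate_fps = 112            # temporal resolution quoted in the text
throughput_px_per_s = 227e6     # information throughput quoted in the text

# Assumption: throughput = frame rate x pixels per reconstructed frame.
pixels_per_frame = throughput_px_per_s / frame_rate_fps
frame_interval_ms = 1e3 / frame_rate_fps

print(f"{pixels_per_frame / 1e6:.2f} megapixels per frame")   # 2.03
print(f"{frame_interval_ms:.2f} ms between frames")           # 8.93
```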

Methods

This section of the paper covers the experimental validation and characterization of a ptychographic imaging system, using reconstructions of a static xylophyta dicotyledonous stem sample. To evaluate reconstruction quality under varying motion conditions, the authors introduce an offset to the sample position during reconstruction, which simulates global translation motion. The proposed ViDNet-based algorithm is compared against state-of-the-art methods such as PnP-FISTA, DRUNet, and FastDVDnet. As the sample translation speed increases, the quality of the 3DTV-based (three-dimensional total variation) reconstruction declines significantly, while the ViDNet-based algorithm maintains superior spatial detail and temporal consistency, improving the peak signal-to-noise ratio (PSNR) by more than 2 dB over the baseline methods. The ViDNet-based algorithm also generalizes across motion speeds without per-case parameter tuning.
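For readers unfamiliar with the metric, the sketch below computes PSNR in its standard form and shows what a 2 dB gain means in terms of mean squared error. The peak value of 1.0 and the test arrays are assumptions. The summary does not state which convention the paper uses, for example the peak value or whether PSNR is evaluated on amplitude or phase.

```python
import numpy as np

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB for arrays scaled to [0, peak]."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

# A uniform error of 0.1 on a unit-peak image gives MSE = 0.01, i.e. 20 dB.
reference = np.zeros((4, 4))
estimate = np.full((4, 4), 0.1)
example_psnr = psnr(reference, estimate)

# A gain of 2 dB corresponds to dividing the MSE by 10**(2/10), about 1.58.
mse_ratio_for_2db = 10 ** (2 / 10)

print(f"{example_psnr:.1f} dB, MSE ratio for +2 dB: {mse_ratio_for_2db:.2f}")
```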

The experimental setup uses a 532 nm solid-state laser for coherent illumination and a sample holder that translates at approximately 2 mm/s. A diffuser provides random binary amplitude modulation, and intensity images are captured with a complementary metal-oxide-semiconductor (CMOS) image sensor. Measurement diversity comes from lateral displacement between the sample and the diffuser. The authors translate the sample, a choice that supports high-quality reconstruction despite potential calibration imperfections. Imaging resolution and phase reconstruction accuracy were validated with a quantitative phase target, confirming that the spatial resolution reaches the Nyquist sampling limit across the tested translation speeds.
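The measurement process described above can be sketched as a simple forward model: a laterally shifted sample field propagates to the diffuser, is multiplied by a binary amplitude mask, and propagates to the sensor, which records intensity. This is an illustrative sketch, not the paper's implementation. The sample-diffuser-sensor ordering, the pixel size, the propagation distances, the integer-pixel circular shift, and all function names are assumptions. Only the 532 nm wavelength and the binary amplitude modulation come from the text.

```python
import numpy as np

def angular_spectrum(field, wavelength, pixel, distance):
    """Propagate a complex field over `distance` with the angular spectrum method."""
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=pixel)
    fy = np.fft.fftfreq(ny, d=pixel)
    fxx, fyy = np.meshgrid(fx, fy)
    arg = 1.0 / wavelength ** 2 - fxx ** 2 - fyy ** 2
    kz = 2 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    transfer = np.exp(1j * kz * distance) * (arg > 0)   # drop evanescent waves
    return np.fft.ifft2(np.fft.fft2(field) * transfer)

def coded_measurement(obj, mask, shift, wavelength, pixel, d_to_mask, d_to_sensor):
    """Intensity at the sensor for one lateral sample position (assumed geometry)."""
    shifted = np.roll(obj, shift, axis=(0, 1))           # integer-pixel circular shift
    at_mask = angular_spectrum(shifted, wavelength, pixel, d_to_mask)
    at_sensor = angular_spectrum(at_mask * mask, wavelength, pixel, d_to_sensor)
    return np.abs(at_sensor) ** 2

rng = np.random.default_rng(0)
n = 64
yy, xx = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
obj = np.exp(1j * 0.5 * np.sin(2 * np.pi * xx / 16) * np.cos(2 * np.pi * yy / 16))
mask = rng.integers(0, 2, size=(n, n)).astype(float)     # random binary amplitude mask

params = dict(wavelength=532e-9,     # from the text
              pixel=1.85e-6,         # assumed pixel size
              d_to_mask=1e-3,        # assumed sample-to-diffuser distance
              d_to_sensor=1e-3)      # assumed diffuser-to-sensor distance

intensity = coded_measurement(obj, mask, shift=(3, -2), **params)
open_intensity = coded_measurement(obj, np.ones((n, n)), shift=(3, -2), **params)
print(intensity.shape, float(intensity.sum()), float(open_intensity.sum()))
```

Each lateral shift produces a different intensity pattern from the same sample, which is the source of measurement diversity. A reconstruction algorithm would invert a stack of such measurements taken at different shifts.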

Results

The results reported for the method are quantitative reconstruction metrics and experimental demonstrations. In the simulated translation-motion test on the static stem sample, the ViDNet-based reconstruction improves PSNR by more than 2 dB over the baseline methods and retains spatial detail and temporal consistency as the translation speed increases, whereas the 3DTV-based reconstruction degrades significantly. Measurements on a quantitative phase target confirm that the spatial resolution reaches the Nyquist sampling limit across the tested translation speeds.

In experiments on living samples, the system performs label-free imaging of freely moving paramecia and rotifers at a temporal resolution of 112 frames per second, corresponding to a sensor-limited throughput of 227 megapixels per second.

Discussion

The study introduces a model-based deep learning framework for recovering dynamic scenes in multi-shot computational microscopy, using spatiotemporal features learned from large video datasets. The proposed lensless holographic imaging system advances time-resolved imaging far enough to observe rapid biological processes, such as food vacuole dynamics in paramecia and rotifers, that occur on millisecond timescales. The deep network, ViDNet, serves as an additive white Gaussian noise (AWGN) denoiser and improves reconstruction quality by capturing low-level spatiotemporal features, while the optimized measurement scheme imposes strong physical constraints during the plug-and-play (PnP) iterations. Together, these address common challenges in deep learning microscopy.

The results indicate that ViDNet outperforms traditional reconstruction methods, particularly for fast and complex motion, as shown by improved temporal consistency and fewer artifacts in dynamic samples. The study also highlights that training on natural intensity video datasets mitigates hallucination and improves generalizability. Future work may refine the network with domain-specific datasets, and explore advanced architectures and joint optimization strategies, to extend the approach to other computational imaging modalities.