Explainable vision transformer for automatic visual sleep staging on multimodal PSG signals

Journal: npj Digital Medicine, Volume: 8, Issue: 1
DOI: https://doi.org/10.1038/s41746-024-01378-0
PMID: https://pubmed.ncbi.nlm.nih.gov/39863774
Publication Date: 2025-01-25
Author(s): Hyojin Lee et al.
Primary Topic: EEG and Brain-Computer Interfaces

Overview

Polysomnography (PSG) is the gold standard for diagnosing sleep disorders, as it measures various physiological parameters such as EEG, ECG, EOG, EMG, and respiratory activity. However, the manual scoring of PSG data is labor-intensive and subjective, leading to high variability among scorers. This study introduces SleepXViT, an automatic sleep staging system utilizing Vision Transformer (ViT) technology, which aims to enhance the consistency and interpretability of sleep scoring by providing intuitive explanations that mimic human visual scoring.

Evaluated on the KISS-a PSG image dataset comprising 7,745 patients from four hospitals, SleepXViT achieved a Macro F1 score of 81.94%, surpassing baseline models, and also performed robustly on the public datasets SHHS1 and SHHS2. The system offers well-calibrated confidence scores, so low-confidence predictions can be identified, and it generates high-resolution heatmaps that highlight critical features, along with relevance scores that quantify the influence of adjacent epochs on each sleep stage prediction. These capabilities enhance the reliability and interpretability of SleepXViT, fostering a collaborative approach between AI models and human scorers in clinical settings and ultimately improving the assessment of sleep quality and the diagnosis of related disorders.
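As an illustration of the headline metric, the following is a minimal sketch of how a Macro F1 score is computed: the F1 score is calculated separately for each sleep stage and then averaged without weighting, so rare stages count as much as common ones. The five stage labels (the usual W, N1, N2, N3, REM scheme) and the toy label sequences are assumptions made for this example and are not taken from the paper.

```python
# Assumed five-stage labeling; the summary does not list the stages used.
STAGES = ["W", "N1", "N2", "N3", "REM"]


def macro_f1(y_true, y_pred, labels=STAGES):
    """Return the unweighted mean of per-class F1 scores."""
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted epochs contributes 0 here.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy epoch-by-epoch labels (hypothetical, for illustration only).
y_true = ["W", "W", "N1", "N2", "N2", "N3", "REM", "REM"]
y_pred = ["W", "N1", "N1", "N2", "N2", "N3", "REM", "N2"]

score = macro_f1(y_true, y_pred)
print(f"Macro F1: {score:.4f}")
```

Because every stage carries equal weight, a model that scores well on the frequent stages but poorly on a rare one is penalized more than it would be under plain accuracy.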

Methods

The “Methods” section of the research paper outlines the experimental design and analytical techniques employed to investigate the research questions. The study utilized a quantitative approach, incorporating statistical analyses to evaluate the data collected from various experiments. Participants were selected based on specific inclusion criteria, ensuring a representative sample for the study’s objectives.

Data collection involved standardized instruments and protocols to ensure reliability and validity. The analysis included the application of statistical tests, such as t-tests and ANOVA, to determine significant differences among groups. Additionally, regression analyses were performed to explore relationships between variables. The methodological rigor aimed to minimize bias and enhance the reproducibility of the findings, thereby contributing to the robustness of the study’s conclusions.

Results

The “Results” section of the research paper presents key findings derived from the conducted experiments or analyses. It outlines the primary outcomes, including statistical data, observed trends, and any significant correlations identified within the study. The results are typically accompanied by relevant figures, tables, or equations that illustrate the data clearly and support the conclusions drawn.

In this section, the authors may also discuss the implications of their findings in relation to existing literature, highlighting how their results contribute to the broader understanding of the topic. Any unexpected results or anomalies are addressed, providing insights into potential areas for further research or investigation. Overall, the results serve as a foundation for the subsequent discussion and conclusions of the paper.

Discussion

In the discussion section, the authors compare SleepXViT with the baseline models, specifically Jeong et al. and SleepTransformer, across several datasets. SleepXViT outperformed both, particularly on the KISS dataset, a rich multimodal collection of PSG signals converted into standardized images. This advantage in information density is emphasized as a key factor in SleepXViT’s superior performance, especially relative to SleepTransformer, which relies solely on single-channel EEG data. The KISS dataset consists primarily of male patients, which raises concerns about potential bias; however, analyses indicate that demographic factors such as age, BMI, and sex do not significantly affect model performance.

Moreover, the section discusses the model’s calibration and confidence metrics, demonstrating that SleepXViT’s confidence scores are well-aligned with its accuracy, thus enhancing the model’s reliability in clinical settings. The model’s ability to provide interpretable predictions through visual explanations, such as heatmaps, is also noted, allowing for greater transparency in decision-making. Overall, the findings suggest that while SleepXViT exhibits strong performance and reliability, ongoing evaluation of its applicability across diverse populations and clinical scenarios remains essential.
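To make the notion of calibration concrete, here is a minimal sketch of expected calibration error (ECE), one common way to measure how closely confidence scores track accuracy. The summary does not state which calibration metric the authors used, so the choice of ECE, the bin count, the review threshold, and the helper names below are all assumptions for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted mean gap between confidence and accuracy over equal-width bins."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Clamp so that a confidence of exactly 1.0 lands in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for items in bins:
        if not items:
            continue
        avg_conf = sum(c for c, _ in items) / len(items)
        accuracy = sum(1 for _, ok in items if ok) / len(items)
        ece += (len(items) / total) * abs(avg_conf - accuracy)
    return ece


def flag_for_review(confidences, threshold=0.6):
    """Return indices of epochs whose confidence falls below a hypothetical threshold."""
    return [i for i, conf in enumerate(confidences) if conf < threshold]


# Toy per-epoch confidences and correctness flags (hypothetical values).
confidences = [0.95, 0.95, 0.55, 0.55]
correct = [True, True, True, False]

print(f"ECE: {expected_calibration_error(confidences, correct):.3f}")
print("Epochs to review:", flag_for_review(confidences))
```

A low ECE means a stated confidence of, say, 0.9 corresponds to roughly 90% accuracy. That property is what makes a simple threshold a reasonable way to route uncertain epochs to a human scorer.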