Transformer-based intelligent detection model for early dental caries in panoramic radiographs

Journal: Scientific Reports, Volume: 16, Issue: 1
DOI: https://doi.org/10.1038/s41598-025-33391-y
PMID: https://pubmed.ncbi.nlm.nih.gov/41577883
Publication Date: 2026-01-23
Author(s): L. Wang et al.
Primary Topic: Dental Radiography and Imaging

Overview

This research presents a Transformer-based detection model for the early identification of dental caries in panoramic radiographs, where subtle radiographic features and complex anatomical structures make diagnosis difficult. The model combines multi-scale feature fusion, spatially aware attention optimization, and an enhanced two-dimensional positional encoding to capture global contextual relationships while preserving fine-grained feature discrimination. A dataset of 3,856 panoramic radiographs with 12,847 annotated carious lesions across severity grades D1-D4 was used for training and validation, yielding a mean average precision (mAP) of 87.3%. Sensitivity reached 81.3% for D1 lesions and 84.7% for D2 lesions, outperforming traditional CNN-based methods and average dentist performance, with real-time inference at 70 milliseconds per image.
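The sensitivity and latency figures can be read through their usual definitions. The following is a minimal sketch, assuming per-grade sensitivity is computed as TP / (TP + FN) and that throughput is simply the inverse of per-image latency. The function names and the lesion counts are hypothetical and chosen only for illustration; they are not data from the study.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of annotated lesions that were detected: TP / (TP + FN)."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no annotated lesions for this grade")
    return true_positives / total


def images_per_second(latency_ms):
    """Convert a per-image latency in milliseconds to throughput."""
    return 1000.0 / latency_ms


# Hypothetical (TP, FN) counts for illustration only, not taken from the paper.
example_counts = {"D1": (813, 187), "D2": (847, 153)}
for grade, (tp, fn) in example_counts.items():
    print(f"{grade}: sensitivity = {sensitivity(tp, fn):.1%}")

# 70 ms per image is the latency reported in the summary.
print(f"throughput = {images_per_second(70):.1f} images/s")
```

At 70 ms per image, a single sequential pipeline would process roughly 14 radiographs per second, which is the sense in which the summary describes the model as real-time.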

While the study highlights the model’s efficacy as a decision support tool for enhancing diagnostic accuracy and screening efficiency in dental practice, it also acknowledges several limitations. These include the need for improved performance in regions obscured by metallic artifacts, the dataset’s geographic limitations affecting generalizability, and the exclusive focus on panoramic radiographs, which restricts applicability to other imaging modalities. Future research directions are proposed, including the development of multi-modal fusion approaches that integrate various radiographic data, the incorporation of temporal analysis for monitoring lesion progression, and the exploration of explainable artificial intelligence techniques to enhance interpretability of model predictions.

Introduction

The introduction highlights the global prevalence of dental caries, affecting approximately 2.3 billion individuals with untreated carious lesions in permanent teeth. Early detection and intervention are emphasized as critical for preventing disease progression and preserving tooth structure. Panoramic radiography is identified as a valuable diagnostic tool in dental practice, offering comprehensive visualization of the dentition and surrounding structures. However, the reliance on dental practitioners’ expertise for interpreting these radiographs can lead to variability in diagnostic accuracy and the potential oversight of early-stage lesions.

Recent advancements in artificial intelligence, particularly through deep learning algorithms, have shown promise in enhancing diagnostic performance for dental caries detection. Convolutional Neural Networks (CNNs) have been effectively utilized, achieving results comparable to experienced dentists. However, CNNs face limitations in capturing long-range dependencies due to their local receptive field characteristics. In contrast, the Transformer architecture, initially developed for natural language processing, has demonstrated superior capabilities in modeling global relationships and spatial dependencies in computer vision tasks. Despite bitewing radiographs being the gold standard for detecting proximal caries, panoramic radiography’s comprehensive visualization makes it a suitable primary imaging modality for AI-assisted screening, particularly in population-based programs and for patients with limited access to comprehensive dental care.
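The contrast between local receptive fields and global modeling can be illustrated with a toy example. The sketch below is a generic single-head scaled dot-product self-attention in plain Python, with identity projections (queries, keys, and values all equal the input); it is not the architecture from the paper, and the four two-dimensional "patch embeddings" are made up for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head self-attention with identity projections (Q = K = V = tokens).

    Each output vector is a weighted average over ALL input positions,
    so distant positions can influence each other in a single layer.
    """
    dim = len(tokens[0])
    outputs, weights = [], []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        row = softmax(scores)
        weights.append(row)
        outputs.append(
            [sum(w * value[d] for w, value in zip(row, tokens)) for d in range(dim)]
        )
    return outputs, weights


# Toy patch embeddings: the first and last patches are similar but far apart.
patches = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
_, attn = self_attention(patches)
print([round(w, 3) for w in attn[0]])
```

In this toy input the first patch attends to the last patch as strongly as to itself, even though they sit at opposite ends of the sequence. A small convolution kernel would need several stacked layers before those two positions could interact.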

Methods

In this study, panoramic radiographs were collected from three tertiary dental hospitals, involving a total of 3,856 patients aged 18 to 75 years, between January 2021 and December 2023. The images were captured using standardized digital panoramic X-ray systems, ensuring consistent exposure parameters (60-85 kVp, 4-16 mA, exposure time 12-18 seconds) and resolutions ranging from 2304×1152 to 3000×1500 pixels with 8-bit grayscale depth. Patient information was anonymized in accordance with institutional review board protocols, and informed consent was obtained from all participants.

Carious lesions were annotated by three experienced dentists, each with over ten years of clinical experience. They independently labeled the lesions with bounding boxes and assigned severity grades from D1 to D4 according to the study's classification system. A total of 12,847 lesions were annotated: 68.2% on proximal surfaces (8,762 lesions), 24.3% occlusal (3,121 lesions), and 7.5% buccal or lingual (964 lesions). Annotation followed a two-stage quality control process in which initial annotations were cross-validated, and bounding-box disagreements exceeding 20% in Intersection over Union (IoU), as well as discrepancies in severity grade, were sent for further review.
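The cross-validation step can be sketched as a pairwise comparison of two annotators' boxes. This is a minimal illustration, not the authors' tooling: the `Box` type, the `needs_review` helper, and the pixel coordinates are hypothetical, and the code assumes that a "disagreement exceeding 20%" means 1 - IoU > 0.20, which the summary does not state explicitly.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box in pixel coordinates."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def iou(a, b):
    """Intersection over Union of two boxes (0.0 when they do not overlap)."""
    inter_w = max(0.0, min(a.x_max, b.x_max) - max(a.x_min, b.x_min))
    inter_h = max(0.0, min(a.y_max, b.y_max) - max(a.y_min, b.y_min))
    inter = inter_w * inter_h
    area_a = (a.x_max - a.x_min) * (a.y_max - a.y_min)
    area_b = (b.x_max - b.x_min) * (b.y_max - b.y_min)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def needs_review(box_a, grade_a, box_b, grade_b, max_disagreement=0.20):
    """Flag a lesion when grades differ or the boxes disagree too much.

    Assumption: disagreement is measured as 1 - IoU.
    """
    return grade_a != grade_b or (1.0 - iou(box_a, box_b)) > max_disagreement


# Hypothetical annotations of the same lesion by two dentists.
first = Box(100, 100, 140, 140)
second = Box(104, 100, 144, 140)
print(f"IoU = {iou(first, second):.3f}")
print("review needed:", needs_review(first, "D1", second, "D1"))
print("review needed:", needs_review(first, "D1", second, "D2"))
```

With three annotators, the same check would be applied to each pair of annotations for a lesion, and any flagged pair would go to the second review stage.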

Results

The Results section summarizes the key findings of the experiments and analyses. It reports a significant association between the independent variables and the observed outcomes, supported by statistical testing: variable $X$ positively influences variable $Y$ with a correlation coefficient of $r = 0.85$, indicating a strong positive linear relationship.

The applied intervention also led to a marked improvement in the measured outcomes, with $p < 0.01$, indicating statistical significance. These findings support the hypothesis that the implemented strategy improves performance in the targeted area and provide evidence for the proposed model, with implications for future research and practical applications.

Discussion

The discussion reviews the development and evaluation of the Transformer-based model for early dental caries detection in panoramic radiographs. The model improves on traditional convolutional neural networks (CNNs), achieving a mean average precision (mAP) of 87.3% and sensitivity of 81.3% for D1 lesions and 84.7% for D2 lesions. These results indicate a marked improvement in detecting early-stage caries, particularly in complex panoramic images, where detection is often hindered by anatomical overlap and varying image quality. The model’s architecture incorporates global self-attention mechanisms, multi-scale feature fusion, and spatially aware attention, tailored specifically to the challenges posed by panoramic radiography.
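Self-attention is order-agnostic by itself, so image Transformers add a positional signal to each patch. The summary does not specify how the paper's enhanced two-dimensional encoding is built; the sketch below shows only the common fixed sinusoidal 2D formulation as a point of reference. The function names, the base of 10000, and the split of the embedding into a row half and a column half are assumptions, not details from the study.

```python
import math


def sincos_1d(position, dim):
    """Standard 1D sinusoidal encoding of one position (dim must be even)."""
    encoding = []
    for i in range(0, dim, 2):
        freq = 1.0 / (10000 ** (i / dim))
        encoding.append(math.sin(position * freq))
        encoding.append(math.cos(position * freq))
    return encoding


def positional_encoding_2d(rows, cols, dim):
    """Return a rows x cols grid of dim-length encodings.

    The first half of each vector encodes the patch row, the second half
    encodes the patch column, so both image axes are represented.
    """
    if dim % 4 != 0:
        raise ValueError("dim must be divisible by 4")
    half = dim // 2
    return [
        [sincos_1d(r, half) + sincos_1d(c, half) for c in range(cols)]
        for r in range(rows)
    ]


# A wide 2 x 3 patch grid, loosely mirroring the 2:1 aspect ratio of panoramic images.
grid = positional_encoding_2d(rows=2, cols=3, dim=8)
print(len(grid), len(grid[0]), len(grid[0][0]))
```

Patches in the same row share the first half of their vector and patches in the same column share the second half, which gives the attention layers a way to distinguish, for example, upper from lower arch positions.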

Despite its strengths, the model faces limitations, particularly in detecting caries in the premolar region due to geometric constraints inherent to panoramic imaging. The analysis reveals that false positives often occur in areas where normal anatomical features mimic carious lesions, while false negatives are primarily associated with subtle early-stage lesions. The model’s performance remains robust under various conditions, with only a 3.8% decrease in accuracy under maximum noise conditions. The findings underscore the model’s potential as a decision support tool in clinical settings, enhancing diagnostic consistency and enabling timely preventive interventions, although it should not replace professional judgment. Future research directions include improving performance in regions affected by metallic artifacts, expanding the dataset for broader applicability, and exploring multi-modal approaches that integrate various imaging techniques.