Automated detection and classification of osteolytic lesions in panoramic radiographs using CNNs and vision transformers

Journal: BMC Oral Health, Volume: 25, Issue: 1
DOI: https://doi.org/10.1186/s12903-025-06209-6
PMID: https://pubmed.ncbi.nlm.nih.gov/40544240
Publication Date: 2025-06-21
Author(s): Niels van Nistelrooij et al.
Primary Topic: Dental Radiography and Imaging

Overview

The study investigates deep learning models for detecting osteolytic lesions in panoramic radiographs (PRs). These lesions are often asymptomatic, which can delay diagnosis. The dataset comprised 676 PRs categorized as well-defined lesions, ill-defined lesions, or controls. Four instance segmentation architectures were compared: Mask R-CNN with a Swin-Tiny backbone, Mask R-CNN with a ResNet-50 backbone, Mask DINO, and YOLOv5. All were evaluated with five-fold cross-validation, using sensitivity, specificity, F1-score, and area under the curve (AUC) as performance metrics.
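The five-fold cross-validation mentioned above can be illustrated with a minimal sketch. It assumes the folds are stratified by class, which the summary does not state, and the per-class counts are placeholders: only the total of 676 PRs comes from the text. The helper name `stratified_folds` is hypothetical.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k=5, seed=0):
    """Assign every sample index to one of k folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal indices round-robin, continuing where the previous class stopped,
        # so that fold sizes differ by at most one sample.
        for i, idx in enumerate(indices):
            folds[(offset + i) % k].append(idx)
        offset += len(indices)
    return folds

# Placeholder class counts (assumption); they only need to sum to 676.
labels = ["well_defined"] * 170 + ["ill_defined"] * 176 + ["control"] * 330
folds = stratified_folds(labels, k=5)

# Each fold serves once as the held-out test set; the rest form the training set.
splits = []
for held_out in range(5):
    test_idx = folds[held_out]
    train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
    splits.append((train_idx, test_idx))
```

Every PR appears in exactly one test fold, so each model is scored on all 676 images without ever being tested on an image it was trained on.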

Mask R-CNN with a Swin-Tiny backbone achieved the highest macro-F1 (0.844), with an F1-score of 0.784 and an AUC of 0.881 for well-defined lesions, and an F1-score of 0.904 and an AUC of 0.971 for ill-defined lesions. Models incorporating vision transformer components outperformed the purely convolutional neural network (CNN) architectures. Model errors occurred mainly around the maxillary sinus and at tooth extraction sites. Overall, the findings suggest that vision transformer-based models, particularly Mask R-CNN with Swin-Tiny and Mask DINO, improve the automated detection and classification of osteolytic lesions in PRs compared with conventional CNN-based approaches.
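To make the AUC figures above more concrete, the following sketch computes a one-vs-rest AUC directly from its rank interpretation: the probability that a randomly chosen positive image receives a higher score than a randomly chosen negative one. The scores and labels are invented for illustration, and how the study derived a per-image score from the segmentation output is not described in this summary.

```python
def auc_one_vs_rest(scores, is_positive):
    """AUC as P(score of a random positive > score of a random negative); ties count 0.5."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical per-image scores for the "ill-defined lesion" class.
scores = [0.95, 0.80, 0.70, 0.40, 0.30, 0.10]
is_ill_defined = [True, True, False, True, False, False]

auc = auc_one_vs_rest(scores, is_ill_defined)  # 8 of 9 positive/negative pairs are ordered correctly
```

An AUC of 0.971, as reported for ill-defined lesions, would mean that about 97% of such positive/negative pairs are ranked in the right order.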

Introduction

The introduction outlines the challenges of detecting osteolytic lesions in the jaws. These lesions are characterized by the progressive resorption of bone tissue and often present as radiolucent areas on radiographs. They can be asymptomatic in early stages, allowing significant disease progression before diagnosis. Associated conditions include microbial infections, avascular osteonecrosis, osteomyelitis, and malignancies such as metastatic cancer and multiple myeloma. The paper emphasizes the importance of imaging, particularly panoramic radiographs (PRs), for early detection, while noting that artifacts and overlapping anatomical structures limit how accurately pathology can be identified on PRs.

Recent advancements in artificial intelligence (AI), particularly through convolutional neural networks (CNNs) and vision transformers, have shown promise in enhancing the detection of dental pathologies on PRs. While previous studies have focused on specific lesions using limited imaging techniques, this research aims to utilize vision transformers to detect and classify both well-defined and ill-defined osteolytic lesions directly on PRs. This approach is novel, as it addresses the gap in current research regarding the detection of ill-defined lesions, which may indicate more severe conditions like osteomyelitis or malignancies. The study seeks to improve diagnostic accuracy and assist clinicians in identifying areas that require closer evaluation.

Methods

The authors report following the checklist for artificial intelligence (AI) research in dentistry, a reporting standard intended to improve the transparency and reproducibility of AI studies in the dental field. As summarized in the overview, the dataset comprised 676 PRs labeled as well-defined lesion, ill-defined lesion, or control, with the lesions annotated by experienced professionals. Four instance segmentation architectures (Mask R-CNN with a ResNet-50 backbone, Mask R-CNN with a Swin-Tiny backbone, Mask DINO, and YOLOv5) were trained and evaluated with five-fold cross-validation, using sensitivity, specificity, F1-score, and AUC as performance metrics.

Results

The results indicate that osteolytic lesions predominantly affected the sub-regions adjacent to and beneath the mandibular molars and the maxilla, with less frequent involvement of other mandibular areas. Well-defined lesions were more prevalent in the maxilla, while the coronoid and condyle regions were less affected. Four model architectures were trained to detect and classify these lesions from panoramic radiographs (PRs), with models incorporating transformer layers demonstrating superior performance, particularly in detecting ill-defined lesions. The most effective architectures were Mask R-CNN with a Swin-Tiny backbone (macro-F1 = 0.844) and Mask DINO (macro-F1 = 0.830), outperforming those without transformer layers, such as Mask R-CNN with a ResNet-50 backbone (macro-F1 = 0.759) and YOLOv5 (macro-F1 = 0.681).
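As a quick arithmetic check, the macro-F1 of 0.844 reported for Mask R-CNN with Swin-Tiny matches the unweighted mean of the two per-class F1-scores given in the overview (0.784 and 0.904). Treating macro-F1 as the average over only the two lesion classes is an assumption that happens to fit these numbers; the summary does not spell out the definition.

```python
def macro_f1(per_class_f1):
    """Unweighted mean of per-class F1-scores."""
    return sum(per_class_f1.values()) / len(per_class_f1)

# Per-class F1-scores for Mask R-CNN with Swin-Tiny, as quoted in the overview.
swin_tiny_f1 = {"well_defined": 0.784, "ill_defined": 0.904}

swin_tiny_macro = macro_f1(swin_tiny_f1)
print(round(swin_tiny_macro, 3))  # 0.844
```

Because the mean is unweighted, each lesion class counts equally regardless of how many PRs it contains.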

Receiver operating characteristic (ROC) curves confirmed the efficacy of the transformer-based models: Mask R-CNN with Swin-Tiny and Mask DINO reached AUC values of 0.881 and 0.911, respectively, for well-defined lesions, and 0.971 and 0.965 for ill-defined lesions. Confusion matrices revealed that well-defined lesions were misclassified as controls more often than ill-defined lesions were, likely due to their higher prevalence in the study sample. Qualitative analyses identified challenges in differentiating control PRs from those with well-defined lesions, particularly near the maxillary sinus or after third molar extraction. Additionally, overprojections that produced extensive radiolucency complicated lesion detection; in some example PRs, only the transformer-based models identified the ill-defined lesions.
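The link between a confusion matrix and the per-class metrics can be sketched as follows. The counts are hypothetical and were chosen only to mimic the pattern described above, with well-defined lesions confused with controls more often than ill-defined ones; they are not the study's data. The function name `one_vs_rest_metrics` is likewise an illustrative choice.

```python
CLASSES = ["control", "well_defined", "ill_defined"]

# Hypothetical counts: rows are true classes, columns are predicted classes.
confusion = [
    [60, 5, 1],   # true control
    [8, 24, 1],   # true well-defined lesion
    [2, 2, 32],   # true ill-defined lesion
]

def one_vs_rest_metrics(matrix, k):
    """Sensitivity, specificity and F1 for class k against all other classes."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                 # class k predicted as something else
    fp = sum(row[k] for row in matrix) - tp  # other classes predicted as class k
    tn = total - tp - fn - fp
    # Assumes non-degenerate counts; a zero denominator is not handled here.
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity, "f1": f1}

metrics = {name: one_vs_rest_metrics(confusion, k) for k, name in enumerate(CLASSES)}
```

In this made-up example, 8 of 33 well-defined lesions are predicted as controls, which lowers the sensitivity for that class relative to the ill-defined class.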

Discussion

The study aimed to detect and categorize osteolytic lesions in panoramic radiographs (PRs) as either well-defined or ill-defined using deep learning techniques, specifically convolutional neural networks (CNNs) and vision transformers. A dataset of 6,404 PRs was analyzed, with 346 containing osteolytic lesions. The lesions were annotated by experienced professionals, and four deep learning models were compared: Mask R-CNN with ResNet-50, Mask R-CNN with Swin-Tiny, Mask DINO, and YOLOv5. Results indicated that transformer-based models, particularly Mask R-CNN with a Swin-Tiny backbone, outperformed CNN-only models, achieving high sensitivity and specificity, especially for ill-defined lesions. The area under the curve (AUC) for well-defined lesions was 0.881 for Mask R-CNN (Swin-Tiny) and 0.911 for Mask DINO, while for ill-defined lesions, it was 0.971 and 0.965, respectively.

The findings highlight the clinical relevance of accurately identifying ill-defined lesions, which may indicate serious conditions such as malignancies. The proposed AI system could serve as a decision support tool for dentists, enhancing early detection during routine examinations. However, limitations include the inability to differentiate between types of well-defined lesions and a potential selection bias due to the limited number of osteolytic lesions in the dataset. Future research should focus on validating these models with external datasets and improving their robustness by integrating additional patient data and refining pre-processing techniques. Overall, the study underscores the potential of vision transformer-based architectures in advancing AI-assisted dental imaging diagnostics.