Accuracy of artificial intelligence applications in periodontics: a thematic narrative review

Journal: Frontiers in Dental Medicine, Volume: 7
DOI: https://doi.org/10.3389/fdmed.2026.1729825
PMID: https://pubmed.ncbi.nlm.nih.gov/41660180
Publication Date: 2026-01-22
Author(s): Ady Azhari
Primary Topic: Dental Radiography and Imaging

Overview

This review synthesizes findings from 35 studies published between 2019 and 2025 on the use of artificial intelligence (AI) in periodontal diagnostics across imaging modalities, including periapical radiographs, cone-beam computed tomography (CBCT), and intraoral photographs. Convolutional neural network (CNN)-based models for periapical radiographs achieved moderate-to-high diagnostic accuracy (0.82-0.85) and area under the curve (AUC) values exceeding 0.88, comparable to clinician performance. CBCT showed higher accuracy (up to 0.91) for volumetric assessments, while panoramic radiographs and intraoral photographs performed variably because of inconsistencies in imaging and reference standards.

Despite significant advancements in AI applications for periodontal diagnostics, challenges such as methodological heterogeneity, limited prospective evidence, and variable reporting standards hinder widespread clinical adoption. The integration of standardized reporting frameworks like STARD-AI and TRIPOD-AI, along with explainable AI techniques and clinician-in-the-loop workflows, is proposed as a viable pathway for safe and effective deployment. The authors advocate for future multicenter prospective trials and standardized imaging protocols to enhance the robustness and equity of AI integration in periodontal practice, emphasizing that AI should serve as a complementary tool to clinician expertise, ultimately improving diagnostic accuracy and patient care.

Introduction

Periodontal diseases are a significant global health issue, affecting billions of people and contributing to the burden of oral diseases. Early and accurate diagnosis is crucial to prevent disease progression and tooth loss. Traditional diagnostic methods, such as clinical probing and two-dimensional (2D) radiographic interpretation, are limited by examiner variability and by the distorted anatomical representations that 2D images can produce. Against this background, artificial intelligence (AI), and in particular deep learning techniques such as convolutional neural networks (CNNs), has emerged as a promising tool for enhancing diagnostic workflows in dentistry, especially in periodontology.

AI applications have proliferated across dental disciplines, drawing on imaging modalities such as periapical and panoramic radiographs, cone-beam computed tomography (CBCT), and intraoral photographs. Studies indicate that CNN-based classifiers can achieve diagnostic accuracies of 0.82 to 0.85 on periapical radiographs, with area under the curve (AUC) values exceeding 0.88, often surpassing average clinician performance. However, the performance of AI models, including the DeepLabv3+ model, tends to decline when they are evaluated on external datasets, which points to dataset shift and to the need for rigorous external validation before widespread clinical implementation. Recent work on CBCT imaging has shown promising results in volumetric assessments, with deep learning models achieving accuracies of around 0.91 for alveolar bone loss detection. This thematic narrative review aims to synthesize the current literature on AI diagnostic accuracy in periodontics, categorizing studies by imaging modality, diagnostic task, and AI architecture, and addressing performance determinants and their implications for clinical integration.

Discussion

The discussion highlights the advances and challenges of artificial intelligence (AI) in periodontal diagnostics across imaging modalities. Periapical and bitewing radiographs have shown promising results, with convolutional neural network (CNN)-based models achieving diagnostic accuracies between 0.82 and 0.85 and area under the curve (AUC) values exceeding 0.88, comparable to clinician performance. Enhanced methodologies, such as hybrid deep learning models and multicenter validation, have further improved diagnostic capabilities, particularly in detecting alveolar bone loss. Panoramic radiographs and cone-beam computed tomography (CBCT), by contrast, face challenges due to anatomical distortions. Even so, CBCT has demonstrated superior performance in volumetric assessments, achieving accuracies of up to 0.91 and AUC values above 0.95 for complex tasks such as furcation involvement detection.
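As a reminder of what the accuracy and AUC figures quoted above measure, the following is a minimal sketch in plain Python. The labels and scores are invented for illustration and are not data from the review. The 0.5 decision threshold and the pairwise (rank-based) AUC formulation are assumptions, since the reviewed studies may compute these metrics differently.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases where the thresholded score matches the true label."""
    correct = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return correct / len(labels)


def auc(labels, scores):
    """Probability that a random positive case outscores a random negative one.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = bone loss present, 0 = absent.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]

print(accuracy(labels, scores))  # 0.75
print(auc(labels, scores))       # 0.9375
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds. This is one reason the two figures can diverge for the same model.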

The paper emphasizes the importance of dataset diversity and external validation, noting that earlier studies often failed to generalize across different populations and imaging conditions. It advocates for adherence to reporting standards like STARD-AI and TRIPOD-AI to enhance methodological rigor and transparency. Clinically, AI has the potential to improve diagnostic accuracy and workflow efficiency, particularly in specialist and primary care settings. However, successful integration into clinical practice requires addressing regulatory alignment, data governance, and educational initiatives to ensure clinicians can effectively interpret AI outputs. Future research should focus on multicenter trials, task-specific architectures, and the development of explainable AI systems to foster clinician trust and enhance patient care in periodontics.
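One simple way to surface the generalization gap described above is to report accuracy separately for each data source rather than pooled. The sketch below assumes a hypothetical record format of `(site, true_label, predicted_label)`. The site names and values are made up for illustration and do not come from the review.

```python
from collections import defaultdict


def accuracy_by_site(records):
    """Return {site: accuracy} for records of (site, true_label, predicted_label)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for site, truth, pred in records:
        totals[site] += 1
        hits[site] += int(truth == pred)
    return {site: hits[site] / totals[site] for site in totals}


# Hypothetical predictions: an internal test split and one external clinic.
records = [
    ("internal", 1, 1), ("internal", 0, 0), ("internal", 1, 1),
    ("internal", 0, 1), ("internal", 1, 1),
    ("external_a", 1, 0), ("external_a", 0, 0), ("external_a", 1, 1),
    ("external_a", 0, 1), ("external_a", 1, 1),
]

per_site = accuracy_by_site(records)
for site, acc in per_site.items():
    print(f"{site}: {acc:.2f}")
```

A noticeably lower figure for an external site than for the internal split is the kind of signal that would prompt a closer look at dataset shift before deployment.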

Limitations

The limitations of the current evidence base in AI-assisted periodontal diagnosis are multifaceted. Key issues include the heterogeneity of study designs, which complicates direct comparisons due to variations in dataset sizes, anatomical sites, labeling strategies, and evaluation metrics. Additionally, the predominance of retrospective studies over prospective trials limits the robustness of findings, while variable reporting standards hinder reproducibility and transparency in preprocessing, validation, and error analysis. Furthermore, the geographic concentration of multicenter datasets raises concerns about the generalizability of results to diverse patient populations.

Beyond these methodological constraints, several biases threaten the reliability of AI models. Dataset bias, stemming from imbalances in disease prevalence and demographic representation, can lead to the learning of spurious associations. Variability in reference standards, particularly when ground-truth labels are derived from inconsistent clinical judgments, further compromises model accuracy. Clinician annotations may also lack consistency, exacerbating systematic errors. The risk of automation bias is significant, as clinicians may overly rely on algorithmic outputs, potentially leading to diagnostic complacency and delayed error detection. To address these challenges, the implementation of clinician-in-the-loop workflows, standardized labeling protocols, and explainable AI interfaces is crucial for maintaining clinical accountability in AI-supported decision-making.
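Annotation consistency between clinicians can be quantified before labels are used as ground truth. The sketch below uses Cohen's kappa for two raters. It is offered as one common agreement statistic, not as a method attributed to the review, and the example labels are invented.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: share of cases where both raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if each rater labelled independently at their own rates.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels from two clinicians for ten radiographs.
clinician_1 = ["loss", "loss", "none", "none", "loss",
               "none", "loss", "none", "loss", "none"]
clinician_2 = ["loss", "none", "none", "none", "loss",
               "none", "loss", "loss", "loss", "none"]

print(round(cohens_kappa(clinician_1, clinician_2), 2))  # 0.6
```

In this example the raters agree on 8 of 10 cases, but because half of that agreement would be expected by chance, kappa is about 0.6 rather than 0.8.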