Deep learning for automated dental plaque index assessment: validation against expert evaluations

Journal: BMC Oral Health, Volume: 25, Issue: 1
DOI: https://doi.org/10.1186/s12903-025-06350-2
PMID: https://pubmed.ncbi.nlm.nih.gov/40604649
Publication Date: 2025-07-02
Author(s): Jin-Sun Jeong et al.
Primary Topic: Dental Radiography and Imaging

Overview

The integration of artificial intelligence (AI) in healthcare, particularly in dentistry, has shown potential for enhancing clinical decision-making and diagnostic accuracy. This study focused on developing a deep learning (DL) system for the automatic detection and quantification of dental plaque using intraoral images, guided by the standardized Quigley-Hein plaque index. Seventy participants were evaluated, with images captured before and after the application of a plaque-disclosing agent. The DL model was trained using labeled images, and its performance was statistically compared to assessments made by an experienced dentist and a dental hygienist.

The DL model achieved a micro-average accuracy of 73.67% and a macro-average accuracy of 65.15%, with a precision of 76.34%, a recall of 65.15%, and an F1 score of 66.15%. Notably, its performance did not differ significantly from that of the experienced dentist (P > 0.05), which supports the model’s clinical reliability. The study concludes that the DL-based system effectively automates dental plaque evaluation, offering a promising tool for digital dentistry and remote dental assessment, and thereby for improving oral healthcare delivery.
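The difference between micro- and macro-averaged metrics can be illustrated with a small sketch. The confusion matrix below is invented toy data with three classes, not the study's results, and the function names are hypothetical. The sketch assumes that macro-averaging means the unweighted mean of per-class scores; the summary does not state how the authors averaged, although the identical macro-average accuracy and recall (65.15%) would be consistent with that reading.

```python
from typing import List, Tuple

Matrix = List[List[int]]  # cm[i][j]: items with true class i predicted as class j


def micro_accuracy(cm: Matrix) -> float:
    """Fraction of all items classified correctly, pooled over classes."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def per_class_scores(cm: Matrix, k: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for class k; 0.0 where undefined."""
    tp = cm[k][k]
    predicted = sum(row[k] for row in cm)  # column sum: items predicted as k
    actual = sum(cm[k])                    # row sum: items truly in class k
    precision = tp / predicted if predicted else 0.0
    recall = tp / actual if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_scores(cm: Matrix) -> Tuple[float, float, float]:
    """Unweighted mean of per-class precision, recall and F1."""
    scores = [per_class_scores(cm, k) for k in range(len(cm))]
    n = len(scores)
    return tuple(sum(s[i] for s in scores) / n for i in range(3))


# Toy, imbalanced 3-class confusion matrix (illustrative only).
toy_cm = [[8, 2, 0], [1, 6, 3], [0, 1, 1]]
print(f"micro accuracy: {micro_accuracy(toy_cm):.4f}")
print("macro precision, recall, F1:", [round(v, 4) for v in macro_scores(toy_cm)])
```

Because the rare third class is classified poorly, the macro-averaged scores fall below the micro-average accuracy, the same direction as the gap between 73.67% and 65.15% reported here. Note also that averaging F1 per class generally gives a different value from the harmonic mean of macro precision and macro recall; the harmonic mean of the reported 76.34% and 65.15% is about 70.3%, above the reported F1 of 66.15%.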

Introduction

The introduction outlines the significant health implications of periodontal disease, which is linked to systemic conditions such as cardiovascular disease, diabetes, and cognitive decline, including dementia. It highlights the challenge of detecting dental plaque, a primary contributor to periodontal disease: current detection methods are often cumbersome and require considerable expertise, leading to clinician fatigue. As awareness of oral health’s broader impacts grows, demand for effective dental care solutions is increasing.

The paper discusses the rising interest in artificial intelligence (AI) and deep learning (DL) technologies, which have shown promise in enhancing diagnostic accuracy across various medical fields. Notable examples include algorithms for diabetic retinopathy and skin cancer that have been reported to outperform human specialists. In dentistry, AI applications are emerging to improve diagnostic efficiency and reduce workloads. However, most existing DL applications focus on radiographs, which are costly and involve radiation exposure. This study aims to develop and evaluate DL algorithms that can automatically assess plaque index from images taken with mobile devices, offering a more accessible and practical tool for dental professionals.

Methods

The study enrolled 70 healthy Chinese adults; individuals with fixed or removable prosthetic restorations, orthodontic appliances, or partial edentulism were excluded. Intraoral images were captured with high-resolution DSLR cameras under controlled conditions, before and after application of a plaque-disclosing agent, and the buccal surfaces of specific teeth were scored with the Quigley-Hein plaque index as modified by Turesky.

Images were labeled manually for model training, with an inter-examiner reliability of 0.876 (Fleiss’ kappa). Several network architectures were trained, and data augmentation was used to offset class imbalance across plaque index categories. Model performance was then compared statistically with the assessments of a dentist with 10 years of experience and a dental hygienist.

Results

The DL model developed to detect and classify dental plaque according to the Quigley-Hein plaque index achieved a micro-average accuracy of 73.67% and a macro-average accuracy of 65.15%, alongside a precision of 76.34% and a recall of 65.15%. These results are comparable to those of a dentist with 10 years of experience, underscoring the model’s clinical relevance. Data augmentation was crucial for improving the model’s classification of underrepresented plaque index categories, helping it perform more consistently across patient conditions.
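One common way to use augmentation against class imbalance is to oversample the rare classes with transformed copies of their images. The sketch below illustrates that idea only: the summary does not say which augmentations or sampling scheme the authors used, so the horizontal flip, the brightness jitter, and all function names are assumptions, and the tiny nested-list "images" are placeholders for real photographs.

```python
import random
from collections import Counter, defaultdict
from typing import List, Tuple

Image = List[List[int]]  # grayscale pixels, 0..255 (placeholder for a real photo)


def hflip(image: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def jitter_brightness(image: Image, delta: int) -> Image:
    """Shift every pixel by delta, clamped to the valid 0..255 range."""
    return [[min(255, max(0, p + delta)) for p in row] for row in image]


def augment(image: Image, rng: random.Random) -> Image:
    """Assumed augmentation: random flip plus a small brightness shift."""
    out = hflip(image) if rng.random() < 0.5 else [row[:] for row in image]
    return jitter_brightness(out, rng.randint(-20, 20))


def balance_by_augmentation(
    samples: List[Tuple[Image, int]], rng: random.Random
) -> List[Tuple[Image, int]]:
    """Add augmented copies until every class matches the largest class."""
    by_label = defaultdict(list)
    for image, label in samples:
        by_label[label].append(image)
    target = max(len(images) for images in by_label.values())
    balanced = list(samples)  # the input list itself is left unchanged
    for label, images in by_label.items():
        for _ in range(target - len(images)):
            balanced.append((augment(rng.choice(images), rng), label))
    return balanced


# Toy data: four images with score 0, one image with score 5.
samples = [([[10, 20], [30, 40]], 0)] * 4 + [([[200, 210], [220, 230]], 5)]
balanced = balance_by_augmentation(samples, random.Random(0))
print(Counter(label for _, label in balanced))
```

In practice this kind of balancing would be applied to the training split only, so that augmented copies of an image never leak into the data used for evaluation.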

These findings suggest significant potential for integrating AI-driven tools into dental practice, particularly for automating plaque detection tasks that are traditionally labor-intensive for clinicians. Such automation could reduce clinician workload and make oral hygiene assessments more efficient, ultimately contributing to improved patient care. While the results align with previous studies demonstrating the efficacy of DL models in various medical fields, the authors caution that the findings have limited generalizability and should be interpreted with care.

Discussion

In this study, the authors developed a deep learning (DL) model to evaluate dental plaque from standard photographic images, in contrast to previous research that primarily used radiographic imaging or specialized devices. The study involved 70 healthy Chinese adults; to keep plaque visualization uniform, individuals undergoing orthodontic treatment or with significant dental issues were excluded. High-resolution images were captured and scored with the Quigley-Hein plaque index, as modified by Turesky, to assess plaque accumulation. The DL model was trained with several architectures and with data augmentation to address class imbalance across plaque index categories. It reached a micro-average accuracy of 73.67%, a precision of 76.34%, and a recall of 65.15%, performance comparable to that of a dentist with ten years of experience.
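For readers unfamiliar with the index, the Turesky modification of the Quigley-Hein index assigns each disclosed tooth surface an integer score from 0 to 5, and a subject's index is the mean score over the surfaces examined. The sketch below uses the criteria as they are commonly stated in the dental literature; the exact wording, the function name, and the example scores are illustrative assumptions rather than details taken from this study.

```python
from typing import Sequence

# Scoring criteria as commonly stated for the Turesky modification
# (paraphrased; not quoted from the paper summarized here).
TURESKY_CRITERIA = {
    0: "No plaque",
    1: "Separate flecks of plaque at the cervical margin",
    2: "Thin continuous band of plaque (up to 1 mm) at the cervical margin",
    3: "Band wider than 1 mm but covering less than one third of the crown",
    4: "Plaque covering at least one third but less than two thirds of the crown",
    5: "Plaque covering two thirds or more of the crown",
}


def plaque_index(surface_scores: Sequence[int]) -> float:
    """Mean Turesky score over all examined surfaces."""
    if not surface_scores:
        raise ValueError("at least one surface score is required")
    for score in surface_scores:
        if score not in TURESKY_CRITERIA:
            raise ValueError(f"invalid score {score!r}; expected an integer 0..5")
    return sum(surface_scores) / len(surface_scores)


# Hypothetical buccal-surface scores for six teeth.
example_scores = [0, 1, 2, 3, 4, 5]
print(plaque_index(example_scores))
```

Framed this way, the model's task is a six-class classification per tooth surface, which is why the per-class (macro-averaged) metrics reported above matter alongside overall accuracy.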

The findings indicate that the DL model can serve as an effective adjunct tool for plaque assessment, particularly in settings with limited access to dental care. Although the model did not outperform the experienced dentist on every metric, the difference in performance was not statistically significant, suggesting potential utility in routine dental evaluations. The study emphasizes the importance of clinical validation when assessing DL models and highlights the need for diverse datasets and multiple evaluators in future research to make automated plaque detection systems more robust and applicable in clinical practice.
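A paired comparison of this kind can be sketched as follows. The summary does not name the statistical test the authors used, so the exact McNemar test shown here is an assumption chosen for illustration: it compares two raters on the same items by looking only at the items where exactly one of them is correct. The function name and all data are hypothetical.

```python
from math import comb
from typing import Sequence


def mcnemar_exact(
    truth: Sequence[int], rater_a: Sequence[int], rater_b: Sequence[int]
) -> float:
    """Two-sided exact McNemar p-value for paired correct/incorrect outcomes."""
    only_a = only_b = 0  # discordant pairs: exactly one rater is correct
    for t, a, b in zip(truth, rater_a, rater_b):
        a_ok, b_ok = (a == t), (b == t)
        if a_ok and not b_ok:
            only_a += 1
        elif b_ok and not a_ok:
            only_b += 1
    n = only_a + only_b
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    # Under the null hypothesis each discordant pair favors either rater
    # with probability 0.5, so the smaller count follows Binomial(n, 0.5).
    tail = sum(comb(n, k) for k in range(min(only_a, only_b) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Invented reference scores and two sets of ratings for 12 tooth surfaces.
truth = [1, 2, 3, 0, 4, 5, 1, 2, 3, 0, 2, 1]
model = [2, 3, 2, 1, 3, 5, 1, 2, 3, 0, 2, 1]    # wrong on the first five
dentist = [1, 2, 3, 0, 4, 4, 1, 2, 3, 0, 2, 1]  # wrong on the sixth
p_value = mcnemar_exact(truth, model, dentist)
print(f"p = {p_value:.5f}")
```

A p-value above 0.05 means the data do not show a difference between the two raters; it does not demonstrate that they are equivalent, which is one reason a larger sample and multiple evaluators matter for this kind of validation.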

Limitations

The study has several limitations that may affect the generalizability and applicability of its findings. First, the sample size of 70 participants restricts the ability to extrapolate results across diverse demographic and clinical backgrounds. Second, the study focused on the buccal surfaces of specific teeth and excluded individuals with fixed or removable prosthetic restorations, orthodontic appliances, or partial edentulism. These choices ensured uniform dye uptake and clear plaque visualization, but they limit applicability to a broader patient population. Third, the reliance on high-resolution DSLR cameras under controlled conditions may not represent the variability encountered in typical clinical settings, where mobile devices or intraoral cameras are more prevalent.

Moreover, the manual labeling of images for model training introduces potential human error, despite efforts to maintain consistency and the high inter-examiner reliability achieved (Fleiss’ kappa of 0.876). The study’s reliance on a single experienced examiner for plaque scoring may introduce subjectivity and variability, particularly given the multi-level plaque index used. Future research should involve multiple calibrated expert evaluators to enhance reliability and generalizability. To address these limitations, future work will aim to expand the inclusion criteria, incorporate full-mouth imaging, and validate the model’s performance on patient-captured images in real-world contexts, thereby supporting remote dental evaluations.
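Fleiss' kappa measures agreement among a fixed number of raters beyond what chance alone would produce, where 1 indicates perfect agreement and 0 indicates chance-level agreement. The sketch below implements the standard formula on an invented table of rating counts; the function name and data are hypothetical, and the study's own rating data are not reproduced here.

```python
from typing import List


def fleiss_kappa(counts: List[List[int]]) -> float:
    """Fleiss' kappa for counts[i][j] = raters assigning item i to category j."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    # Observed agreement: mean over items of the share of agreeing rater pairs.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items

    # Chance agreement: sum of squared overall category proportions.
    n_categories = len(counts[0])
    shares = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_e = sum(s * s for s in shares)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_bar - p_e) / (1 - p_e)


# Invented example: 3 raters, 4 items, 2 categories.
toy_counts = [[3, 0], [0, 3], [3, 0], [2, 1]]
print(fleiss_kappa(toy_counts))
```

On the widely used Landis and Koch scale, values above 0.80 are usually described as almost perfect agreement, which is the band the reported 0.876 falls into.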