Evaluation by dental professionals of an artificial intelligence-based application to measure alveolar bone loss

Journal: BMC Oral Health, Volume: 25, Issue: 1
DOI: https://doi.org/10.1186/s12903-025-05677-0
PMID: https://pubmed.ncbi.nlm.nih.gov/40025477
Publication Date: 2025-03-01
Author(s): Sang Won Lee et al.
Primary Topic: Dental Radiography and Imaging

Overview

The research investigates the integration of artificial intelligence (AI) in dental diagnostics, specifically focusing on the acceptability and usability of a deep learning (DL) model designed to measure alveolar crestal height (ACH) through semantic segmentation and object detection networks. The model was trained on a dataset of 550 bitewing radiographs, establishing a gold standard for ACH measurements. A survey was conducted among dental professionals to compare the accuracy of manual X-ray examinations with the AI application and to evaluate its acceptability.

Results indicated that while dental professionals accurately identified severe periodontal bone loss (ACH > 5 mm) in 35-87% of cases, the AI application achieved an accuracy rate of 82-87%. Among the 65 survey participants, predominantly from academic settings, only 21% reported using AI tools in their practice, with 57% approximating bone levels rather than measuring precisely. Notably, 84% of participants expressed support for the AI application in measuring ACH, and 56% believed it would be beneficial in their professional practice. The findings suggest that the AI application is well-received and could enhance clinical accuracy and efficiency in dental diagnostics.

Introduction

The introduction highlights significant advancements in artificial intelligence (AI) that have enhanced computer-aided diagnosis (CAD) in oral imaging, paralleling improvements in other medical fields involving radiographic images. Deep learning (DL) models have been effectively employed in dentistry for tasks such as identifying anatomical structures and detecting pathological findings on radiographs. A critical issue in measuring alveolar bone loss on X-rays is the reliance on visual analysis by dentists, which is susceptible to errors. Although DL models have shown promise in detecting significant alveolar bone loss, challenges arise in comparing these models due to variations in intra-oral radiograph types, ground truth criteria, and evaluation methods.

This study reports the development and training of advanced DL algorithms that surpass previously established neural networks, such as VGG and ResNet. Unlike some commercial applications, the study used datasets meticulously curated by an oral radiologist for training, validation, and testing. Although commercial AI programs are available for diagnosing caries and alveolar bone loss, there is little research on whether dental professionals accept these tools and how they intend to integrate them into practice. The study aims to (1) compare the accuracy and efficiency of the DL model for measuring alveolar bone loss against manual evaluations by providers, and (2) assess the acceptability of the AI application among dental professionals, focusing on the features they find most beneficial.

Methods

The “Methods” section describes two components. First, a DL model was developed on 550 bitewing radiographs curated by an oral radiologist, with the data used for training, validation, and testing. Second, dental professionals were surveyed: they classified periodontal bone loss on three bitewing radiographs as severe (ACH > 5 mm) or nonsevere (ACH ≤ 5 mm), and their accuracy and analysis time were compared with those of the AI application. The survey also asked about respondents’ current use of AI tools and their views on the application.

Data collection involved standardized measures and instruments to assess the primary outcomes, ensuring consistency and validity across the sample. The analysis was conducted using appropriate statistical software, with significance levels set at p < 0.05. The methods employed aimed to rigorously test the hypotheses and provide robust evidence for the findings presented in the study.

Results

Two deep learning (DL) algorithms were implemented for semantic segmentation: the tooth segmentation algorithm reached a global test-set accuracy of 0.9567, and the alveolar bone segmentation algorithm reached 0.9281. For object detection, the algorithm that identifies cemento-enamel junctions (CEJs) achieved an average precision of 0.72, and the one that detects alveolar bone crestal levels (ABCLs) reached 0.65.
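The summary does not define "global accuracy". A common reading for semantic segmentation is the fraction of pixels whose predicted class matches the reference mask. The sketch below illustrates that reading only; the function name and the tiny masks are hypothetical and are not taken from the study.

```python
def global_pixel_accuracy(predicted, reference):
    """Return the fraction of pixels whose predicted label equals the reference label.

    Both arguments are 2D lists (rows of integer class labels) of identical shape.
    """
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of rows")
    total = 0
    correct = 0
    for pred_row, ref_row in zip(predicted, reference):
        if len(pred_row) != len(ref_row):
            raise ValueError("mask rows must have the same length")
        for pred_label, ref_label in zip(pred_row, ref_row):
            total += 1
            if pred_label == ref_label:
                correct += 1
    if total == 0:
        raise ValueError("masks must contain at least one pixel")
    return correct / total


# Hypothetical 3x4 masks: 1 = tooth, 0 = background (illustration only).
reference_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
predicted_mask = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],  # one tooth pixel missed
    [0, 0, 1, 0],
]

accuracy = global_pixel_accuracy(predicted_mask, reference_mask)
print(f"global pixel accuracy: {accuracy:.4f}")
```

Because background pixels usually dominate a radiograph, this metric can look high even when a thin structure is segmented poorly, which is one reason it is often read alongside per-class measures.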

The performance of the AI application was compared to that of 56 dental professionals in classifying severe (ACH > 5 mm) versus nonsevere (ACH ≤ 5 mm) periodontal bone loss across three bitewing radiographs. The AI application demonstrated a classification accuracy of 94%, significantly outperforming the professionals, who achieved only 68% accuracy. In the first radiograph, the AI achieved perfect accuracy (100%) on 16 calculable ACHs, while professionals classified 87% correctly. The second radiograph showed the AI at 83% accuracy on 6 ACHs, compared to 35% by professionals. In the third radiograph, the AI maintained 100% accuracy on 13 ACHs, while professionals achieved 82%. The AI application processed all images in under 10 seconds, contrasting with the mean analysis times for professionals, which ranged from approximately 71.2 to 105.3 seconds.
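The severe versus nonsevere comparison rests on a single threshold (ACH > 5 mm). A minimal sketch of that classification and of the resulting accuracy follows. The function names and all measurement values are hypothetical, chosen only to show the arithmetic; they are not data from the study.

```python
SEVERE_THRESHOLD_MM = 5.0  # threshold stated in the summary: ACH > 5 mm is severe


def classify_ach(ach_mm):
    """Label one ACH measurement as 'severe' or 'nonsevere'."""
    return "severe" if ach_mm > SEVERE_THRESHOLD_MM else "nonsevere"


def classification_accuracy(measured_mm, reference_mm):
    """Fraction of sites where the measured label matches the reference label."""
    if len(measured_mm) != len(reference_mm) or not reference_mm:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(
        classify_ach(measured) == classify_ach(reference)
        for measured, reference in zip(measured_mm, reference_mm)
    )
    return matches / len(reference_mm)


# Hypothetical values in millimetres (illustration only).
reference_ach = [2.1, 5.0, 5.4, 6.8, 3.3, 7.2]
measured_ach = [2.4, 4.7, 5.1, 6.5, 3.0, 4.9]  # the last site is misclassified

site_accuracy = classification_accuracy(measured_ach, reference_ach)
print(f"classification accuracy: {site_accuracy:.2f}")
```

Note that a measurement of exactly 5.0 mm falls in the nonsevere class under this rule, so small measurement errors near the threshold can flip a label even when the millimetre values are close.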

Discussion

The discussion covers the development and implementation of a deep learning (DL) model designed to automate the measurement of alveolar crestal height (ACH) in bitewing radiographs. This model integrates five deep neural networks, including two object detection networks that identify the alveolar bone crestal levels (ABCLs) and cemento-enamel junctions (CEJs), and three semantic segmentation networks that delineate these structures and the teeth in the radiographs. The model was trained on a dataset of 550 annotated bitewing images, with average precision scores of 0.60 and 0.65 for detecting ABCLs and CEJs, respectively, across different networks. The study highlights the model’s superior performance compared to dental professionals in assessing periodontal disease-related bone loss, emphasizing its potential as a reliable diagnostic tool.
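ACH is conventionally the distance from the CEJ to the alveolar bone crest. The summary does not say how the model turns its detections into millimetres, so the following is only a sketch under stated assumptions: each landmark is reduced to a single (x, y) pixel coordinate, and the radiograph has a known, uniform pixel spacing. The function name, the coordinates, and the spacing value are all hypothetical.

```python
import math


def ach_from_landmarks(cej_xy, abcl_xy, pixel_spacing_mm):
    """Estimate ACH in millimetres as the straight-line CEJ-to-ABCL distance.

    cej_xy and abcl_xy are (x, y) pixel coordinates; pixel_spacing_mm is the
    assumed physical size of one pixel, identical in both directions.
    """
    if pixel_spacing_mm <= 0:
        raise ValueError("pixel spacing must be positive")
    dx = abcl_xy[0] - cej_xy[0]
    dy = abcl_xy[1] - cej_xy[1]
    return math.hypot(dx, dy) * pixel_spacing_mm


# Hypothetical landmark coordinates and spacing (illustration only).
cej_point = (100.0, 200.0)
abcl_point = (136.0, 248.0)  # 36 px across and 48 px down, so 60 px away
spacing_mm = 0.1             # assumed value, not reported in the summary

estimated_ach = ach_from_landmarks(cej_point, abcl_point, spacing_mm)
print(f"estimated ACH: {estimated_ach:.1f} mm")
```

A straight-line distance is the simplest choice; a real pipeline might instead measure along the root surface or correct for projection geometry, neither of which is addressed here.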

Additionally, a survey conducted among dental professionals evaluated the acceptability and usability of the AI application. Despite a low response rate, the findings indicated that 84% of participants agreed with the AI’s ACH measurements, and 56% believed that AI could enhance their practice. However, only 21% reported using AI tools in their work, suggesting a significant opportunity for broader adoption of DL technologies in dentistry. The study acknowledges limitations, including the model’s training on a single brand of X-ray equipment and the need for further research to explore barriers to AI integration in dental practices. Overall, the findings support the potential of DL applications to improve diagnostic accuracy and efficiency in periodontal assessments.