“Determining the efficacy of a machine learning model for measuring periodontal bone loss”

Journal: BMC Oral Health, Volume: 24, Issue: 1
DOI: https://doi.org/10.1186/s12903-023-03819-w
PMID: https://pubmed.ncbi.nlm.nih.gov/38233822
Publication Date: 2024-01-17
Author(s): Diego Cerda Mardini et al.
Primary Topic: Dental Radiography and Imaging

Overview

This study investigates the efficacy of a machine learning (ML) model designed to automate the measurement of periodontal bone loss (PBL) in panoramic radiographs, addressing the need for better diagnostic tools given the high prevalence of periodontitis. The dataset comprised 2010 images, of which 1970 were allocated to training and 40 to testing. The proposed model combines statistical inference, convolutional neural networks (CNNs) for extracting visual information, and an algorithm that quantifies PBL as a percentage and classifies it into stages. Its performance was compared with the assessments of two radiologists, two periodontists, and one general dentist on a standardized test.

The results indicated acceptable performance in diagnosing light to moderate PBL (weighted sensitivity 0.23, weighted F1-score 0.29), together with the capability for real-time diagnosis. However, the model was ineffective at diagnosing severe PBL: sensitivity, precision, and F1-score were all zero. The study concludes that while the ML model shows promise for automating PBL diagnosis in panoramic radiographs, further development is necessary, particularly for severe cases. Similar ML tools could in future enhance diagnostic workflows for other oral pathologies with radiographic features.

Introduction

The introduction highlights the high global prevalence of periodontitis, which affects approximately 740 million individuals and imposes substantial socio-economic and healthcare burdens. It underscores the importance of early diagnosis and regular maintenance in managing the disease. Under the current Classification of Periodontal and Peri-Implant Diseases and Conditions, periodontitis is categorized into stages (1 to 4) based on severity and complexity, and grades (A to C) reflecting the risk of progression and the impact on systemic health. Radiographic evaluation, particularly with panoramic radiographs, is emphasized as essential for diagnosing and monitoring periodontal bone loss (PBL), because panoramic radiographs offer low radiation exposure and a comprehensive overview.

The paper also discusses the increasing integration of artificial intelligence (AI) in dentistry, particularly through machine learning (ML) models that automate radiographic analysis. Supervised learning algorithms are highlighted for their ability to learn from labeled datasets, optimizing performance on new data. The study aims to evaluate the efficacy of an ML model in automatically measuring PBL in panoramic radiographs, comparing its performance against that of dental professionals. This research is particularly relevant as it addresses a gap in the application of ML for PBL assessment within the Chilean population, potentially enhancing current diagnostic workflows in periodontitis management.

Methods

In this study, 500 panoramic radiographs were collected from two populations: 250 from a non-representative Chilean cohort treated at the Universidad de los Andes and 250 from a publicly available dataset from Tufts University. Each radiograph was anonymized with a unique code, and only molars were included for analysis, excluding edentulous patients and those with temporary dentures. The labeling of radiographs was performed by an undergraduate student under the supervision of an experienced radiologist, focusing on key anatomical points necessary for calculating periodontal bone loss (PBL). The labeling process involved creating bounding boxes around molars and identifying three critical points: the Cementoenamel Junction (CEJ), the Alveolar Crest, and the Root Apex.

The machine learning model developed in this study consisted of three sequential components. The first component generated initial predictions of the PBL points using statistical inference. The second, a Deep Convolutional Neural Network (DCNN), refined these predictions based on visual content. The third applied a rule-based algorithm to calculate PBL percentages and stages. The model was trained on 1970 labeled images, with 40 images reserved for testing. Performance was evaluated against five human participants (two radiologists, two periodontists, and one general dentist) using diagnostic indices such as sensitivity and specificity. The study received ethical approval, and informed consent was obtained from all participants involved.
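As a rough illustration of the rule-based final component, the sketch below computes a PBL percentage from the three labeled points and maps it to a stage. The summary does not state the formula or the cut-offs, so both are assumptions here: bone loss is taken as the CEJ-to-crest distance divided by the CEJ-to-apex distance, and the 15% and 33% thresholds follow a common reading of the current periodontitis staging scheme. All function names and coordinates are hypothetical.

```python
import math

# Assumed stage cut-offs (percent of root length); not taken from the study.
STAGE_1_MAX = 15.0
STAGE_2_MAX = 33.0


def distance(p, q):
    """Euclidean distance between two (x, y) pixel coordinates."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def pbl_percentage(cej, crest, apex):
    """Bone loss as a percentage of root length (CEJ to apex). Assumed formula."""
    root_length = distance(cej, apex)
    if root_length == 0:
        raise ValueError("CEJ and root apex coincide; root length is zero")
    return 100.0 * distance(cej, crest) / root_length


def pbl_stage(percentage):
    """Map a bone-loss percentage to a stage label using the assumed cut-offs."""
    if percentage < STAGE_1_MAX:
        return "stage 1"
    if percentage <= STAGE_2_MAX:
        return "stage 2"
    return "stage 3/4"


# Hypothetical landmark coordinates for one molar root, in pixels.
cej, crest, apex = (100.0, 50.0), (100.0, 62.0), (100.0, 150.0)
pct = pbl_percentage(cej, crest, apex)
print(f"{pct:.1f}% -> {pbl_stage(pct)}")
```

Under these assumed cut-offs, the mean percentages reported in the Results (10.58%, 24.84%, and 46.66%) fall into stage 1, stage 2, and stages 3/4 respectively.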

Results

The results show a high intraclass correlation coefficient (ICC) of 0.91 for the data-labeling observer, suggesting strong agreement among participants with varying years of experience; details are given in the Supplemental Files. According to the test results from both radiologists, PBL stage 1 was identified in 57.14% of cases, stage 2 in 35.71%, and stages 3/4 in 7.14%. The mean bone-loss percentage within each stage was 10.58% for stage 1, 24.84% for stage 2, and 46.66% for stages 3/4. Radiologists completed the test in an average of 23.2 minutes, while controls took 26.9 minutes, giving an overall average of 25.4 minutes, in stark contrast to the model's response time of just 0.93 seconds.
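The timing figures can be cross-checked with a few lines of arithmetic. The sketch below assumes that the "controls" are the three non-radiologist participants (two periodontists and one general dentist); the summary does not say so explicitly, but that split reproduces the reported overall mean.

```python
# Figures reported in the summary.
radiologist_minutes = 23.2  # mean time for the radiologists
control_minutes = 26.9      # mean time for the controls
model_seconds = 0.93        # model response time for the same test

# Assumed group sizes: 2 radiologists, 3 controls (not stated explicitly).
n_radiologists, n_controls = 2, 3

# Mean over all five participants, weighted by group size.
overall_minutes = (
    n_radiologists * radiologist_minutes + n_controls * control_minutes
) / (n_radiologists + n_controls)

# How many times faster the model is than the average human participant.
speedup = overall_minutes * 60 / model_seconds

print(f"overall human mean: {overall_minutes:.1f} min")
print(f"model is roughly {speedup:.0f} times faster")
```

On these numbers the model is faster than the human participants by about three orders of magnitude.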

In terms of performance metrics, the model exhibited varying sensitivity and specificity across PBL stages. For stage 1, it achieved a sensitivity of 0.5, the second highest in the group, but had the lowest precision (0.26), recall (0.42), and specificity (0.39), with an F1-score of 0.34. The macro average results showed the model with the lowest sensitivity (0.3), specificity (0.65), precision (0.14), and recall (0.64), while ranking third in F1-score (0.19). In the weighted average, the model had the second-highest sensitivity (0.23) and third-highest precision (0.11) and F1-score (0.15), but the lowest recall (0.26) and specificity (0.26). For the micro-average, the model again ranked second in sensitivity (0.42) and third in precision (0.22) and F1-score (0.29). Periodontist 2 consistently achieved the highest values across all metrics, with detailed participant performance metrics provided in Tables 5 and 6.
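To make the three averaging schemes concrete, the sketch below derives per-stage sensitivity, specificity, precision, and F1-score from a confusion matrix and then combines them as macro, weighted, and micro averages. The matrix is hypothetical and is not the study's data; it is built so that stage 3/4 has no true positives, which shows how a class can score zero on sensitivity, precision, and F1 at once.

```python
labels = ["stage 1", "stage 2", "stage 3/4"]

# Hypothetical confusion matrix (rows: true stage, columns: predicted stage).
cm = [
    [4, 3, 1],
    [3, 2, 0],
    [0, 1, 0],
]
total = sum(sum(row) for row in cm)


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def scores(tp, fp, fn, tn):
    """Diagnostic indices from one-vs-rest counts."""
    sens = safe_div(tp, tp + fn)
    prec = safe_div(tp, tp + fp)
    return {
        "sensitivity": sens,
        "specificity": safe_div(tn, tn + fp),
        "precision": prec,
        "f1": safe_div(2 * prec * sens, prec + sens),
    }


counts = {}
for i, label in enumerate(labels):
    tp = cm[i][i]
    fn = sum(cm[i]) - tp                      # true stage i, predicted otherwise
    fp = sum(row[i] for row in cm) - tp       # predicted stage i, true otherwise
    counts[label] = (tp, fp, fn, total - tp - fp - fn)

metrics = {label: scores(*c) for label, c in counts.items()}
support = {label: tp + fn for label, (tp, fp, fn, tn) in counts.items()}


def macro(name):
    """Unweighted mean of the per-stage values."""
    return sum(m[name] for m in metrics.values()) / len(metrics)


def weighted(name):
    """Mean of the per-stage values, weighted by the number of true cases."""
    return sum(metrics[s][name] * support[s] for s in labels) / sum(support.values())


# Micro average: pool the one-vs-rest counts over all stages, then score once.
micro = scores(*(sum(c[k] for c in counts.values()) for k in range(4)))

for name in ("sensitivity", "specificity", "precision", "f1"):
    print(f"{name}: macro={macro(name):.2f} "
          f"weighted={weighted(name):.2f} micro={micro[name]:.2f}")
```

The macro average treats every stage equally, so a rare stage with zero scores pulls it down sharply, whereas the weighted and micro averages are dominated by the most frequent stages.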

Discussion

The discussion section of the study highlights the development of a Deep Convolutional Neural Network (DCNN) aimed at automating the diagnosis of radiographic Periodontal Bone Loss (PBL). Despite the model’s acceptable performance and capability for real-time diagnosis, it was trained on a limited dataset of 500 panoramic radiographs, which is below the recommended size for such models. To address data scarcity, the study incorporated a second population from a publicly available dataset, although this may affect the model’s generalizability to the intended Chilean population. The authors acknowledge that the non-representative nature of the datasets limits the model’s applicability to external populations and emphasize the need for a balanced dataset to enhance diagnostic accuracy across varying disease stages.

The model was efficient at diagnosing light to moderate PBL stages, achieving real-time diagnosis at a rate of 0.02 seconds per tooth. However, it struggled to detect severe PBL (stages 3/4), which is a significant limitation. The performance metrics indicated that the model's results were comparable to those of the general dentist but fell short of those of the more experienced periodontists and radiologists. The study suggests future research directions, including the creation of a dedicated dataset for severe PBL cases and the exploration of automated diagnosis for other dental pathologies. Overall, the findings underscore the potential of machine learning to enhance clinical workflows in dental care, although challenges such as data security and the need for larger, standardized datasets remain to be addressed.