Assessing the accuracy of artificial intelligence in mandibular canal segmentation compared to semi-automatic segmentation on cone-beam computed tomography images

Journal: Polish Journal of Radiology, Volume: 90
DOI: https://doi.org/10.5114/pjr/202477
PMID: https://pubmed.ncbi.nlm.nih.gov/40416521
Publication Date: 2025-04-10
Author(s): Julien Issa et al.
Primary Topic: Dental Radiography and Imaging

Overview

This study evaluates the accuracy of artificial intelligence (AI) in segmenting the mandibular canal (MC) from cone-beam computed tomography (CBCT) scans, comparing it to semi-automatic segmentation performed by experts. A total of 150 CBCT scans, encompassing 300 MCs, were analyzed. The reference standard for segmentation was established using Romexis software, while AI segmentation was performed via the Diagnocat platform. The accuracy of the segmentations was assessed through surface-to-surface distance metrics, and statistical analyses were conducted to evaluate inter- and intra-rater reliability and group comparisons.

The findings indicate that the median deviation between AI and semi-automatic segmentation was 0.29 mm, with 88% of cases falling within the clinically acceptable limit of ≤ 0.50 mm. Inter-rater reliability for the semi-automatic method was 84.5%, and intra-rater reliability was notably high at 95.5%. AI segmentation accuracy was highest in scans without third molars (0.27 mm), followed by erupted (0.28 mm) and impacted third molars (0.32 mm). The study concludes that while AI segmentation is highly accurate, errors are more prevalent in cases involving impacted third molars, suggesting a need for improved AI training datasets and multi-centre validation to enhance performance in complex anatomical scenarios.
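The two headline figures above, a median deviation and the share of cases at or below the 0.50 mm limit, are simple summary statistics over per-canal deviations. The following is a minimal sketch of that calculation, not the authors' analysis code. The function name `summarize` and the sample values are assumptions made up for illustration; only the 0.50 mm threshold comes from the text.

```python
from statistics import median

CLINICAL_LIMIT_MM = 0.50  # clinically acceptable limit quoted in the text


def summarize(deviations_mm, limit_mm=CLINICAL_LIMIT_MM):
    """Return (median deviation, share of cases at or below the limit)."""
    if not deviations_mm:
        raise ValueError("need at least one deviation")
    within = sum(1 for d in deviations_mm if d <= limit_mm)
    return median(deviations_mm), within / len(deviations_mm)


# Hypothetical per-canal deviations in mm, invented for illustration only.
example = [0.21, 0.27, 0.29, 0.33, 0.48, 0.50, 0.62, 1.10]
med, share = summarize(example)
print(f"median = {med:.3f} mm, within limit = {share:.0%}")
```

The median is used rather than the mean because a few large outliers, such as the 1.10 mm value in the toy data, would otherwise dominate the summary.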

Introduction

The introduction of this research paper highlights the transformative impact of technological advancements, particularly cone-beam computed tomography (CBCT), on dental and maxillofacial workflows. CBCT provides high-resolution, three-dimensional anatomical reconstructions that surpass the limitations of traditional two-dimensional radiography, making it essential for preoperative planning, especially in identifying critical structures like the mandibular canal (MC) and the inferior alveolar nerve (IAN). The precise localization of the IAN is crucial to prevent iatrogenic injuries during dental procedures, yet such injuries remain a significant clinical concern, with incidence rates ranging from 0.4% to 13.4%.

Despite the importance of accurate MC segmentation for IAN localization, current methodologies, which range from manual to semi-automated approaches, face challenges such as labor intensity and operator bias. The emergence of artificial intelligence (AI), particularly through convolutional neural networks (CNNs), has shown promise in enhancing segmentation accuracy, with some studies reporting up to 99% accuracy. However, reference methods are inconsistent across studies, and little research has examined how third molar status affects segmentation accuracy. This study aims to systematically evaluate the precision of AI-powered MC segmentation against semi-automated radiologist tracings, using a standardized methodology to establish a reliable ground truth. It also investigates the influence of third molar presence or absence on segmentation outcomes, emphasizing the need for rigorous performance evaluations of AI tools in clinical settings.

Methods

The study retrospectively analyzed 150 anonymized CBCT scans, yielding 300 mandibular canals (MCs), one per side. Scans with artifacts were excluded. Experienced radiologists produced the reference segmentations semi-automatically in Romexis software, and the AI segmentations were generated by the Diagnocat platform. Each canal was categorized by third molar status: absent, erupted, or impacted.

Agreement between the AI and reference segmentations was quantified with surface-to-surface distance metrics, with a deviation of ≤ 0.50 mm treated as clinically acceptable. Inter- and intra-rater reliability were assessed to check the consistency of the semi-automatic protocol. The Shapiro-Wilk test was used to assess the normality of the deviations, and group comparisons were made across third molar categories and between the left and right sides.

Results

Among the 300 mandibular canals analyzed (150 scans, two sides each), third molar status was distributed as follows: on the left side, 60 scans showed no third molar, 34 had an erupted third molar, and 56 had an impacted third molar. On the right side, 70 scans showed no third molar, 25 had an erupted third molar, and 55 had an impacted third molar. Inter- and intra-rater reliability assessments were used to evaluate the consistency of semi-automated segmentation of the mandibular canal (MC). Inter-rater reliability demonstrated substantial agreement at 84.5%, while intra-rater reliability achieved near-perfect agreement at 95.5%, indicating the robustness of the semi-automated protocol.
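As a quick consistency check, the per-side counts above can be tallied to see how they relate to the 150 scans and 300 canals reported in the overview. The sketch below uses only the counts given in the text; the dictionary layout and variable names are illustrative choices.

```python
# Third molar counts per side, as reported in the text.
counts = {
    "left": {"absent": 60, "erupted": 34, "impacted": 56},
    "right": {"absent": 70, "erupted": 25, "impacted": 55},
}

# Each side should add up to the number of scans (one canal per side per scan).
per_side = {side: sum(c.values()) for side, c in counts.items()}

# Pool both sides to get the number of canals in each category.
per_category = {
    cat: sum(counts[side][cat] for side in counts)
    for cat in ("absent", "erupted", "impacted")
}

total_canals = sum(per_side.values())

print(per_side)      # {'left': 150, 'right': 150}
print(per_category)  # {'absent': 130, 'erupted': 59, 'impacted': 111}
print(total_canals)  # 300
```

Each side sums to 150, matching the number of scans, and the two sides together give the 300 canals, so the counts describe canals per side rather than 300 separate scans.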

Quantitative analysis of three-dimensional spatial deviations between AI-based and semi-automated MC segmentations was performed using surface-to-surface distance metrics. The Shapiro-Wilk test indicated a non-normal distribution (p < 0.05), with an overall median deviation of 0.29 mm and a standard deviation ranging from 0.25 to 0.37 mm. The average distance per scan varied from 0.19 mm to 4.72 mm. Box plot analysis illustrated that the absent category had the lowest median deviation (0.27 mm), followed by the erupted category (0.28 mm), and the impacted category exhibited the highest deviations (0.32 mm). Additionally, deviations were slightly greater on the left side compared to the right.
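The summary does not state which surface-to-surface formula was used, so the following is only a sketch of one common variant, the average symmetric surface distance, and should not be read as the study's method. It assumes each segmentation surface is available as a list of 3D points in millimetres; the function names and the toy point sets are hypothetical.

```python
from math import dist
from statistics import mean


def directed_mean_distance(source, target):
    """Mean distance from each source point to its nearest target point."""
    return mean(min(dist(p, q) for q in target) for p in source)


def symmetric_surface_distance(surface_a, surface_b):
    """Average of the two directed mean distances (mm if inputs are in mm)."""
    if not surface_a or not surface_b:
        raise ValueError("both surfaces need at least one point")
    return 0.5 * (
        directed_mean_distance(surface_a, surface_b)
        + directed_mean_distance(surface_b, surface_a)
    )


# Hypothetical toy surfaces: two parallel lines of points 0.3 mm apart.
reference = [(0.0, 0.0, float(z)) for z in range(5)]
ai_result = [(0.3, 0.0, float(z)) for z in range(5)]
print(symmetric_surface_distance(reference, ai_result))  # approximately 0.3
```

The brute-force nearest-neighbour search is quadratic in the number of points, which is acceptable for a toy example. Real surface meshes would normally need a spatial index such as a k-d tree.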

Discussion

The study aimed to evaluate the accuracy of a commercial AI tool, Diagnocat, for mandibular canal (MC) segmentation in cone-beam computed tomography (CBCT) scans, with a particular focus on the influence of third molar status on segmentation precision. A total of 150 anonymized CBCT scans were analyzed, employing both semi-automated segmentation by experienced radiologists and automated segmentation via the AI platform. The results indicated a median deviation of 0.29 mm between the AI and expert segmentations, with 88% of AI segmentations falling within clinically acceptable limits (≤ 0.50 mm). However, the accuracy diminished in cases with impacted third molars, highlighting the challenges posed by anatomical complexities that obscure the canal’s path.

The findings underscore the importance of training AI models on diverse datasets to enhance their robustness, particularly in clinically challenging scenarios involving impacted third molars. While the study’s methodology was standardized and the sample size substantial, limitations such as the retrospective design and exclusion of cases with artifacts may affect the generalizability of the results. Future research should aim to include multi-center cohorts and utilize open-source AI models to improve segmentation accuracy and reliability in complex anatomical regions.