DilatedToothSegNet: Tooth Segmentation Network on 3D Dental Meshes Through Increasing Receptive Vision

Journal: Journal of Imaging Informatics in Medicine, Volume: 37, Issue: 4
DOI: https://doi.org/10.1007/s10278-024-01061-6
PMID: https://pubmed.ncbi.nlm.nih.gov/38441700
Publication Date: 2024-03-05
Author(s): Lucas Krenmayr et al.
Primary Topic: Dental Radiography and Imaging

Overview

This overview covers the growing adoption of intraoral scanners for generating 3D dental models and the importance of accurate tooth segmentation and labeling for computer-aided treatment planning. Manual labeling is labor-intensive, which has motivated geometric deep learning techniques that show promise in automating surface segmentation. Challenges remain, particularly in cases with missing or misaligned teeth. To address them, the authors introduce a new network operator, dilated edge convolution, which lets the model learn features from more distant regions of the mesh and thereby improves segmentation in these difficult cases.

In their conclusion, the authors present DilatedToothSegNet, a graph neural network for the automatic segmentation of 3D dental models derived from intraoral scans. The method builds on previous frameworks: dynamic edge convolution layers capture local geometric features, while dilated edge convolution layers address misclassification in challenging cases. DilatedToothSegNet was evaluated on the Teeth3DS benchmark dataset, where it outperformed existing segmentation methods. Beyond improving segmentation accuracy, the approach is intended to ease integration into CAD software for treatment planning and to support downstream analyses such as Bolton analysis of tooth measurements.
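To make the term "edge convolution" concrete, the following is a minimal sketch of the operator in the form popularized by DGCNN: each point builds one message per neighbor from its own features and the offset to that neighbor, and the messages are combined with a channel-wise maximum. The function name, the single linear layer with ReLU, and the toy weights are assumptions for illustration only; the summary does not describe the layer configuration actually used in DilatedToothSegNet. In the "dynamic" variant, the neighbor lists would be recomputed from the current features at every layer rather than fixed in advance.

```python
def edge_conv(features, neighbors, weight):
    """Max-aggregated edge convolution with one linear layer followed by ReLU.

    features:  list of per-point feature vectors (length C each)
    neighbors: neighbors[i] is the list of neighbor indices of point i
    weight:    C_out rows of 2*C values each (hypothetical learned parameters)
    """
    out = []
    for i, x_i in enumerate(features):
        messages = []
        for j in neighbors[i]:
            # Edge feature: the point itself concatenated with the offset x_j - x_i.
            edge = list(x_i) + [b - a for a, b in zip(x_i, features[j])]
            messages.append([max(0.0, sum(w * e for w, e in zip(row, edge)))
                             for row in weight])
        # Channel-wise maximum over all neighbors of point i.
        out.append([max(channel) for channel in zip(*messages)])
    return out


# Toy input: three 2D points, each connected to the other two.
features = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
neighbors = [[1, 2], [0, 2], [0, 1]]
# Toy weights that simply copy the two offset channels (x_j - x_i).
weight = [[0.0, 0.0, 1.0, 0.0],
          [0.0, 0.0, 0.0, 1.0]]
result = edge_conv(features, neighbors, weight)
```

With these toy weights, each output channel is the largest positive offset to any neighbor along that axis, which shows how the operator encodes local geometry around a point.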

Introduction

The introduction of this research paper highlights the growing significance of three-dimensional (3D) dental models in dentistry and orthodontics, particularly for diagnosing tooth misalignments and planning treatments. These models are typically generated from physical impressions or advanced intraoral scanners (IOSs). Accurate segmentation and labeling of teeth within these digital models are essential for reliable measurements, yet manual labeling is labor-intensive. Consequently, there is a pressing need for automated 3D tooth segmentation methods. However, existing approaches face challenges due to variability in dental anatomy among patients and limitations in the quality of digital scans, which can be affected by noise and incomplete capture of intraoral regions.

The paper discusses the evolution of segmentation techniques, noting that early methods relied on classical algorithms that required user input, while recent work has incorporated deep learning. Despite improvements, these methods often assume a standard dental anatomy that does not reflect real-world variability, and they are frequently evaluated on proprietary datasets, which limits reproducibility. To address these issues, the authors benchmark on the public Teeth3DS dataset and propose a new feature learning strategy called dilated edge convolution. This operator improves semantic tooth segmentation by enlarging the receptive field through farthest point sampling, allowing the model to incorporate features from neighboring teeth. The proposed architecture, which combines dynamic edge convolution and dilated edge convolution layers, shows significant improvements in segmentation accuracy in experiments on the Teeth3DS benchmark.
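One way to picture a "dilated" neighborhood is sketched below: a larger pool of nearest neighbors is gathered for a query point and then thinned with farthest point sampling, so the retained neighbors reach further from the query than plain k-nearest neighbors would. The function names, the pool size, and the choice to seed the sampling with the nearest candidate are assumptions made for illustration. The summary only states that farthest point sampling is used to enlarge the receptive field, so this should not be read as the authors' implementation.

```python
import math


def knn_indices(points, query_idx, k):
    """Return indices of the k points nearest to points[query_idx], excluding itself."""
    q = points[query_idx]
    others = [i for i in range(len(points)) if i != query_idx]
    others.sort(key=lambda i: math.dist(q, points[i]))
    return others[:k]


def farthest_point_sampling(points, candidates, k):
    """Greedily pick up to k candidates, each as far as possible from those already picked.

    Assumption: the first pick is the first candidate in the list.
    """
    chosen = [candidates[0]]
    # min_dist[i] = distance from candidate i to its closest already-chosen point
    min_dist = {i: math.dist(points[i], points[chosen[0]]) for i in candidates}
    while len(chosen) < min(k, len(candidates)):
        nxt = max(candidates, key=lambda i: min_dist[i])
        chosen.append(nxt)
        for i in candidates:
            min_dist[i] = min(min_dist[i], math.dist(points[i], points[nxt]))
    return chosen


def dilated_neighbors(points, query_idx, k, pool_size):
    """Hypothetical dilated neighborhood: k neighbors sampled from the pool_size nearest points."""
    pool = knn_indices(points, query_idx, pool_size)
    return farthest_point_sampling(points, pool, k)


# Toy example: ten points along a line; the query is the point at the origin.
points = [(float(x), 0.0, 0.0) for x in range(10)]
plain = knn_indices(points, 0, 3)
dilated = dilated_neighbors(points, 0, 3, pool_size=9)
```

In this toy case `plain` is `[1, 2, 3]`, while `dilated` is `[1, 9, 5]`: the same number of neighbors, but spread across the whole pool instead of clustered next to the query.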

Methods

In this section, the authors describe how they evaluated their proposed 3D surface segmentation technique: they compare it against three leading general-purpose algorithms (PointNet++, PointNext, and DGCNN) and two methods specialized for 3D dental model segmentation (MeshSegNet and TSGCNet). The competing methods expect different inputs, each with one row per mesh face ($M$ rows in total). PointNet++, PointNext, and DGCNN all take an $M \times 6$ matrix holding the 3D coordinates and normal vectors of the face centers. MeshSegNet uses an $M \times 15$ matrix that contains the 3D coordinates of each face’s vertices and center along with normal vectors, supplemented by adjacency matrices for graph learning. TSGCNet takes an $M \times 24$ matrix, which adds normal vectors for each vertex alongside the face normal.
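The three input widths can be checked with simple arithmetic on a triangle mesh: 6 = center (3) + face normal (3); 15 = three vertices (9) + center (3) + face normal (3); 24 = the 12 coordinates plus three vertex normals (9) and the face normal (3). The sketch below builds such rows for a toy mesh. The helper names and the column order within each row are assumptions inferred from these counts, not the layouts used by the original implementations, and the adjacency matrices that MeshSegNet additionally needs are not constructed here.

```python
import math


def sub(a, b):
    """Component-wise difference of two 3D tuples."""
    return tuple(x - y for x, y in zip(a, b))


def cross(a, b):
    """Cross product of two 3D tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v) if n > 0 else v


def face_features(vertices, vertex_normals, faces):
    """Build per-face rows of width 6, 15 and 24 for a triangle mesh.

    The column order inside each row is an assumption made for this sketch.
    """
    rows6, rows15, rows24 = [], [], []
    for i, j, k in faces:
        a, b, c = vertices[i], vertices[j], vertices[k]
        center = tuple((a[d] + b[d] + c[d]) / 3.0 for d in range(3))
        normal = normalize(cross(sub(b, a), sub(c, a)))
        rows6.append(center + normal)                    # 3 + 3
        rows15.append(a + b + c + center + normal)       # 9 + 3 + 3
        rows24.append(a + b + c + center                 # 12 coordinates
                      + vertex_normals[i] + vertex_normals[j]
                      + vertex_normals[k] + normal)      # 12 normal components
    return rows6, rows15, rows24


# Toy mesh: a single triangle in the z = 0 plane.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
vertex_normals = [(0.0, 0.0, 1.0)] * 3
faces = [(0, 1, 2)]
x6, x15, x24 = face_features(vertices, vertex_normals, faces)
```

Each returned list has one row per face, so a mesh with $M$ faces yields the $M \times 6$, $M \times 15$, and $M \times 24$ shapes described above.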

To ensure a fair comparison, all methods were trained under the same setup. The one exception is MeshSegNet, which was trained with a batch size of 10 to improve convergence, as recommended in its original study. Holding the training setup constant means that differences in the reported results reflect the methods themselves rather than differences in how they were trained.

Results

In this section, the authors compare their proposed method, DilatedToothSegNet, against other state-of-the-art techniques for general 3D surface segmentation and for 3D dental model segmentation. The results indicate that DilatedToothSegNet achieves higher segmentation accuracy than the compared methods, including on complex dental structures.

The authors also ran a series of additional training experiments to assess the impact of the key components of their approach. These experiments show how much each component contributes and support the design choices behind DilatedToothSegNet.

Discussion

In its discussion of dental model segmentation, the paper traces the evolution from early geometry-based methods to deep learning techniques that enable fully automated segmentation. Initial approaches, which relied on manual intervention and semi-automated processes, have largely been supplanted by deep learning methods. Notably, Xu et al. introduced a method that transforms dental models into 2D images for training convolutional neural networks (CNNs), while Tian et al. employed octree partitioning for voxelization followed by 3D CNNs. However, these methods often require extensive pre- and post-processing, which can compromise spatial information. More recent methods, such as MeshSegNet and TSGCNet, operate directly on raw surface data from intraoral scanners (IOSs), using techniques such as multi-scale graph-constrained learning and edge convolution to improve segmentation accuracy by capturing local geometric context.

The paper also details the use of the Teeth3DS dataset, comprising 1,800 unique dental surfaces validated by dental professionals, to train and evaluate the proposed segmentation network. The dataset’s structure, including two distinct train-test splits, is designed to facilitate reproducibility while highlighting challenges such as the imbalance in cases with and without artificial sockets. The proposed network architecture classifies each face of a dental mesh into one of 17 classes, utilizing a combination of local and dilated feature learning blocks to effectively capture both local and broader contextual information. Experimental results demonstrate that the proposed method outperforms existing techniques in terms of overall accuracy, mean intersection over union (mIoU), and Dice score, particularly in challenging cases involving missing or misaligned teeth, underscoring the efficacy of the dilated edge convolution approach in learning discriminative features from complex dental geometries.
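The three reported metrics can be computed from per-face labels as sketched below. This is a generic formulation under stated assumptions: the function name is hypothetical, classes absent from both prediction and ground truth are skipped when averaging, and the scores are computed over a single label list. The summary does not say whether the paper averages per scan or over the whole test set, so the exact protocol may differ.

```python
def segmentation_metrics(pred, target, num_classes):
    """Return overall accuracy, mean IoU and mean Dice for per-face class labels.

    Classes that appear in neither pred nor target are excluded from the means
    (an assumption made for this sketch).
    """
    correct = sum(p == t for p, t in zip(pred, target))
    accuracy = correct / len(target)
    ious, dices = [], []
    for c in range(num_classes):
        tp = sum(p == c and t == c for p, t in zip(pred, target))
        fp = sum(p == c and t != c for p, t in zip(pred, target))
        fn = sum(p != c and t == c for p, t in zip(pred, target))
        if tp + fp + fn == 0:
            continue  # class absent from both prediction and ground truth
        ious.append(tp / (tp + fp + fn))
        dices.append(2 * tp / (2 * tp + fp + fn))
    return accuracy, sum(ious) / len(ious), sum(dices) / len(dices)


# Toy example: 6 faces, 3 classes, one face mislabeled.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 0, 1, 2, 2, 2]
acc, miou, dice = segmentation_metrics(pred, target, num_classes=3)
```

A single mislabeled face lowers accuracy only slightly but affects the IoU and Dice of both classes involved, which is why per-class overlap metrics are informative for small or missing teeth.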

Limitations

The limitations section covers constraints and potential weaknesses in the study’s methodology and findings. Certain assumptions made during the research may not hold in all contexts, which could limit the generalizability of the results. The sample size may also have been too small to support definitive conclusions, which could bias the interpretation of the data.

Furthermore, the study acknowledges the possibility of measurement errors in the data collection process, which could impact the reliability of the results. The authors suggest that future research should address these limitations by employing larger and more diverse samples, as well as refining measurement techniques to enhance the accuracy of findings. Overall, while the study contributes valuable insights, these limitations warrant careful consideration when applying the results to broader contexts.