Annotated intraoral image dataset for dental caries detection

Journal: Scientific Data, Volume: 12, Issue: 1
DOI: https://doi.org/10.1038/s41597-025-05647-9
PMID: https://pubmed.ncbi.nlm.nih.gov/40715095
Publication Date: 2025-07-25
Author(s): S. Ahmed et al.
Primary Topic: Dental Radiography and Imaging

Overview

This study presents the first publicly available annotated intraoral image dataset aimed at enhancing Artificial Intelligence (AI)-driven dental caries detection, addressing a significant gap in existing resources. The dataset consists of 6,313 images from individuals aged 10 to 24 years in Mithi, Sindh, Pakistan. Annotations were created in LabelMe, verified by experienced dentists, and exported in annotation formats used by common object detection pipelines: YOLO (You Only Look Once), PASCAL VOC, and COCO (Common Objects in Context). The images cover a range of intraoral views, captured both with and without cheek retractors, and include mixed and permanent dentitions.
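To illustrate how one annotation can be carried between the formats named above, here is a minimal sketch that turns a LabelMe record into YOLO label lines. The keys `shapes`, `points`, `imageWidth`, and `imageHeight` are standard LabelMe JSON fields; the label name `caries`, the class id mapping, and the example coordinates are assumptions made for illustration, not values taken from the dataset.

```python
def labelme_to_yolo(record, class_ids):
    """Convert the shapes of one parsed LabelMe JSON record to YOLO label lines.

    record: dict with the LabelMe keys "shapes", "imageWidth", "imageHeight".
    class_ids: mapping from label name to integer class id (hypothetical here).
    """
    img_w, img_h = record["imageWidth"], record["imageHeight"]
    lines = []
    for shape in record["shapes"]:
        # Taking min/max over the points works for rectangles and, as a
        # bounding box, for polygons too.
        xs = [p[0] for p in shape["points"]]
        ys = [p[1] for p in shape["points"]]
        x_min, x_max = min(xs), max(xs)
        y_min, y_max = min(ys), max(ys)
        # YOLO stores the box centre and size, normalised to [0, 1].
        x_c = (x_min + x_max) / 2 / img_w
        y_c = (y_min + y_max) / 2 / img_h
        w = (x_max - x_min) / img_w
        h = (y_max - y_min) / img_h
        cls = class_ids[shape["label"]]
        lines.append(f"{cls} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}")
    return lines


# Made-up record: one rectangle on a 1000 x 500 image.
example = {
    "imageWidth": 1000,
    "imageHeight": 500,
    "shapes": [
        {"label": "caries", "shape_type": "rectangle",
         "points": [[100, 50], [300, 150]]},
    ],
}
print(labelme_to_yolo(example, {"caries": 0}))
# -> ['0 0.200000 0.200000 0.200000 0.200000']
```

PASCAL VOC and COCO keep absolute pixel coordinates instead (corner pairs and corner plus width/height respectively), so only the arithmetic in the loop would change.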

To evaluate the dataset, five object detection models (YOLOv5s, YOLOv8s, YOLOv11, SSD-MobileNet-v2, and Faster R-CNN) were trained. YOLOv8s performed best, with a mean Average Precision (mAP) of 0.841 at an Intersection over Union (IoU) threshold of 0.5. The work advances AI-based dental diagnostics and establishes a benchmark for future caries detection efforts. The study acknowledges limitations, such as the reliance on a single mobile device for image capture, and suggests that future research should cover primary dentition and use a broader range of imaging tools.
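The IoU criterion behind the reported score can be made concrete with a short sketch. It assumes boxes given as `(x_min, y_min, x_max, y_max)` in pixels; the example boxes are made up, and the evaluation code used by the authors may differ in detail.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


predicted = (0, 0, 10, 10)
ground_truth = (0, 0, 10, 20)
score = iou(predicted, ground_truth)
print(score, score >= 0.5)  # -> 0.5 True
```

At a threshold of 0.5, a predicted box counts as a correct detection only if it overlaps a ground-truth lesion box with an IoU of at least 0.5.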

Introduction

The introduction highlights the pressing global burden of oral diseases, which affect approximately 3.5 billion people, and of dental caries in particular, with a notable increase in prevalence among children and adolescents in low- and middle-income countries (LMICs). Traditional diagnostic methods, such as visual examination and X-ray imaging, are often inconsistent because they depend on clinician experience and on external factors like lighting and visibility. This inconsistency underscores the need for new diagnostic approaches, particularly Artificial Intelligence (AI) based on Deep Learning (DL) and Machine Learning (ML) techniques, which have shown promise in improving diagnostic accuracy.

A significant challenge in developing effective AI models for caries detection is the limited availability of high-quality, large-scale datasets. While radiographic datasets are plentiful, publicly available annotated intraoral images are scarce. Some existing studies have used large datasets, such as one with over 24,000 annotated images, but access to them is restricted, which limits broader research. Conversely, a publicly available dataset of 718 annotated images exists, but it is too small for robust model training. This research addresses these limitations by providing a comprehensive, openly accessible dataset of annotated intraoral images on Zenodo, thereby facilitating advances in AI-driven dental diagnostics.

Methods

The methods cover three stages: image acquisition, annotation, and model benchmarking. Trained dentists captured intraoral images with a mobile application that transferred them to Firebase cloud storage, following a protocol refined through a pilot study.

Experienced dentists annotated the images in LabelMe, and the annotations were validated and checked for inter-rater agreement. The curated dataset was then used to train and evaluate five object detection models with default data augmentation and standard parameters, so that their results could be compared fairly.

Results

Five object detection models (YOLOv5s, YOLOv8s, YOLOv11, SSD-MobileNet-v2-FPNLite-320, and Faster R-CNN) were trained and evaluated on the curated dataset of 6,313 images, divided into 5,050 for training, 631 for validation, and 631 for testing. To improve the models’ generalizability, default data augmentation techniques were applied, and all models were trained with standard parameters to allow a fair comparison.
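The split sizes quoted above are close to an 80/10/10 partition, which can be sketched as follows. The fractions, the fixed seed, and the use of integer image ids are assumptions for illustration. The three quoted counts add up to 6,312, one short of 6,313; this sketch simply gives the leftover image to the test split, which is not necessarily what the authors did.

```python
import random


def split_ids(ids, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle ids reproducibly and cut them into train, validation and test lists."""
    ids = list(ids)
    random.Random(seed).shuffle(ids)
    n_train = int(len(ids) * train_frac)
    n_val = int(len(ids) * val_frac)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]  # whatever is left over
    return train, val, test


train, val, test = split_ids(range(6313))
print(len(train), len(val), len(test))  # -> 5050 631 632
```

Fixing the seed keeps the partition identical across runs, which matters when several models are compared on the same held-out images.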

YOLOv8s outperformed all other models, achieving the highest precision, recall, and mean Average Precision (mAP), which makes it the strongest choice where both high precision and high recall are required. YOLOv5s is a viable option where shorter training times are essential. YOLOv11 was also robust, but it needed more training epochs to reach performance comparable to that of YOLOv8s.
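For readers less familiar with these metrics, the sketch below computes Average Precision for one class from a ranked list of detections; mAP is the mean of such values over classes. It uses all-point interpolation of the precision-recall curve, which is one common convention and may differ from the toolkits used in the study. The detections and the ground-truth count are made-up numbers.

```python
def average_precision(detections, n_ground_truth):
    """Average Precision for one class.

    detections: list of (confidence, is_true_positive) pairs, where a detection
    is a true positive if it matched a ground-truth box at the chosen IoU
    threshold (0.5 for mAP at 0.5).
    n_ground_truth: number of ground-truth boxes for this class.
    """
    if n_ground_truth == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    precisions, recalls = [], []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / n_ground_truth)
    # Interpolate: precision at each rank becomes the best precision
    # achieved at that rank or any later one.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


# Four detections for a class with three ground-truth lesions.
dets = [(0.9, True), (0.8, False), (0.7, True), (0.6, False)]
print(round(average_precision(dets, 3), 4))  # -> 0.5556
```

In this example the final precision is 2/4 and the final recall is 2/3, while AP summarizes the whole ranking rather than a single confidence cut-off.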

Discussion

Data collection and curation followed the ethical guidelines of the Ethical Review Committee (ERC) of The Aga Khan University Hospital (AKUH). Informed consent was obtained from all participants, along with assent from minors. The data comprise images of adolescents and young adults, selected under strict inclusion and exclusion criteria to maintain diagnostic quality and relevance. A mobile application supported data collection and used Firebase cloud storage for efficient image transfer, and dentists received comprehensive training to ensure consistent image capture.

A pilot study involving 101 participants was conducted to validate the data collection methodology, leading to refinements before full-scale data collection. The intraoral images were annotated by experienced dentists using the LabelMe tool, with a rigorous validation process ensuring the reliability of the dataset. Inter-rater reliability was assessed using Cohen’s Kappa coefficient, yielding a high agreement score of 0.89. The dataset, organized into structured folders for images and annotations in multiple formats (LabelMe, YOLO, PASCAL VOC, COCO), is available on the Zenodo repository, facilitating its use for training and benchmarking deep learning models in dental caries detection.
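As a reminder of what the agreement statistic measures, here is a minimal sketch of Cohen's kappa for two raters labelling the same items: observed agreement corrected for the agreement expected by chance. The labels and ratings are invented for illustration and are not the study's data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equally long lists of categorical labels."""
    n = len(rater_a)
    # Share of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected if each rater labelled independently
    # according to their own label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[l] * count_b[l] for l in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


rater_1 = ["caries"] * 5 + ["sound"] * 5
rater_2 = ["caries"] * 4 + ["sound"] * 6
print(round(cohens_kappa(rater_1, rater_2), 3))  # -> 0.8
```

Here the raters agree on 9 of 10 items, chance agreement is 0.5, and kappa is (0.9 - 0.5) / (1 - 0.5) = 0.8.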

Limitations

The dataset has several notable limitations. One significant shortcoming is the lack of annotations for dental fluorosis, which is prevalent in the study region. This omission stems from the study’s primary focus on dental caries and from the difficulty of differentiating fluorosis from non-cavitated carious lesions in 2D intraoral images. The visual similarity between these conditions can lead to mislabeling or misinterpretation, so future datasets should include distinct annotations for fluorosis, ideally supported by clinical examinations.

Additionally, all images were captured with a single type of mobile phone camera, so models trained on the dataset may not generalize to images from other devices. The exclusion of individuals under the age of 10 also limits the dataset’s applicability for detecting caries in children, particularly in primary dentition. Addressing these limitations in future research is essential for improving the accuracy and applicability of dental caries detection methods.