Utilization of machine learning models in predicting caries risk groups and oral health-related risk factors in adults

Journal: BMC Oral Health, Volume: 24, Issue: 1
DOI: https://doi.org/10.1186/s12903-024-04210-z
PMID: https://pubmed.ncbi.nlm.nih.gov/38589865
Publication Date: 2024-04-08
Author(s): Burak Tunahan Çiftçi et al.
Primary Topic: Dental Health and Care Utilization

Overview

This study aimed to analyze risk factors affecting oral health in adults and to evaluate how well various machine learning algorithms predict caries risk groups and related risk factors. A cohort of 2000 patients aged 18 and older was surveyed at Gaziantep University, where they completed a 30-item questionnaire assessing factors that influence the decayed, missing, and filled teeth (DMFT) index. Clinical and radiological examinations were conducted, and the data were divided into training (75%) and test (25%) groups. Six machine learning algorithms (naive Bayes, logistic regression, support vector machine, decision tree, random forest, and Multilayer Perceptron (MLP)) were applied to the preprocessed dataset. Pearson’s correlation test was also used to evaluate the relationship between DMFT scores and oral health risk factors.
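As a rough illustration of the DMFT index mentioned above, the sketch below sums decayed, missing, and filled teeth and maps the total to one of three risk groups. The cutoffs `LOW_MAX` and `MODERATE_MAX`, the `ToothCounts` type, and the function names are hypothetical; this summary does not state the thresholds the authors used.

```python
from dataclasses import dataclass

# Hypothetical cutoffs for illustration only; the summary does not give
# the thresholds used to form the low, moderate, and high risk groups.
LOW_MAX = 4
MODERATE_MAX = 13


@dataclass
class ToothCounts:
    decayed: int
    missing: int
    filled: int


def dmft_score(counts: ToothCounts) -> int:
    """DMFT is the sum of decayed, missing, and filled teeth."""
    return counts.decayed + counts.missing + counts.filled


def risk_group(score: int) -> str:
    """Map a DMFT score to a risk label using the assumed cutoffs above."""
    if score < 0:
        raise ValueError("DMFT score cannot be negative")
    if score <= LOW_MAX:
        return "low"
    if score <= MODERATE_MAX:
        return "moderate"
    return "high"


patient = ToothCounts(decayed=2, missing=1, filled=3)
score = dmft_score(patient)
print(score, risk_group(score))  # prints "6 moderate" under the assumed cutoffs
```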

The findings revealed significant associations between DMFT scores and several risk factors, including age, sex, body mass index, tooth brushing frequency, socioeconomic status, and health conditions such as hypertension and diabetes (p < 0.05). Among the machine learning models, the MLP algorithm exhibited the highest performance, achieving an accuracy of 95.8%, an F1-score of 96%, and precision and recall rates of 96%. The study concludes that a simple questionnaire can effectively identify individuals at risk of dental caries and highlight key risk factors, emphasizing the importance of preventive measures to mitigate oral health deterioration and its associated health issues. The promising results of machine learning applications in caries risk assessment suggest potential for further research in this domain.
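To make the reported metrics concrete, the following sketch computes accuracy, precision, recall, and F1-score for a three-class problem. It assumes macro averaging (an unweighted mean over classes); the summary does not say which averaging the authors used, and the labels and predictions below are made-up toy values, not study data.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")

    labels = sorted(set(y_true) | set(y_pred))
    true_count = Counter(y_true)   # how often each class truly occurs
    pred_count = Counter(y_pred)   # how often each class is predicted
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precisions, recalls, f1s = [], [], []
    for label in labels:
        precision = hits[label] / pred_count[label] if pred_count[label] else 0.0
        recall = hits[label] / true_count[label] if true_count[label] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(hits.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels for illustration only.
y_true = ["low", "low", "moderate", "moderate", "high", "high"]
y_pred = ["low", "moderate", "moderate", "moderate", "high", "high"]
for name, value in macro_metrics(y_true, y_pred).items():
    print(f"{name}: {value:.3f}")
```

With imbalanced classes, macro and weighted averages can differ noticeably, so the averaging choice matters when comparing figures such as the 96% F1-score.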

Methods

The study used a quantitative, questionnaire-based design. A total of 2000 patients aged 18 and older at Gaziantep University completed a 30-item questionnaire on factors that may influence the DMFT index, and clinical and radiological examinations were conducted.

Statistical analysis included Pearson’s correlation test to evaluate the relationship between DMFT scores and oral health risk factors. For the predictive analysis, the preprocessed dataset was divided into training (75%) and test (25%) groups, and six algorithms (naive Bayes, logistic regression, support vector machine, decision tree, random forest, and Multilayer Perceptron) were used to predict DMFT risk groups.
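The 75%/25% division into training and test groups described in the Overview could be sketched as follows. This version is stratified by risk group, which is an assumption: the summary does not say whether the authors stratified the split. The function name `stratified_split`, the seed, and the synthetic records are all illustrative.

```python
import random
from collections import Counter, defaultdict


def stratified_split(records, label_key, test_fraction=0.25, seed=0):
    """Split records into train/test lists, preserving class proportions."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed so the split is reproducible

    # Group records by class label.
    by_label = defaultdict(list)
    for record in records:
        by_label[record[label_key]].append(record)

    train, test = [], []
    for label in sorted(by_label):
        group = list(by_label[label])
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])

    # Shuffle again so classes are not stored in contiguous blocks.
    rng.shuffle(train)
    rng.shuffle(test)
    return train, test


# Synthetic stand-in data: 2000 records whose class counts follow the
# 27.3% / 42.5% / 30.2% risk-group distribution reported in the Results.
counts = {"low": 546, "moderate": 850, "high": 604}
records = [
    {"id": f"{label}-{i}", "risk_group": label}
    for label, n in counts.items()
    for i in range(n)
]

train, test = stratified_split(records, "risk_group")
print(len(train), len(test))
print(Counter(r["risk_group"] for r in test))
```

Stratifying keeps the share of low, moderate, and high risk patients roughly equal in both groups, so the test metrics are less sensitive to an unlucky draw.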

Results

Among the 2000 patients, 27.3% were classified as low-risk, 42.5% as moderate-risk, and 30.2% as high-risk according to DMFT score. The cohort comprised 42.9% males and 57.1% females, and 44.5% were aged between 18 and 30 years.

Risk group classification was significantly associated with age, sex, body mass index, oral hygiene practices, socioeconomic status, and health conditions (p < 0.05). Correlation analyses showed low negative correlations between age and sugary snack consumption ($r = -0.24$) and between tooth brushing frequency and marital status ($r = -0.25$), and a moderate positive correlation between age and marital status ($r = 0.46$).

Among the machine learning models used to predict DMFT risk groups, the Multilayer Perceptron (MLP) achieved the highest accuracy (95.8%) and Naive Bayes (NB) the lowest (29.8%). The MLP also reached 96% for F1-score, precision, and recall. By comparison, the random forest (RF) model achieved an F1-score of 87% and the Decision Tree (DT) model 82%.

Feature importance analysis identified education level as the most important predictor (0.061), followed closely by age (0.056), while difficulties in daily activities had minimal impact. These findings underscore the predictive capabilities of machine learning in assessing dental health risks and highlight the importance of educational attainment in such assessments.
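The correlation coefficients reported in the Results (for example $r = -0.24$ between age and sugary snack consumption) are Pearson coefficients. A minimal sketch of that calculation is shown below; the `pearson_r` helper and the two short lists are illustrative, and the values are made up rather than taken from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")

    return cov / math.sqrt(var_x * var_y)


# Made-up values: age in years and sugary snacks per week.
ages = [19, 24, 31, 38, 45, 52, 60]
snacks_per_week = [9, 7, 8, 5, 6, 4, 5]
print(round(pearson_r(ages, snacks_per_week), 2))
```

A negative value means the two variables tend to move in opposite directions, and values closer to zero indicate a weaker linear relationship.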

Discussion

The study involved 2000 patients aged 18 and older, focusing on the relationship between various factors and dental caries risk, as measured by the DMFT (decayed, missing, filled teeth) index. The research adhered to ethical guidelines and utilized a comprehensive questionnaire to gather data on demographics, health behaviors, and medical history. Key findings indicated that factors such as age, sex, body mass index, tooth brushing frequency, and chronic health conditions (e.g., hypertension and diabetes) significantly influenced caries risk. Notably, the study found that while sugary snack consumption and inadequate oral hygiene practices were associated with higher caries risk, smoking and alcohol consumption did not show a significant correlation, possibly due to their low prevalence in the sample.

The methodology employed machine learning algorithms, including Naive Bayes, logistic regression, support vector machines, decision trees, random forests, and multilayer perceptrons, to classify patients into risk groups based on their DMFT scores. The multilayer perceptron model achieved the highest accuracy (about 96%), compared with 86% for the random forest algorithm, highlighting the potential of machine learning in predicting dental caries risk. The study underscores the importance of addressing oral health-related risk factors and suggests that machine learning can be a valuable tool for improving oral health outcomes and guiding preventive measures in clinical practice.