Enhancing Parkinson disease detection through feature based deep learning with autoencoders and neural networks

Journal: Scientific Reports, Volume: 15, Issue: 1
DOI: https://doi.org/10.1038/s41598-025-88293-w
PMID: https://pubmed.ncbi.nlm.nih.gov/40075106
Publication Date: 2025-03-13
Author(s): P. Valarmathi et al.
Primary Topic: Parkinson's Disease Mechanisms and Treatments

Overview

This study presents a new approach to detecting Parkinson's disease (PD) from audio recordings, using a Feature-Based Deep Neural Network (FB-DNN). It stresses that accurate, timely diagnosis improves patient outcomes, and it introduces an autoencoder to extract features from the audio data. The FB-DNN model, trained on these extracted features, achieved an accuracy of 96.15%, outperforming a conventional Deep Neural Network (DNN) at 91.43% and a Neural Network (NN) at 88.18%. According to the authors, the method improves diagnostic precision and supports prompt identification, which could lead to more effective treatment options.
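For reference, accuracy figures like those above are ratios of correct predictions to total predictions. The short Python sketch below defines that ratio and computes the percentage-point margins between the three reported accuracies. The label lists are made-up illustrative values, not data from the study, and the names `accuracy`, `reported`, and `margins` are introduced here only for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels (1 = PD, 0 = control), invented for this example.
y_true = [1, 1, 0, 1, 0, 0, 1, 1]
y_pred = [1, 1, 0, 1, 0, 1, 1, 1]
example_accuracy = accuracy(y_true, y_pred)

# Accuracies reported in the summary, in percent.
reported = {"FB-DNN": 96.15, "DNN": 91.43, "NN": 88.18}

# Percentage-point margin of FB-DNN over each baseline model.
margins = {
    name: round(reported["FB-DNN"] - value, 2)
    for name, value in reported.items()
    if name != "FB-DNN"
}
```

On the reported figures, the FB-DNN margin works out to 4.72 percentage points over the DNN and 7.97 over the NN.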

The findings highlight the role of the Modified Band Pass Filter in preprocessing the audio signals, which improves the quality and precision of the analysis. The study suggests that techniques such as ensemble deep learning and transfer learning could further improve the model's performance and its generalizability across diverse populations. Future work includes testing the model on varied datasets, covering different demographics and clinical settings, to assess its reliability and feasibility in real-world use. Overall, the work lays a foundation for advancing medical diagnostics and points to potential improvements in early detection and care for people with Parkinson's disease.

Methods

The study detects Parkinson's disease (PD) by combining audio signal analysis with a Feature-Based Deep Neural Network (FB-DNN). In the preprocessing phase, a Modified Band Pass Filter improves the quality and specificity of the audio signals so that the features relevant to the analysis are preserved.
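The summary does not describe how the Modified Band Pass Filter is constructed, so the following is only a sketch of an ordinary band-pass step, not the authors' filter. It assumes NumPy, a synthetic one-second signal, and a hypothetical 80 to 1000 Hz pass band; the function name `bandpass_fft` and all parameter values are illustrative assumptions.

```python
import numpy as np


def bandpass_fft(signal, sample_rate, low_hz, high_hz):
    """Zero every frequency bin outside [low_hz, high_hz], then invert the FFT."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    mask = (freqs >= low_hz) & (freqs <= high_hz)
    return np.fft.irfft(spectrum * mask, n=len(signal))


# Synthetic stand-in for a recording: a 200 Hz tone plus slow drift and hiss.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 200 * t)
drift = 0.5 * np.sin(2 * np.pi * 5 * t)
hiss = 0.3 * np.sin(2 * np.pi * 3000 * t)
noisy = tone + drift + hiss

# Hypothetical pass band; the study's actual cutoffs are not given here.
filtered = bandpass_fft(noisy, sample_rate, low_hz=80.0, high_hz=1000.0)
```

In this toy case the drift and hiss fall outside the pass band, so `filtered` recovers the 200 Hz tone. A hard frequency mask is the simplest possible choice; a real pipeline would more likely use a designed filter with a smooth roll-off.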

To extract informative features from the filtered audio, the study uses an autoencoder, a type of artificial neural network that learns a compressed representation of its input. The network captures complex patterns in the audio signals and transforms the high-dimensional data into a more manageable representation while reducing noise. The features it extracts form the basis for the subsequent classification step, where an ensemble method distinguishes audio samples indicative of PD from those of individuals without the disease.
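To make the feature-extraction idea concrete, here is a minimal sketch of a one-hidden-layer autoencoder trained with plain gradient descent in NumPy. It is not the architecture used in the study, which this summary does not specify. The data are synthetic, and the layer sizes, learning rate, epoch count, and the names `train_autoencoder` and `encode` are all assumptions chosen for illustration.

```python
import numpy as np


def train_autoencoder(X, n_hidden, epochs=1000, lr=0.02, seed=0):
    """Train a tanh encoder and linear decoder with full-batch gradient descent.

    Returns the encoder parameters and the reconstruction loss at each epoch.
    """
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    W_enc = rng.normal(0.0, 0.1, size=(n_features, n_hidden))
    b_enc = np.zeros(n_hidden)
    W_dec = rng.normal(0.0, 0.1, size=(n_hidden, n_features))
    b_dec = np.zeros(n_features)
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W_enc + b_enc)      # encoder: compressed features
        X_hat = H @ W_dec + b_dec           # decoder: linear reconstruction
        err = X_hat - X
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the mean squared reconstruction error.
        grad_out = 2.0 * err / err.size
        grad_W_dec = H.T @ grad_out
        grad_b_dec = grad_out.sum(axis=0)
        grad_pre = (grad_out @ W_dec.T) * (1.0 - H ** 2)  # tanh derivative
        grad_W_enc = X.T @ grad_pre
        grad_b_enc = grad_pre.sum(axis=0)
        W_dec -= lr * grad_W_dec
        b_dec -= lr * grad_b_dec
        W_enc -= lr * grad_W_enc
        b_enc -= lr * grad_b_enc
    return (W_enc, b_enc), losses


def encode(X, encoder):
    """Map inputs to the low-dimensional features learned by the encoder."""
    W_enc, b_enc = encoder
    return np.tanh(X @ W_enc + b_enc)


# Synthetic stand-in for acoustic feature vectors: 20 dimensions, 3 latent factors.
rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 3))
X = latent @ rng.normal(size=(3, 20)) + 0.1 * rng.normal(size=(200, 20))
X = (X - X.mean(axis=0)) / X.std(axis=0)

encoder, losses = train_autoencoder(X, n_hidden=3)
features = encode(X, encoder)   # compact inputs for a downstream classifier
```

The `features` array plays the role of the extracted representation that a classifier would consume. In practice a deep learning framework with mini-batches and a validation split would replace this hand-written training loop.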

Discussion

In recent years, machine learning (ML) models have been applied increasingly to Parkinson's disease (PD) research, driven by interdisciplinary collaboration between biomedicine and computer science. PD affects more than 10 million people worldwide and poses significant diagnostic challenges, particularly for timely and accurate diagnosis: initial diagnostic accuracy is reported at only 53% within the first five years of symptom onset. To address these challenges, researchers have explored a range of classification methods, including linear discriminant analysis (LDA), K-nearest neighbors (KNN), decision trees (DT), and neural networks (NN), using feature vectors weighted by relevance. Notably, studies that draw on the Unified Parkinson's Disease Rating Scale (UPDRS) have reported higher diagnostic success rates than other state-of-the-art approaches.

Non-invasive telemonitoring for PD screening has emerged as a promising alternative that enables reliable, cost-effective assessment while reducing the burden on patients. ML techniques such as random forests and Extreme Gradient Boosting (XGBoost) have improved prediction accuracy for PD symptom severity. Acoustic analysis of the voice has also been identified as a non-intrusive method for early PD detection; it capitalizes on changes in vocal quality that precede motor symptoms. Despite these advances, existing models remain limited by small datasets and inadequate preprocessing of audio data, both of which can reduce diagnostic accuracy. Future research should use larger, more diverse datasets and better audio preprocessing to improve the robustness and generalizability of PD diagnostic models.

Limitations

The main limitation of the proposed FB-DNN model is that it does not account for variation in audio features across disease stages. Although the model achieves high accuracy, it offers no comparative analysis across early, middle, and advanced stages, which matters because voice impairment can vary significantly as the disease progresses. Future research aims to address this by using stage-annotated datasets to train stage-specific FB-DNNs, improving sensitivity to early symptoms that may be subtle and easily overlooked.

In addition, the model's applicability is restricted to specific datasets: it was trained primarily on audio recorded in controlled environments, which may not reflect real-world conditions. The model therefore needs to be tested on more diverse datasets that span multiple languages and demographics. Although the model is computationally efficient, deploying it in resource-limited settings remains a challenge; model pruning and optimization for edge devices are proposed to mitigate this. Finally, the 'black box' nature of deep learning models makes their predictions hard to interpret in clinical practice. Future versions will incorporate interpretability tools such as SHAP and LIME to improve clinicians' trust in, and understanding of, the model's predictions. Overall, addressing these limitations is essential for applying the FB-DNN framework in clinical settings.
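As an illustration of the pruning strategy mentioned above, the sketch below applies generic magnitude pruning to a single weight matrix in NumPy. The summary does not say which pruning method the authors have in mind, so this stands in for the general idea only. The function name `magnitude_prune`, the 50% sparsity level, and the random weights are assumptions.

```python
import numpy as np


def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction zeroed."""
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    flat = np.abs(weights).ravel()
    n_prune = int(sparsity * flat.size)
    if n_prune == 0:
        return weights.copy()
    # Indices of the n_prune smallest magnitudes (order among them is arbitrary).
    prune_idx = np.argpartition(flat, n_prune - 1)[:n_prune]
    keep = np.ones(flat.size, dtype=bool)
    keep[prune_idx] = False
    return weights * keep.reshape(weights.shape)


# Hypothetical dense layer weights, invented for this example.
rng = np.random.default_rng(0)
layer_weights = rng.normal(size=(4, 6))
pruned_weights = magnitude_prune(layer_weights, sparsity=0.5)
```

Half of the 24 weights are set to zero here. Zeroed weights reduce memory and compute only when the deployment runtime exploits sparsity, and a pruned network is normally fine-tuned afterwards to recover accuracy.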
