Driver drowsiness shield (DDSH): a real-time driver drowsiness detection system

Journal: ROBOMECH Journal, Volume: 12, Issue: 1
DOI: https://doi.org/10.1186/s40648-025-00307-4
Publication Date: 2025-05-15
Author(s): Archita Bhanja et al.
Primary Topic: Sleep and Work-Related Fatigue

Overview

The paper presents a real-time drowsiness detection system intended to improve traffic safety using deep learning. The system applies transfer learning on the MobileNet architecture to classify eye states as open or closed, trained on a curated subset of the MRL Eye Dataset. The model reports 90% accuracy, 100% precision, 83.3% recall, and an F1-score of 90.9%. The trained model is integrated into a real-time application that monitors drivers’ eye state through video streams and flags signs of drowsiness, offering a practical way to reduce fatigue-related accidents.

In the conclusion, the authors highlight the successful implementation of MobileNet for real-time eye state categorization and emphasize the importance of dataset quality in achieving high accuracy. Future work aims to enhance the system’s robustness by incorporating diverse data sources and optimizing algorithms with techniques such as attention mechanisms and temporal modeling. The authors also propose collaborations with automotive engineers and human factors researchers to improve hardware integration and user interface design. Additionally, they plan to implement edge computing solutions to minimize latency and address ethical considerations related to privacy and driver autonomy. The ultimate goal is to reduce accident risks and improve road safety while exploring broader applications in industries requiring vigilance. The code for the project is publicly accessible on GitHub.

Introduction

The introduction notes the broad impact of Artificial Intelligence (AI) across sectors, pointing to the high accuracy Artificial Neural Networks (ANNs) have achieved in critical applications such as healthcare and criminal justice. It identifies driver fatigue as a significant safety concern, made worse by long driving hours and insufficient rest, that traditional detection methods do not adequately address. In response, the study proposes an automated, real-time driver drowsiness detection system built on Deep Learning (DL) and Machine Learning (ML) techniques.

The proposed system aims to continuously monitor a driver’s eye state through webcam video streams, employing transfer learning based on the MobileNet architecture to classify eyes as open or closed. Upon detecting signs of drowsiness, the system will issue timely alerts to prevent potential accidents. The research emphasizes the importance of thorough preprocessing of eye images from the Media Research Lab (MRL) Eye Dataset and incorporates evaluation metrics such as accuracy, precision, recall, and F1-score to assess the model’s performance. Ultimately, the study seeks to enhance road safety by providing a flexible and effective solution for detecting driver fatigue, marking a significant advancement in the application of technology for public safety. The subsequent sections of the paper will detail the technologies used, datasets, model architecture, experimental results, and conclusions drawn from the research.

Methods

The authors use the MobileNet architecture to classify eye states as open or closed. The model applies the ReLU activation function throughout its layers and a softmax activation in the final layer to produce class probabilities. Training uses the mean squared error loss function and the Adam optimizer, with 10% of the training set held out as a validation split to monitor performance and mitigate overfitting. Generalization is then assessed on a test set composed of unseen images.
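The training setup described above can be illustrated with a small dependency-free sketch. It shows how a softmax output is scored with mean squared error against a one-hot label, and how a 10% validation split might be held out. The logits, the class order [closed, open], and the shuffled split are assumptions made for illustration; the summary does not say how the authors drew their split, and this is not their training code.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mse_loss(probs, one_hot):
    """Mean squared error between predicted probabilities and a one-hot label."""
    return sum((p - t) ** 2 for p, t in zip(probs, one_hot)) / len(probs)


def split_validation(samples, fraction=0.1, seed=0):
    """Shuffle the samples and hold out `fraction` of them for validation.

    Shuffling before the split is an assumption of this sketch.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_val = int(round(len(shuffled) * fraction))
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical logits for one image; the class order [closed, open] is assumed.
probs = softmax([0.2, 2.1])
loss = mse_loss(probs, [0.0, 1.0])  # true label: open
train_ids, val_ids = split_validation(range(100))
print(probs, loss, len(train_ids), len(val_ids))
```

Cross-entropy is the more common pairing with softmax, but the sketch keeps mean squared error because that is the loss the summary reports.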

The dataset utilized for training and evaluation is derived from the MRL Eyes 2018 dataset, which includes images categorized as “Closed Eyes” and “Open Eyes.” Preprocessing steps are critical for preparing the raw images: they are converted to grayscale, resized to 224×224 pixels, and normalized to a pixel value range of 0 to 1. This systematic approach facilitates effective feature extraction and enhances the model’s ability to distinguish between alert and drowsy states based on eye status. Detailed system specifications and model parameters are provided in Tables 1 and 2, respectively.
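The three preprocessing steps can be sketched without external libraries as follows. This is only an illustration of the order of operations: the luminance weights (ITU-R BT.601), nearest-neighbour resizing, and the synthetic 84×84 input are assumptions, and a real pipeline would more likely rely on an image library such as OpenCV.

```python
def to_grayscale(rgb_image):
    """Collapse an RGB image (rows of (r, g, b) tuples) to one luminance channel."""
    # ITU-R BT.601 luma weights; an assumed but common choice.
    return [[0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row]
            for row in rgb_image]


def resize_nearest(image, size=224):
    """Resize a 2-D image to size x size using nearest-neighbour sampling."""
    src_h, src_w = len(image), len(image[0])
    return [[image[y * src_h // size][x * src_w // size] for x in range(size)]
            for y in range(size)]


def normalize(image, max_value=255.0):
    """Scale pixel values from [0, max_value] to [0, 1]."""
    return [[p / max_value for p in row] for row in image]


def preprocess(rgb_image, size=224):
    """Grayscale, then resize, then normalize, in the order the text describes."""
    return normalize(resize_nearest(to_grayscale(rgb_image), size))


# Synthetic 84x84 RGB gradient standing in for an eye crop (not dataset data).
demo = [[(3 * x, 3 * y, 128) for x in range(84)] for y in range(84)]
out = preprocess(demo)
print(len(out), len(out[0]), min(map(min, out)), max(map(max, out)))
```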

Results

The results section evaluates the driver drowsiness detection model using a confusion matrix, which breaks predictions down into counts of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). Accuracy, precision, recall, and F1-score are all derived from these four counts, so the matrix underpins the rest of the performance analysis.
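As a worked illustration, the four metrics follow from the confusion-matrix counts as shown below. The counts in the example are hypothetical: they were chosen only because they reproduce the reported ratios (90% accuracy, 100% precision, 83.3% recall, 90.9% F1), and the summary does not give the paper's actual confusion matrix.

```python
def classification_metrics(tp, tn, fp, fn):
    """Derive accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts, NOT taken from the paper.
metrics = classification_metrics(tp=5, tn=4, fp=0, fn=1)
print({name: round(value, 3) for name, value in metrics.items()})
# {'accuracy': 0.9, 'precision': 1.0, 'recall': 0.833, 'f1': 0.909}
```

With no false positives, precision is 1.0 regardless of how many positives are missed, which is why the recall of 83.3% carries the information about missed detections here.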

MobileNet was chosen for its efficiency and its suitability for binary classification. The model was initialized with weights pre-trained on ImageNet, and input images were resized from 84×84 pixels to 224×224 pixels to match MobileNet’s expected input size while helping the model capture fine-grained detail. The Receiver Operating Characteristic (ROC) curve and the Area Under the Curve (AUC) indicate strong performance across decision thresholds, which points to good discrimination between open and closed eyes.
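For readers unfamiliar with ROC analysis, the sketch below shows how a ROC curve is traced by sweeping the decision threshold and how AUC is obtained with the trapezoidal rule. The labels and scores are made-up values, and treating label 1 as the positive class is an assumption; none of this is data from the paper.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs obtained by lowering the decision threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve through the given points (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Made-up ground truth (1 = positive class) and classifier scores.
labels = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.1]
print(auc(roc_points(labels, scores)))  # 0.9375
```

An AUC of 1.0 means every positive example is scored above every negative one; 0.5 corresponds to chance-level ranking.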

Discussion

The discussion section of the research paper emphasizes the critical role of drowsiness detection systems in enhancing road safety by preventing accidents caused by driver fatigue. A comprehensive literature survey reveals various methodologies, including sensor-based, image-based, and hybrid approaches, with a focus on non-intrusive techniques and machine learning (ML) algorithms. Despite advancements, challenges remain, particularly in achieving robust performance across diverse environmental conditions and individual differences, with many systems reporting accuracy rates below 90%. Notable contributions include the use of facial landmarks and metrics like the Eye Aspect Ratio (EAR) and Mouth Aspect Ratio for drowsiness detection, as well as the implementation of advanced models like Convolutional Neural Networks (CNNs) combined with Long Short-Term Memory (LSTM) networks.
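The Eye Aspect Ratio mentioned in the literature survey is commonly defined over six landmarks p1..p6 around the eye as EAR = (|p2 - p6| + |p3 - p5|) / (2 |p1 - p4|), where p1 and p4 are the eye corners. The sketch below computes it for two hand-made landmark sets; the coordinates are illustrative assumptions, and this landmark-based metric belongs to the surveyed prior work rather than to the MobileNet classifier proposed in the paper.

```python
import math


def eye_aspect_ratio(landmarks):
    """Compute EAR from six (x, y) points p1..p6 ordered around the eye.

    p1 and p4 are the horizontal corners; p2, p3 lie on the upper lid and
    p6, p5 on the lower lid directly below them.
    """
    p1, p2, p3, p4, p5, p6 = landmarks
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    return vertical / (2.0 * horizontal)


# Hand-made landmark sets (illustrative only).
open_eye = [(0, 0), (1, 1), (3, 1), (4, 0), (3, -1), (1, -1)]
nearly_closed = [(0, 0), (1, 0.1), (3, 0.1), (4, 0), (3, -0.1), (1, -0.1)]
print(eye_aspect_ratio(open_eye), eye_aspect_ratio(nearly_closed))
```

The ratio falls toward zero as the eyelids approach each other, which is why landmark-based systems compare it against a threshold.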

The proposed system applies transfer learning with MobileNet to real-time eye state classification, following a structured pipeline of data preparation, model training, and real-time detection using OpenCV. During training, accuracy improved and training loss fell, with validation accuracy reaching 1.0000. The lightweight architecture allows deployment in varied environments, which makes the system particularly relevant to long-haul transportation and ride-sharing services. Future work aims to improve robustness by integrating multi-modal data, tuning the algorithms to reduce false positives, and pursuing collaborations on hardware integration with existing Advanced Driver-Assistance Systems (ADAS). The project also aspires to broader applications in healthcare and workplace safety monitoring, with the ultimate aim of improving driver safety and reducing accident risk.
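One simple way to turn per-frame eye-state predictions into an alert is to require a run of consecutive closed-eye frames, so that ordinary blinks do not trigger it. The summary does not state the rule the authors use, so the class below, its name, and its threshold are hypothetical. In a real application the boolean inputs would come from the classifier applied to frames captured with OpenCV.

```python
class DrowsinessMonitor:
    """Raise an alert after a run of consecutive closed-eye frames.

    Hypothetical helper: the consecutive-frame rule and the default
    threshold are assumptions, not details taken from the paper.
    """

    def __init__(self, closed_frames_to_alert=15):
        self.closed_frames_to_alert = closed_frames_to_alert
        self.consecutive_closed = 0

    def update(self, eye_closed):
        """Feed one per-frame prediction; return True while the alert is active."""
        if eye_closed:
            self.consecutive_closed += 1
        else:
            self.consecutive_closed = 0  # any open-eye frame resets the run
        return self.consecutive_closed >= self.closed_frames_to_alert


# Simulated predictions: a short blink, then a longer eye closure.
frames = [False] * 5 + [True] * 3 + [False] + [True] * 6
monitor = DrowsinessMonitor(closed_frames_to_alert=5)
alerts = [monitor.update(closed) for closed in frames]
print(alerts.index(True))  # 13
```

The three-frame blink never reaches the threshold, while the longer closure triggers the alert on its fifth frame.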