Machine learning applications for predicting safety incidents in construction industry

Journal: Scientific Reports, Volume: 16, Issue: 1
DOI: https://doi.org/10.1038/s41598-025-34763-0
PMID: https://pubmed.ncbi.nlm.nih.gov/41519974
Publication Date: 2026-01-10
Author(s): Saleh Alsulamy et al.
Primary Topic: Occupational Health and Safety Research

Overview

The study investigates construction site accidents in Saudi Arabia and argues for proactive measures to reduce fatalities and severe injuries. Using a dataset of 203 incidents recorded from 2018 to 2024, it applies six machine learning (ML) algorithms, namely K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), Gradient Boosting (GB), and Extreme Gradient Boosting (XGB), to predict both the nature of incidents (NOI) and their severity (SOI). All models achieved over 60% accuracy, and XGB reached the highest accuracy, 89%, for SOI prediction. Notably, including NOI as an explanatory feature improved SOI predictions, which points to a link between incident type and severity.
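The effect of adding NOI as a feature for SOI prediction can be illustrated with a minimal sketch. The example below uses a one-nearest-neighbor classifier written in plain Python and a handful of invented records; the feature names (`workforce_scaled`, `noi_code`), the data values, and the choice of classifier are all assumptions made for illustration and are not taken from the study. The toy data is deliberately constructed so that severity follows the incident type, which is why the gain is so large here.

```python
import math

# Hypothetical toy records: (workforce_scaled, noi_code, severity_label).
# These values are invented for illustration and are not from the study.
TRAIN = [
    (0.10, 0, 0),
    (0.20, 1, 1),
    (0.50, 0, 0),
    (0.60, 1, 1),
    (0.90, 0, 0),
]
TEST = [
    (0.12, 1, 1),
    (0.58, 0, 0),
    (0.88, 0, 0),
    (0.22, 1, 1),
]


def features(record, use_noi):
    """Return the feature vector, optionally including the NOI code."""
    workforce, noi, _severity = record
    return (workforce, float(noi)) if use_noi else (workforce,)


def predict_1nn(train, query, use_noi):
    """Predict severity as the label of the closest training record."""
    q = features(query, use_noi)
    nearest = min(train, key=lambda r: math.dist(features(r, use_noi), q))
    return nearest[2]


def accuracy(train, test, use_noi):
    """Fraction of test records whose severity is predicted correctly."""
    hits = sum(predict_1nn(train, r, use_noi) == r[2] for r in test)
    return hits / len(test)


acc_without_noi = accuracy(TRAIN, TEST, use_noi=False)
acc_with_noi = accuracy(TRAIN, TEST, use_noi=True)
print(f"SOI accuracy without NOI: {acc_without_noi:.2f}")
print(f"SOI accuracy with NOI:    {acc_with_noi:.2f}")
```

On this contrived data the classifier gets half of the test records right without NOI and all of them with it. Real incident data would show a much smaller and noisier difference, and the comparison would normally be made with cross-validation rather than a single split.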

The study concludes that boosting methods, particularly XGB and GB, significantly outperformed non-boosting models, which suggests they are better at capturing the complexities and interactions within safety data. The most influential factors identified through SHAP analysis are NOI, precipitation, date of incident, and workforce size, with additional contributions from location, safety training, and personal protective equipment (PPE) compliance. Construction firms can use these findings to develop targeted emergency response strategies and to allocate on-site first-aid resources more effectively.
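SHAP attributions are based on Shapley values, which average a feature's marginal contribution over every order in which features can be added to a model. The sketch below computes exact Shapley values for a tiny, hypothetical risk score with three inputs named after factors mentioned above. The scoring function, its coefficients, and the all-zero baseline are assumptions for illustration only; the study's models are tree ensembles, for which SHAP tooling typically uses faster tree-specific algorithms rather than this brute-force enumeration.

```python
from itertools import permutations

FEATURES = ("noi", "precipitation", "workforce")


def risk_score(x):
    """Hypothetical risk model with one interaction term (not the study's model)."""
    return 0.5 * x["noi"] + 0.3 * x["precipitation"] + 0.2 * x["noi"] * x["workforce"]


def shapley_values(model, instance, baseline):
    """Exact Shapley values: mean marginal contribution over all feature orderings."""
    totals = {name: 0.0 for name in FEATURES}
    orderings = list(permutations(FEATURES))
    for order in orderings:
        current = dict(baseline)            # start from the reference input
        previous = model(current)
        for name in order:
            current[name] = instance[name]  # switch this feature to its actual value
            updated = model(current)
            totals[name] += updated - previous
            previous = updated
    return {name: total / len(orderings) for name, total in totals.items()}


instance = {"noi": 1.0, "precipitation": 1.0, "workforce": 1.0}
baseline = {"noi": 0.0, "precipitation": 0.0, "workforce": 0.0}
phi = shapley_values(risk_score, instance, baseline)
for name in FEATURES:
    print(f"{name}: {phi[name]:+.3f}")
```

The interaction term is split equally between `noi` and `workforce`, so `noi` receives 0.6, `precipitation` 0.3, and `workforce` 0.1. The attributions sum to the difference between the score for the instance and the score for the baseline, which is the property that makes SHAP values additive explanations.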

Introduction

The construction sector faces significant health and safety challenges, as shown by high rates of both fatal and non-fatal injuries among workers. In the United States, over 1,000 construction worker fatalities were reported in 2019, representing nearly 20% of all private sector fatalities. Similarly, the UK construction industry has a worker fatality rate four times the all-industry average. In developing nations like Iran, although construction employs less than 12% of the workforce, the severity of injuries remains alarmingly high, underscoring the need for thorough post-incident investigations to inform safety improvements.

Recent advancements in Artificial Intelligence (AI) and Machine Learning (ML) are transforming decision-making processes within the construction industry. These technologies enhance decision support systems (DSS) by automating processes, improving resource allocation, and identifying potential hazards, thereby increasing efficiency and reducing costs. For instance, studies have demonstrated the effectiveness of ML algorithms in predicting production delays and optimizing resource management, achieving high accuracy rates in various applications. Unlike traditional methods that rely on predefined statistical models, ML can analyze complex datasets to uncover hidden patterns and predict incident trends, thereby enhancing the reliability of safety assessments and enabling proactive risk mitigation strategies. This positions ML as a vital tool in improving safety practices within the construction sector.

Methods

In this section, the authors outline their research methodology, emphasizing its adaptability for broader applications. They detail the systematic approach taken to ensure the robustness and reliability of their findings, which includes a combination of quantitative and qualitative techniques. The methodology is designed to be flexible, allowing for modifications that cater to various contexts and research questions.

The authors also discuss the significance of their methodological framework in enhancing the generalizability of the results. By employing a diverse set of tools and strategies, they aim to address potential limitations and biases, thereby strengthening the validity of their conclusions. This adaptability not only facilitates the application of their findings across different fields but also encourages future research to build upon their work in a meaningful way.

Discussion

The discussion highlights the growing role of machine learning (ML) in construction safety, where predictive modeling is replacing traditional descriptive analysis. Recent studies have demonstrated the potential of ML to analyze large datasets and identify complex patterns among the factors that influence safety incidents. Previous research has focused on predicting fatality risk using models such as Logistic Regression, Decision Trees, and Random Forests, although limitations such as the exclusion of injury-level variables have been noted. Other studies have explored incident classification and the importance of feature extraction, emphasizing the operational relevance of site practices in risk reduction.

The current study builds on these findings by employing advanced ML techniques, including Gradient Boosting and Extreme Gradient Boosting, to predict the nature and severity of construction incidents. The methodology includes a comprehensive data collection process and addresses class imbalance with the Synthetic Minority Oversampling Technique (SMOTE). The results indicate that the XGB model outperformed the others in both prediction tasks, achieving accuracies of 83% for nature of incident (NOI) and 89% for severity of incident (SOI). The study also incorporates SHAP analysis to identify the most influential predictors, which improves model interpretability and supports data-driven safety management, particularly in the Saudi Arabian construction industry.
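The core idea of SMOTE is to create synthetic minority-class records by interpolating between an existing minority record and one of its nearest minority-class neighbors. The following is a minimal plain-Python sketch of that idea, not the implementation used in the study. The function name `smote`, the parameter `k`, the fixed seed, and the two-feature toy records are all assumptions for illustration; in practice a maintained library such as imbalanced-learn would normally be used, and oversampling would be applied to training data only.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the base point, excluding the point itself
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbor))
        )
    return synthetic


# Hypothetical two-feature records, e.g. (workforce_scaled, precipitation_scaled).
majority = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.2, 0.4), (0.4, 0.2), (0.3, 0.1)]
minority = [(0.8, 0.9), (0.9, 0.7), (0.7, 0.8)]

new_points = smote(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Because every synthetic point lies on a segment between two real minority records, the new records stay inside the region the minority class already occupies. That also means SMOTE cannot add information about incident types that are barely represented in the original data.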

Limitations

The limitations of this study stem mainly from the dataset, which comprises only 203 construction-safety incidents from two regions in Saudi Arabia over six years. The small sample size and restricted geographic scope limit the generalizability of the findings, particularly for less frequent incident types. Furthermore, inconsistencies in incident reporting and missing data for key explanatory variables (such as a worker's time with the employer and the specific types of personal protective equipment (PPE) used) may introduce bias and reduce model accuracy. The absence of a standardized incident recording format complicates comparison across variables, and reliance on historical administrative records means that critical factors like near-misses and crew-level conditions may be overlooked.
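One way to see why a small sample limits confidence in a reported accuracy is to compute a confidence interval for it. The sketch below uses the Wilson score interval for a binomial proportion. The test-set size is an assumption: it supposes a hypothetical 20% hold-out of the 203 incidents (about 41 records), since the actual evaluation split is not stated in this summary.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Assumption: a hypothetical 20% hold-out of the 203 incidents (about 41 records).
# The study's actual test-set size is not stated in this summary.
n_test = 41
n_correct = round(0.89 * n_test)  # 36 correct predictions, about 88% observed
low, high = wilson_interval(n_correct, n_test)
print(f"observed accuracy: {n_correct / n_test:.3f}")
print(f"95% interval: {low:.3f} to {high:.3f}")
```

Under this assumed test size the interval runs from roughly 0.74 to 0.95, about twenty percentage points wide, so differences of a few points between models would be hard to distinguish from sampling noise.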

Additionally, safety regulations and site practices evolved during the study period, which could affect data consistency. The models developed in this research have not been validated against data from other regions or organizations, which limits their applicability. To address these limitations, future research should use larger, more diverse datasets collected under standardized protocols and should validate the models on independent samples to improve the robustness and transferability of the findings.