Anomaly Detection for Automated Data Quality Monitoring in the CMS Detector

Journal: EPJ Research Infrastructures, Volume: 10, Issue: 1
DOI: https://doi.org/10.1007/s41781-025-00147-2
PMID: https://pubmed.ncbi.nlm.nih.gov/41726844
Publication Date: 2026-02-09
Author(s): Andrew Brinkerhoff et al.
Primary Topic: Particle physics theoretical and experimental studies

Overview

The paper presents the “AutoDQM” system, designed for Automated Data Quality Monitoring in large particle detectors, specifically the Compact Muon Solenoid (CMS) at the CERN Large Hadron Collider. The system combines statistical techniques with unsupervised machine learning to support rapid and thorough assessments of data quality.

The research evaluates anomaly detection algorithms based on the beta-binomial probability function and principal component analysis, using the complete dataset of proton-proton collisions recorded by CMS in 2022. The findings indicate that AutoDQM flags anomalies in “bad” data, which is significantly affected by detector malfunctions, at a rate 4 to 6 times higher than in “good” data. This demonstrates the system’s efficacy as a robust tool for general data quality monitoring in high-energy physics experiments.

Introduction

The Compact Muon Solenoid (CMS) experiment is a sophisticated particle detector at the CERN Large Hadron Collider (LHC), primarily aimed at analyzing high-energy proton-proton collisions. It played a crucial role in the joint discovery of the Higgs boson alongside the ATLAS experiment and is currently focused on exploring new physics phenomena, including dark matter and the matter-antimatter asymmetry of the universe. CMS employs a comprehensive “particle-flow” algorithm to identify and measure various particles, using a multi-layer silicon tracker and advanced calorimeters to capture energy deposits from charged and neutral particles. A two-tier trigger mechanism filters collision events for detailed analysis.

A significant challenge for CMS is maintaining data quality through continuous monitoring of the detector and reconstruction processes. This involves real-time processing of collision data to generate histograms that assess detector performance, with trained personnel intervening when anomalies are detected. The Data Quality Monitoring (DQM) system is essential for identifying and mitigating issues, thereby reducing the amount of “bad” data collected. In this context, the paper introduces AutoDQM, a web-based tool that leverages statistical methods and unsupervised machine learning for automated DQM. The tool employs anomaly detection techniques, including the beta-binomial probability function and principal component analysis (PCA). Its performance is evaluated using data from the 2022 run, and the results and future plans are discussed in subsequent sections.
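
For orientation, the beta-binomial probability mentioned above has the following textbook form. How AutoDQM parameterizes it is not stated in this summary, so the mapping to histogram bins given after the formula is an assumption made for illustration.

```latex
P(k \mid n, \alpha, \beta)
  = \binom{n}{k}\,
    \frac{B(k + \alpha,\; n - k + \beta)}{B(\alpha, \beta)},
\qquad
B(a, b) = \frac{\Gamma(a)\,\Gamma(b)}{\Gamma(a + b)}
```

One plausible reading, assuming a uniform prior on the bin fraction: k is the number of entries in a given bin of the run under test, n is the total number of entries in that histogram, and the reference histogram supplies \alpha = k_{\mathrm{ref}} + 1 and \beta = n_{\mathrm{ref}} - k_{\mathrm{ref}} + 1. Under that reading, the formula gives the probability of observing k entries in the bin given what the reference run saw, with the statistical uncertainty of the reference included.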

Results

The results indicate that both the beta-binomial and PCA algorithms effectively discriminate between good and bad runs when monitoring CMS data quality. Specifically, at working points where the mean number of histogram flags for good runs is low (fewer than 3), the histogram-flag (HF) ROC plots show that bad runs receive 3 to 4 times more flags. The run-flag (RF) ROC plots corroborate this finding: while fewer than 12% of good runs are flagged, 35 to 50% of bad runs exceed the threshold. AutoDQM, like other anomaly detection methods, cannot be expected to identify all bad runs, nor can it achieve a 0% flagging rate for good runs, as some good runs may contain genuine anomalies.
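
The two kinds of numbers quoted above (mean flags per run, and the fraction of runs above a flag threshold) can be computed as in the minimal sketch below. The function names and the per-run flag counts are hypothetical and serve only to show the arithmetic; they are not AutoDQM code or CMS data.

```python
def mean_flags(flag_counts):
    """Average number of flagged histograms per run (HF-style summary)."""
    return sum(flag_counts) / len(flag_counts)


def fraction_flagged(flag_counts, min_flags):
    """Fraction of runs with at least `min_flags` flagged histograms (RF-style summary)."""
    return sum(1 for n in flag_counts if n >= min_flags) / len(flag_counts)


def roc_points(good, bad, max_threshold):
    """(good fraction, bad fraction) pairs for thresholds 1..max_threshold."""
    return [
        (fraction_flagged(good, t), fraction_flagged(bad, t))
        for t in range(1, max_threshold + 1)
    ]


# Hypothetical number of flagged histograms in each run.
good_runs = [0, 1, 0, 2, 1, 0, 3, 1, 0, 2]
bad_runs = [4, 2, 6, 3, 1, 5, 0, 7]

hf_ratio = mean_flags(bad_runs) / mean_flags(good_runs)
print(f"bad/good mean-flag ratio: {hf_ratio:.1f}")
print(f"good runs with >= 3 flags: {fraction_flagged(good_runs, 3):.1%}")
print(f"bad runs with >= 3 flags: {fraction_flagged(bad_runs, 3):.1%}")
```

With these made-up counts, bad runs average 3.5 flags against 1.0 for good runs, and a threshold of 3 flags marks 62.5% of bad runs and 10% of good runs. Scanning the threshold with `roc_points` traces out one ROC-like curve.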

The performance of the beta-binomial χ² and Z’ max tests improves significantly as the number of reference runs increases, because the reconstructed particle occupancy distributions are sensitive to varying pileup conditions. The PCA algorithms inherently account for this pileup dependence because they are trained on a diverse set of runs. Although performance varies among the algorithms, no single method stands out as superior; the best results are obtained when all three quality tests are applied concurrently. In this scenario, the HF ROC plot indicates that bad runs have 4 to 6 times more flags than good runs, with over 55% of bad runs having at least 3 flags, compared to only 13% of good runs. Flags are counted individually, so overlapping flags from multiple tests each contribute to the total anomaly count.
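
The point about PCA absorbing the pileup dependence can be illustrated with a toy example. The sketch below assumes a reconstruction-error style of PCA anomaly score, which is a common choice but is not confirmed by this summary as the paper's exact method. The histogram shape, the `pileup` parameter, and all function names are hypothetical, and the toy data carry no statistical noise.

```python
import numpy as np


def fit_pca(train, n_components):
    """Return the mean histogram and the leading principal axes of the training set."""
    mean = train.mean(axis=0)
    # Rows of vt are orthonormal principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, vt[:n_components]


def reconstruction_error(hist, mean, components):
    """Sum of squared residuals after projecting onto the PCA subspace."""
    centered = hist - mean
    reconstructed = components.T @ (components @ centered)
    return float(np.sum((centered - reconstructed) ** 2))


BIN_CENTERS = np.linspace(0.0, 1.0, 20)


def toy_occupancy(pileup, dead_bins=None):
    """Normalized toy histogram whose slope depends on a made-up pileup value."""
    hist = 1.0 + pileup * BIN_CENTERS
    if dead_bins is not None:
        hist[dead_bins] = 0.0  # emulate a dead detector region
    return hist / hist.sum()


# Train on runs that span a range of pileup conditions.
train = np.array([toy_occupancy(p) for p in np.linspace(0.2, 1.0, 9)])
mean_hist, axes = fit_pca(train, n_components=1)

# A healthy run at an unseen pileup value versus a run with dead bins.
error_good = reconstruction_error(toy_occupancy(0.55), mean_hist, axes)
error_bad = reconstruction_error(
    toy_occupancy(0.55, dead_bins=slice(5, 10)), mean_hist, axes
)
print(f"good run error: {error_good:.2e}")
print(f"bad run error:  {error_bad:.2e}")
```

Because the training runs cover different pileup values, the leading principal axis captures the pileup-driven shape change, so a healthy run at a new pileup value reconstructs almost perfectly while the run with a dead region does not. A single fixed reference histogram would have no such tolerance, which is consistent with the need for more reference runs noted above.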

Discussion

The AutoDQM tool enhances the traditional Data Quality Monitoring (DQM) processes in the CMS experiment by automating the detection of anomalies in histograms generated from subdetector systems. Traditional methods rely heavily on visual inspection of numerous histograms, which is both tedious and prone to error. AutoDQM employs statistical tests, including the beta-binomial function and Principal Component Analysis (PCA), to systematically identify deviations from expected data distributions. By utilizing a combination of statistical metrics, such as the likelihood ratio and pull values, AutoDQM effectively highlights anomalous regions in histograms, enabling shifters to quickly pinpoint potential issues in detector performance.
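
As a rough illustration of how a likelihood ratio and a pull value can be derived from a beta-binomial comparison of one histogram bin, consider the sketch below. It assumes a uniform prior on the bin fraction and converts the ratio to a pull with the common sqrt(-2 ln ratio) rule. Both choices, and every function name, are assumptions made for this example and are not taken from the AutoDQM source.

```python
import math


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def betabinom_logpmf(k, n, a, b):
    """Log probability of k successes in n trials under a beta-binomial(a, b)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)


def pull(k_data, n_data, k_ref, n_ref):
    """Signed pull of a data bin count relative to a reference bin count.

    Assumption: the reference defines a Beta(k_ref + 1, n_ref - k_ref + 1)
    posterior for the bin fraction, and the pull is sqrt(-2 ln(P / P_max)).
    """
    a, b = k_ref + 1, n_ref - k_ref + 1
    logps = [betabinom_logpmf(k, n_data, a, b) for k in range(n_data + 1)]
    mode = max(range(n_data + 1), key=logps.__getitem__)
    log_ratio = logps[k_data] - logps[mode]  # log likelihood ratio, <= 0
    z = math.sqrt(max(0.0, -2.0 * log_ratio))
    return z if k_data >= mode else -z


# Hypothetical bin: the reference run has 100 of 1000 entries in this bin.
print(f"pull for 100/1000: {pull(100, 1000, 100, 1000):+.2f}")
print(f"pull for 160/1000: {pull(160, 1000, 100, 1000):+.2f}")
print(f"pull for  40/1000: {pull(40, 1000, 100, 1000):+.2f}")
```

A bin that matches the reference gets a pull close to zero, while a strong excess or deficit gets a large positive or negative pull. Plotting the pull for every bin is one way to highlight the anomalous regions of a histogram for a shifter.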

The tool’s performance evaluation demonstrates its efficacy: it identifies over 50% of “bad” data while keeping false positives in “good” data below 15%. This is achieved through a robust assessment strategy that compares current data against multiple reference runs, ensuring that systematic variations are accounted for. Furthermore, AutoDQM’s application extends to monitoring the muon detector, where it has proven capable of detecting significant performance changes, thereby facilitating timely expert intervention. Ongoing development of AutoDQM aims to incorporate additional subdetector systems, enhancing the overall capability to manage the increasing complexity of data collected in particle physics experiments.