Deep learning strategies with CReToNeXt-YOLOv5 for advanced pig face emotion detection

Journal: Scientific Reports, Volume: 14, Issue: 1
DOI: https://doi.org/10.1038/s41598-024-51755-8
PMID: https://pubmed.ncbi.nlm.nih.gov/38242984
Publication Date: 2024-01-19
Author(s): Lili Nie et al.
Primary Topic: Animal Behavior and Welfare Studies

Overview

This research highlights the role of facial expressions in pigs as a sophisticated means of communication that reflects their emotions and well-being. Because pigs’ limited facial musculature makes these expressions hard to recognize, the authors developed a facial expression recognition model named CReToNeXt-YOLOv5. The model incorporates several enhancements, including a switch from the CIOU to the EIOU loss function for improved training dynamics and the integration of a Coordinate Attention mechanism to increase sensitivity to subtle expression features. It achieved a mean average precision (mAP) of 89.4%, an improvement of 6.7% over the original YOLOv5, demonstrating its potential to advance animal welfare monitoring and livestock management practices.

The study emphasizes the importance of accurately detecting and classifying pig facial expressions to understand their emotional and psychological states. The CReToNeXt-YOLOv5 model’s superior performance, surpassing established models like Faster R-CNN and YOLOv4 by 64.14% and 61.73%, respectively, underscores its efficacy. Despite its achievements, the model faces challenges related to diverse environments and recognition rates for the Neutral category. Future work will focus on refining the model and expanding the dataset to enhance robustness and applicability in real-world settings, ultimately contributing to improved animal welfare standards.

Methods

The “Methods” section outlines the experimental framework employed in the study. It details the collection and analysis of data, specifying the protocols followed to ensure reliability and validity. The researchers utilized a systematic approach to gather experimental data, which included controlled conditions and standardized measurements to minimize variability.

Additionally, the section describes the statistical techniques applied to analyze the data, ensuring that the findings are robust and statistically significant. The methods employed are crucial for replicating the study and validating the results, thereby contributing to the overall rigor of the research.

Results

In this section, the authors present the results of their experiments using the CReToNeXt-YOLOv5 model for recognizing pig facial expressions. The model achieved a mean average precision (mAP) of 89.4%, an improvement of 6.7% over the original YOLOv5 model. Gains were observed across emotional categories, including a 9% increase in recognition of fear and an 11.8% increase in recognition of neutral expressions. The study emphasizes the effectiveness of the Coordinate Attention mechanism, which outperformed the Convolutional Block Attention Module (CBAM) in enhancing model performance, particularly for subtle neutral expressions. This mechanism lets the model focus on critical regions of the feature map, improving its ability to distinguish between similar expressions.
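
As a rough illustration of how coordinate attention reweights a feature map, the pure-Python sketch below follows the commonly published formulation: pool along each spatial direction, mix channels with a shared 1x1 convolution, then produce one gate per row and one per column. It is not the authors' implementation. The function name, the weight arguments, and the random toy weights are assumptions; batch normalization and biases are omitted, and ReLU stands in for the usual nonlinearity.

```python
import math
import random


def coordinate_attention(x, w_reduce, w_h, w_w):
    """Apply a simplified coordinate attention to x, a [C][H][W] nested list.

    w_reduce: [M][C] weights of the shared 1x1 conv (C -> M channels).
    w_h, w_w: [C][M] weights of the 1x1 convs that produce the two gate maps.
    Assumption: no batch norm or bias, ReLU instead of h-swish.
    """
    C, H, W = len(x), len(x[0]), len(x[0][0])
    M = len(w_reduce)

    # Directional pooling: average over width (one value per row)
    # and over height (one value per column).
    pool_h = [[sum(x[c][i]) / W for i in range(H)] for c in range(C)]
    pool_w = [[sum(x[c][i][j] for i in range(H)) / H for j in range(W)]
              for c in range(C)]

    # Concatenate along the spatial axis, then mix channels (shared 1x1 conv).
    y = [pool_h[c] + pool_w[c] for c in range(C)]  # shape [C][H + W]
    z = [[max(0.0, sum(w_reduce[m][c] * y[c][p] for c in range(C)))
          for p in range(H + W)] for m in range(M)]

    def sigmoid(v):
        return 1.0 / (1.0 + math.exp(-v))

    # Split back into row and column parts and map each to gates in (0, 1).
    a_h = [[sigmoid(sum(w_h[c][m] * z[m][i] for m in range(M)))
            for i in range(H)] for c in range(C)]
    a_w = [[sigmoid(sum(w_w[c][m] * z[m][H + j] for m in range(M)))
            for j in range(W)] for c in range(C)]

    # Reweight every position by its row gate and its column gate.
    return [[[x[c][i][j] * a_h[c][i] * a_w[c][j] for j in range(W)]
             for i in range(H)] for c in range(C)]


# Toy usage with random (untrained) weights, for shape checking only.
random.seed(0)
C, H, W, M = 4, 3, 5, 2
feat = [[[random.uniform(-1, 1) for _ in range(W)] for _ in range(H)]
        for _ in range(C)]
w_reduce = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(M)]
w_h = [[random.uniform(-1, 1) for _ in range(M)] for _ in range(C)]
w_w = [[random.uniform(-1, 1) for _ in range(M)] for _ in range(C)]
out = coordinate_attention(feat, w_reduce, w_h, w_w)
```

Because the gates keep separate row and column indices, the output preserves positional information that a single globally pooled channel weight would discard.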

Additionally, integrating the Efficient Intersection over Union (EIOU) loss function refined the model’s bounding box regression, addressing limitations of the CIOU loss it replaced. EIOU penalizes differences in box width and height directly, in addition to overlap and center distance, which stabilizes regression for predictions that nearly match the ground truth; this matters when localizing subtle facial expressions. The CReToNeXt module further advanced the model by enabling dual-path feature extraction, which captures both the localized and the contextual details needed to recognize subtle variations in pig facial expressions. Overall, the findings underscore the model’s robustness and its potential for real-time recognition of pig facial expressions, highlighting the importance of iterative experimentation and data-driven decisions in optimizing detection capabilities.
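
To make the loss concrete, here is a minimal sketch of the EIOU loss as it is commonly formulated for a single pair of axis-aligned boxes: one minus IoU, plus a normalized center-distance term, plus separate width and height terms normalized by the smallest enclosing box. The function name, the corner-based box format, and the `eps` guard are assumptions made for illustration; the summary does not show the authors' code.

```python
def eiou_loss(pred, target, eps=1e-9):
    """EIOU loss for two boxes given as (x1, y1, x2, y2) corners."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Widths and heights of the predicted and target boxes.
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between box centers, normalized by the squared
    # diagonal of the enclosing box.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Separate penalties on width and height differences.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term


perfect = eiou_loss((0, 0, 2, 2), (0, 0, 2, 2))   # boxes coincide
disjoint = eiou_loss((0, 0, 1, 1), (2, 2, 3, 3))  # no overlap at all
```

Unlike plain IoU, the loss still varies with distance and shape when the boxes do not overlap, so it keeps providing a training signal in that case.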

Discussion

The research presented a comprehensive methodology for collecting and analyzing pig facial expressions, utilizing a dataset derived from 20 Landrace pigs in Shanxi Province. The data collection involved a well-planned camera setup with multiple angles to capture spontaneous expressions during various activities, resulting in a diverse dataset of 2700 categorized images after rigorous validation. Expressions were labeled based on established behavioral indicators, categorizing them into “happy,” “angry,” “fear,” and “neutral,” with specific facial features associated with each emotional state. The dataset was subsequently expanded to 5400 images through preprocessing and augmentation techniques, enhancing the model’s training efficacy.
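
The summary does not say which preprocessing or augmentation steps took the dataset from 2700 to 5400 images. Purely as an illustration of how one augmented copy per image doubles a labeled detection set, the sketch below assumes a horizontal flip and YOLO-format labels (class id, then normalized center x, center y, width, height). Both the choice of flip and all function names are hypothetical.

```python
def hflip_yolo_labels(labels):
    """Mirror YOLO-format labels: only the normalized x-center changes."""
    return [(cls, 1.0 - xc, yc, w, h) for cls, xc, yc, w, h in labels]


def hflip_image(rows):
    """Mirror an image stored as a list of pixel rows."""
    return [row[::-1] for row in rows]


def augment_once(dataset):
    """Return the originals plus one flipped copy of each sample."""
    flipped = [(hflip_image(img), hflip_yolo_labels(lbl)) for img, lbl in dataset]
    return dataset + flipped


# Toy dataset: two tiny "images" with one labeled box each.
toy = [
    ([[1, 2, 3], [4, 5, 6]], [(0, 0.25, 0.5, 0.2, 0.4)]),
    ([[7, 8, 9], [0, 1, 2]], [(3, 0.5, 0.5, 0.5, 0.5)]),
]
augmented = augment_once(toy)
```

The key point is that the labels must be transformed together with the pixels; otherwise the boxes in the augmented copies no longer cover the faces.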

To recognize these expressions, the study employed the CReToNeXt-YOLOv5 model, which integrates the EIOU loss function and the Coordinate Attention mechanism to improve detection accuracy. The model achieved a mean average precision (mAP) of 0.894 and a recall of 0.990, indicating high detection capability. However, challenges remained in distinguishing between similar emotional states, particularly “fear” and “anger,” as well as in accurately identifying the subtle “neutral” expression. The findings underscore the model’s robustness while highlighting areas for future refinement, particularly differentiating nuanced expressions and improving noise filtering.
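
For readers unfamiliar with the metrics, the following sketch shows one common way to compute per-class average precision (area under the precision-recall curve, with all-point interpolation) and mAP as the mean over classes. The function names and toy detections are assumptions, and the IoU threshold used to decide whether a detection counts as a true positive is not stated in this summary, so it is left outside the code.

```python
def average_precision(scored_hits, num_gt):
    """AP for one class.

    scored_hits: (confidence, is_true_positive) pairs, one per detection.
    num_gt: number of ground-truth boxes of this class.
    """
    if num_gt == 0:
        return 0.0
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)

    tp = fp = 0
    recalls, precisions = [], []
    for _, hit in ranked:
        if hit:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_gt)
        precisions.append(tp / (tp + fp))

    # Precision envelope: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum rectangle areas under the precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class):
    """per_class maps class name -> (scored_hits, num_gt)."""
    aps = {name: average_precision(h, n) for name, (h, n) in per_class.items()}
    return sum(aps.values()) / len(aps), aps


# Toy detections for two of the four expression classes.
toy = {
    "happy": ([(0.9, True), (0.8, True)], 2),
    "neutral": ([(0.9, True), (0.8, False), (0.7, True)], 2),
}
map_value, per_class_ap = mean_average_precision(toy)
```

Recall is simply the final value of `tp / num_gt`, so a recall near 1 with a lower mAP, as reported here, indicates that most faces are found but some detections are misranked or assigned the wrong class.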

Limitations

The authors identify several limitations of the CReToNeXt-YOLOv5 model for recognizing and classifying pig facial expressions. The dataset primarily reflects pig behavior in specific contexts; in particular, the “Happy” category was captured mostly during feeding times, which may limit the model’s ability to generalize to happy expressions in other scenarios. This focus provides a benchmark for analysis but may not cover the full range of pig expressions outside feeding contexts.

Moreover, the variability within the “Neutral” category, which includes diverse environments, poses additional challenges to the model’s predictive accuracy. Practical issues during data collection, such as swine movement, camera angles, and lighting conditions, could further compromise data quality despite efforts to use a multi-camera and multi-angle approach. Lastly, the dataset may not adequately represent the diversity of swine breeds, age groups, or health conditions, introducing potential bias. Future research should aim to address these limitations to enhance the robustness of swine facial expression recognition tools.