Integrating multi-modal data fusion approaches for analysis of dairy cattle vocalizations

Journal: Frontiers in Veterinary Science, Volume: 12
DOI: https://doi.org/10.3389/fvets.2025.1704031
PMID: https://pubmed.ncbi.nlm.nih.gov/41334225
Publication Date: 2025-11-17
Author(s): Bubacarr Jobarteh et al.
Primary Topic: Animal Behavior and Welfare Studies

Overview

The study presents a multi-modal AI framework for non-invasive analysis of dairy cattle vocalizations, aimed at continuous stress assessment and timely health interventions in precision livestock systems. It combines standard acoustic features (such as frequency, duration, and amplitude) with transformer-based representations of call structure to classify vocalizations into high-frequency calls (HFCs), indicative of arousal, and low-frequency calls (LFCs), associated with calmer states. Support vector machine and random forest classifiers distinguish the two call types effectively, and fused acoustic and symbolic features outperform single-modality inputs. The key predictors are frequency, loudness, and duration, which is consistent with established vocal markers of arousal.
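The fusion of acoustic and symbolic features described above can be sketched in a few lines. This is only an illustration under stated assumptions: the summary does not say how the two modalities are combined, so early fusion by concatenation is assumed here, and the function names (`zscore_columns`, `fuse`) and all numeric values are hypothetical.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each column to zero mean and unit variance (constant columns map to 0)."""
    cols = list(zip(*rows))
    stats = [(mean(c), pstdev(c)) for c in cols]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, (m, s) in zip(row, stats)]
        for row in rows
    ]


def fuse(acoustic_rows, symbolic_rows):
    """Assumed early fusion: standardize each modality separately, then concatenate per call."""
    if len(acoustic_rows) != len(symbolic_rows):
        raise ValueError("both modalities must describe the same calls")
    a = zscore_columns(acoustic_rows)
    s = zscore_columns(symbolic_rows)
    return [ra + rs for ra, rs in zip(a, s)]


# Hypothetical values per call: [mean frequency (Hz), duration (s), RMS amplitude]
acoustic = [[620.0, 1.8, 0.42], [110.0, 1.1, 0.15], [580.0, 2.0, 0.39]]
# Hypothetical motif counts per call: ["rr", "mm", "oo"]
symbolic = [[3, 0, 0], [0, 2, 1], [2, 0, 1]]

fused = fuse(acoustic, symbolic)  # one 6-dimensional vector per call
```

Standardizing each modality before concatenation keeps large-valued features such as frequency in Hz from dominating small-valued ones such as motif counts. The fused vectors could then be passed to any classifier, including the SVM and random forest models named in the summary.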

The findings underscore the framework’s potential for real-time welfare monitoring on farms: passive audio analysis can flag rising stress signatures and prompt targeted interventions. Integrated with existing sensor networks, such alerts could serve as an early-warning mechanism for conditions linked to vocal changes, such as pain or respiratory problems. Future work aims to improve the framework’s generalizability by correlating vocal alerts with physiological measures and by expanding the datasets. Overall, the study illustrates how machine learning and multi-modal information fusion can improve animal welfare and farm management efficiency.
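One way to picture the early-warning idea is a sliding-window monitor over classified calls. The sketch below is not the authors' system: the class name `HfcRateMonitor`, the window size, and the threshold are all assumptions chosen for illustration.

```python
from collections import deque


class HfcRateMonitor:
    """Flags when the share of HFC labels in a sliding window reaches a threshold."""

    def __init__(self, window=10, threshold=0.6):
        # window and threshold are illustrative defaults, not values from the study
        self.labels = deque(maxlen=window)
        self.threshold = threshold

    def update(self, label):
        """Add one classified call ("HFC" or "LFC"); return True if an alert should fire."""
        self.labels.append(label)
        if len(self.labels) < self.labels.maxlen:
            return False  # not enough calls yet to judge a trend
        hfc_share = sum(1 for item in self.labels if item == "HFC") / len(self.labels)
        return hfc_share >= self.threshold


monitor = HfcRateMonitor(window=5, threshold=0.6)
stream = ["LFC", "LFC", "HFC", "LFC", "HFC", "HFC", "HFC"]  # hypothetical classifier output
alerts = [monitor.update(label) for label in stream]
```

With this stream, no alert fires until the window is full and at least three of the last five calls are HFCs. In practice the threshold would need to be calibrated against physiological or behavioral ground truth, as the summary's future-work note suggests.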

Introduction

The introduction emphasizes the role of vocal signals in expressing social and emotional states across the animal kingdom, and in dairy cows in particular. Variations in acoustic properties (frequency, amplitude, duration, and vocalization rate) are systematically linked to emotional arousal: high-frequency calls (HFCs) are associated with distress and agitation, while low-frequency calls (LFCs) are associated with relaxation and social bonding. Because environmental factors such as housing conditions and ambient noise can also influence vocal behavior, these vocalizations must be interpreted in context.

Building on existing knowledge, the study uses the Whisper model to analyze dairy cow vocalizations produced in negative emotional contexts. By combining acoustic feature extraction with symbolic motif detection, it aims to improve the classification of emotional states using Random Forest, Support Vector Machine (SVM), and Recurrent Neural Network (RNN) models. The authors hypothesize that combining standard acoustic features with Whisper-derived motifs will improve discrimination between HFCs and LFCs, and they state predictions about model performance and feature importance. The approach is intended to advance automated welfare assessment in precision livestock systems.

Methods

The “Methods” section describes the animals, the recording protocol, and the analysis pipeline. Vocalizations were recorded from 20 multiparous lactating Romanian Holstein cows, housed indoors under a zero-grazing system, during a standardized isolation protocol. The recordings were segmented into high-frequency calls (HFCs) and low-frequency calls (LFCs), and 23 acoustic features relevant to emotional expression were extracted.

Classification used Random Forest, Support Vector Machine (SVM), and Recurrent Neural Network (RNN) models, with Whisper-derived symbolic motifs combined with the standard acoustic features. The section also specifies the instruments, software, and statistical techniques used, to support reproducibility.

Results

The “Results” section reports classifier performance and feature importance. The SVM model achieved the highest accuracy (98.35%) in distinguishing between emotional states, and fused acoustic and symbolic features outperformed single-modality inputs. Frequency, amplitude, and duration were the most predictive features. The reported effects were statistically significant (p < 0.05).

The results are illustrated with figures and tables that show the observed trends. Together, the findings add to the existing body of knowledge and point to avenues for future research and practical application.

Discussion

The authors investigated the vocalizations of multiparous lactating Romanian Holstein cows under controlled conditions to infer emotional states from acoustic features. Data were collected from 20 cows housed indoors under a zero-grazing system, and vocalizations were recorded during a standardized isolation protocol. The analysis segmented the audio recordings into high-frequency calls (HFCs) and low-frequency calls (LFCs), extracted 23 acoustic features relevant to emotional expression, and classified the vocalizations with Random Forest, Support Vector Machine (SVM), and Recurrent Neural Network (RNN) models. The SVM model achieved the highest accuracy (98.35%) in distinguishing between emotional states, with frequency, amplitude, and duration identified as the most predictive features.
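To make the feature-extraction step concrete, the sketch below computes three simple acoustic descriptors from a waveform. It is a minimal illustration, not the study's 23-feature set: the function name `basic_features` is hypothetical, the zero-crossing frequency estimate is a stand-in for whatever frequency measures the authors used, and the input is a synthetic tone rather than a recorded call.

```python
import math


def basic_features(samples, sample_rate):
    """Return duration (s), RMS amplitude, and a zero-crossing frequency estimate (Hz)."""
    n = len(samples)
    duration = n / sample_rate
    rms = math.sqrt(sum(x * x for x in samples) / n)
    # Count sign changes between consecutive samples;
    # a pure tone crosses zero twice per cycle.
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    zc_freq = crossings / (2 * duration)
    return {"duration_s": duration, "rms": rms, "zc_freq_hz": zc_freq}


# Synthetic stand-in for one segmented call: a 0.5 s, 200 Hz tone sampled at 8 kHz.
sr = 8000
tone = [0.5 * math.sin(2 * math.pi * 200 * t / sr) for t in range(sr // 2)]
feats = basic_features(tone, sr)
```

For the synthetic tone, the frequency estimate lands close to 200 Hz and the RMS close to 0.5 divided by the square root of 2. Real cattle calls are harmonic and noisy, so a production pipeline would more likely rely on spectral methods than on zero crossings.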

The findings highlight distinct acoustic signatures for HFCs and LFCs, suggesting that cows produce structured vocal patterns that may reflect emotional states, particularly during distress. Analysis of the symbolic sequences revealed recurring motifs: “rr” was linked to high arousal, while “mm” and “oo” indicated calmer states. The authors propose that these motifs can improve machine learning classifiers for emotion detection in cattle. Future research aims to integrate physiological data to strengthen the emotional classification framework and to expand the dataset so that results generalize across breeds and environments. The ultimate goal is to develop intelligent monitoring systems that provide real-time assessments of animal welfare and so change livestock management practice.
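The motif analysis can be illustrated with a short counting routine. The sketch assumes each call has already been turned into a symbol string (for example, from a Whisper transcript). The function names, the example string, and the simple majority rule in `arousal_hint` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def motif_counts(symbols, motifs=("rr", "mm", "oo")):
    """Count (possibly overlapping) occurrences of each motif in a symbol string."""
    counts = Counter()
    for motif in motifs:
        k = len(motif)
        counts[motif] = sum(
            1 for i in range(len(symbols) - k + 1) if symbols[i:i + k] == motif
        )
    return counts


def arousal_hint(counts):
    """Crude illustrative heuristic: compare "rr" hits against "mm" + "oo" hits."""
    calm = counts["mm"] + counts["oo"]
    if counts["rr"] > calm:
        return "high-arousal"
    if calm > counts["rr"]:
        return "calmer"
    return "undetermined"


counts = motif_counts("mrrroorr")  # hypothetical symbol sequence for one call
hint = arousal_hint(counts)
```

Counts like these could serve as the symbolic half of a fused feature vector. The heuristic is included only to show how the reported motif associations might be read, and it is no substitute for a trained classifier.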