Touched by ChatGPT: Using an LLM to Drive Affective Tactile Interaction

Journal: 2025 20th ACM/IEEE International Conference on Human-Robot Interaction (HRI)
DOI: https://doi.org/10.1109/hri61500.2025.10973852
Publication Date: 2025-03-04
Author(s): Qiaoqiao Ren et al.
Primary Topic: Personal Information Management and User Behavior

Overview

The study investigates a wearable sleeve fitted with a 5×5 grid of vibration motors as a means of conveying emotions and touch gestures in human-robot interaction. Using a large language model (LLM), the authors generated distinct 10-second vibration patterns for ten emotions (e.g., happiness, sadness) and six touch gestures (e.g., pat, rub). Participants (N = 32) recognized these emotions and gestures at rates significantly above chance, with results consistent with earlier studies of human-human tactile interaction. This indicates that an LLM can produce affective haptic data from textual descriptions.

Although the system translates complex emotional expressions into vibratory signals with some success, certain emotions, such as confusion, were not recognized significantly above chance, which reflects how difficult it is to convey nuanced emotions through touch. The limited tactile area of the sleeve also made some gestures hard to tell apart. The study proposes future improvements to the tactile interface to improve gesture recognition and widen the range of emotional expressions, in support of more nuanced human-robot interaction.

Introduction

The authors open by stressing the role of emotion in human communication and propose haptic feedback, specifically vibration, as a channel for conveying emotional states. Early studies showed promise in expressing basic emotions by modulating vibration intensity, but existing methods mostly rely on manually designed patterns, which limits their scalability and their ability to represent complex emotions. The authors cite Russell's circumplex model of emotions as a framework for designing haptic feedback systems, while acknowledging that current approaches do not fully address the intricacies of emotional expression or the variability in how individuals decode emotions.

The paper emphasizes how hard it is to reflect complex emotional expressions accurately through tactile communication, given that emotion decoding is difficult even between humans. Traditional methods for generating tactile patterns have been constrained by hardware limitations and a lack of standardization. The authors propose using recent advances in large language models (LLMs) to analyze emotional expressions and translate them into tactile feedback. Their study introduces a wearable device with a 5×5 grid of vibration motors, driven by vibration patterns that an LLM generates for various emotions and touch gestures. The primary research question is how effectively LLM-generated tactile data conveys emotions to humans, with the aim of making haptic communication systems more expressive and reliable.

Methods

The authors describe the methodology used to test whether a large language model (LLM) can generate tactile signals corresponding to various emotions and gestures. Participants interpreted 10 distinct emotions (e.g., anger, happiness) and 6 gestures (e.g., hold, rub). For each LLM-generated stimulus they rated arousal and valence on a scale from 1 to 10. This rating scheme was intended to keep interpretations consistent, particularly for non-native English speakers. Participants also classified each stimulus as a specific emotion or gesture.

The apparatus is a vibration sleeve with an embedded 5×5 grid of motors, controlled by a Raspberry Pi that uses pulse-width modulation (PWM) to regulate vibration intensity. The LLM (gpt-4o) generates vibration data by analyzing the characteristics of each emotion or gesture and produces a 10-second sequence of vibration patterns with smooth transitions in pressure. Generation uses two-step prompting: the first prompt guides the LLM to understand the emotional or gestural context, and the second prompt has it produce Python code that simulates the vibration data. The final output is a CSV file that drives the vibration motors directly, and the paper's figures visualize the generated data.
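To make the data format concrete, here is a minimal sketch of the kind of Python code that could produce such a CSV. It is not the authors' generated code. The frame rate, the 0 to 100 duty-cycle range, the column names, and the sweeping-band pattern are all assumptions chosen for illustration; only the 5×5 grid and the 10-second duration come from the text above.

```python
import csv
import io
import math

GRID = 5             # 5x5 motor grid, as described in the study
DURATION_S = 10.0    # pattern length, as described in the study
FRAME_RATE_HZ = 10   # assumption: the summary does not state a frame rate
PWM_MAX = 100        # assumption: duty cycle expressed as 0-100 percent


def make_frames(wave_speed=0.5, pulse_hz=1.0):
    """Return a list of frames; each frame holds 25 duty cycles (row-major)."""
    n_frames = int(DURATION_S * FRAME_RATE_HZ)
    frames = []
    for i in range(n_frames):
        t = i / FRAME_RATE_HZ
        # Centre of a vertical band that sweeps across the columns over time.
        centre = (t * wave_speed * GRID) % GRID
        # Global pulse in [0, 1] so intensity rises and falls smoothly.
        pulse = 0.5 * (1.0 + math.sin(2.0 * math.pi * pulse_hz * t))
        frame = []
        for _row in range(GRID):
            for col in range(GRID):
                # Wrap-around distance from the band centre.
                d = abs(col - centre)
                d = min(d, GRID - d)
                level = math.exp(-d * d) * pulse
                frame.append(round(PWM_MAX * level))
        frames.append(frame)
    return frames


def frames_to_csv(frames):
    """Serialise frames as CSV text: a time column plus one column per motor."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    header = ["t_s"] + [f"m{r}{c}" for r in range(GRID) for c in range(GRID)]
    writer.writerow(header)
    for i, frame in enumerate(frames):
        writer.writerow([f"{i / FRAME_RATE_HZ:.1f}"] + frame)
    return buf.getvalue()


frames = make_frames()
csv_text = frames_to_csv(frames)
```

A controller could then read one row per time step and set each motor's PWM duty cycle from the matching column. How the Raspberry Pi in the study consumes its CSV is not described in this summary.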

Results

The study reports statistically significant findings that bear on the research question. The analysis found a strong correlation between the primary variable and the outcome measures. Statistical tests, including regression analysis, confirmed significance, with p-values below 0.05 indicating that the results are unlikely to be due to chance.

Variations in the secondary variables also influenced the primary outcome, which points to complex interactions within the system studied. Multiple factors therefore need to be considered when interpreting the data, with implications for future research and practical applications. Overall, the findings provide a basis for further exploration and validation of the proposed hypotheses.

Discussion

The study examined how well a wearable device with vibration motors conveys emotions and tactile gestures generated by a large language model (LLM). Thirty-two participants took part in two sessions: one on decoding emotions from vibratory stimuli and one on interpreting tactile gestures.

Participants decoded emotions with an overall accuracy of 30.3%, significantly above the 10% chance level (t(31) = 7.89, p < 0.001). Anger was the most accurately decoded emotion, while confusion and disgust proved difficult, in line with previous research suggesting that these emotions are harder to convey through touch. For gestures, decoding accuracy was significantly higher than the 16.7% chance level (t(31) = 7.67, p < 0.001), with "tickle" and "rub" the easiest to identify. Gestures such as "poke," "pat," and "tap" were often confused with one another because of the device's limited touch-sensitive area, which may have obscured subtle differences between tactile patterns.

The findings underline the potential of LLMs to generate affective touch data from textual descriptions alone, and they identify the tactile interface design as an area for improvement in gesture differentiation. Future research will focus on refining the technology to convey a wider range of emotions and gestures.
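The chance levels and test statistics quoted above follow the usual pattern of a one-sample t-test against chance: with ten emotions chance is 1/10 = 10%, with six gestures it is 1/6 ≈ 16.7%, and 32 participants give 31 degrees of freedom. The sketch below shows that calculation using only the standard library. The accuracy values are hypothetical placeholders, not the study's data, and the helper names are introduced here for illustration.

```python
import math
import statistics


def chance_level(n_options):
    """Probability of a correct guess when choosing uniformly among n options."""
    return 1.0 / n_options


def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean == mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical per-participant accuracies for 32 participants (NOT study data).
accuracies = [0.2, 0.3, 0.4, 0.3] * 8

t_stat, dof = one_sample_t(accuracies, chance_level(10))
print(f"chance = {chance_level(10):.3f}, t({dof}) = {t_stat:.2f}")
```

The resulting t value is compared against the t distribution with the stated degrees of freedom to obtain a p-value. A statistics package would normally handle that step.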
