Conversation analysis and conversational technologies: Finding the common ground between academia and industry

Journal: Discourse & Communication, Volume: 18, Issue: 6
DOI: https://doi.org/10.1177/17504813241267118
Publication Date: 2024-09-05
Author(s): Elizabeth Stokoe et al.
Primary Topic: Language, Discourse, Communication Strategies

Overview

In the paper’s conclusion, the authors draw on Garfinkel’s critique of exaggerated claims about ‘machine sentience’ to urge caution about anthropomorphizing machines when analyzing social interaction. They also invoke Sacks’s perspective, emphasizing the importance of understanding the details of conversational dynamics without projecting human-like qualities onto machines. The authors argue that conversation analysis and conversational technologies share a focus on conversation as their object, and they note that many contributions to the Special Issue show how conversational structures can be viewed as co-produced technologies arising from social interaction. By prioritizing the technology of conversation over the ‘conversationality’ of technology, the authors aim to move beyond cognitivist frameworks that may limit our understanding of the interplay between humans and machines and of the nature of conversation itself.

Introduction

The introduction to this Special Issue of *Discourse & Communication* aims to bridge the gap between academia and industry by presenting advanced ethnomethodological and conversation analytic research on conversational technologies. It emphasizes the need for collaboration in light of rapid advances in areas such as large language models (LLMs) and natural language processing (NLP), alongside pressing issues of trust, ethics, and bias. The Special Issue features eleven academic papers complemented by industry commentaries, written by professionals at organizations such as Google, IBM, and Microsoft, that reflect on the current state of their respective fields.

The authors argue that a nuanced understanding of conversation analysis is crucial not only for developing conversational technologies but also for any product or process that claims to leverage conversation. They point out that although conversational products may appear informal and casual, closer analysis reveals that all forms of interaction, including standardized communication, are inherently conversational. The introduction sets the stage for the Special Issue by illustrating core concerns of conversation analysis through an example of human-human interaction, specifically an analogue telephone call, and juxtaposing it with human-computer interaction involving LLMs. This comparison raises critical questions about the extent to which conversational technologies can genuinely “understand” their human users.

Discussion

In this section, the authors analyze two extracts of conversation: one from a traditional telephone call and the other generated by a large language model (LLM) acting as a participant in the conversation. They highlight significant differences in turn-taking, action formation, and sequence organization between the two interactions. In the original call, the participants make effective use of the affordances of spoken language, such as turn-constructional units (TCUs) and transition-relevance places (TRPs), to convey urgency and maintain coherence. In contrast, the LLM-generated interaction lacks these features: it produces multi-TCU turns without TRPs, which hinders the natural flow and responsiveness expected in human conversation.

The authors further discuss how the LLM fails to recognize and respond appropriately to conversational cues. For example, it treats Donny’s pre-announcement “Guess what” as the start of a guessing game rather than as an invitation for a specific response. This misalignment illustrates the LLM’s inability to manage epistemic rights and recipient design, resulting in a lack of intersubjectivity and a failure to adapt to the urgency of the interaction. The authors argue that while LLMs can generate text, they do not replicate the nuanced dynamics of authentic conversation, and that conversational technologies therefore need to draw on insights from conversation analysis (CA) to improve their design and functionality.