Factors influencing trust in algorithmic decision-making: an indirect scenario-based experiment

Journal: Frontiers in Artificial Intelligence, Volume: 7
DOI: https://doi.org/10.3389/frai.2024.1465605
PMID: https://pubmed.ncbi.nlm.nih.gov/39968162
Publication Date: 2025-02-04
Author(s): Fernando Marmolejo‐Ramos et al.
Primary Topic: Ethics and Social Impacts of AI

Overview

The research investigates the relationship between algorithmic trust and statistical literacy across different decision stakes. It finds that individuals with higher statistical literacy tend to trust algorithms more in low-stakes scenarios, where familiarity with algorithms is also a factor. Conversely, in high-stakes situations, increased statistical literacy correlates with decreased trust in algorithms. This suggests that while statistical literacy equips individuals to critically assess algorithmic decisions, it may also lead to skepticism about their reliability in complex, high-stakes contexts.

The study emphasizes the importance of promoting statistical and AI literacy so that individuals can weigh algorithmic decisions alongside other relevant factors, thereby preventing over-reliance on algorithms that may not adequately reflect the intricacies of human behavior. Although the findings indicate no statistically significant differences among countries regarding trust in algorithms, they highlight the need for further research, including direct observations and physiological measures, to better understand the dynamics of trust in algorithmic systems.

Introduction

The paper's introduction highlights the transformative impact of the Fourth Industrial Revolution, particularly through the proliferation of Artificial Intelligence (AI) and Machine Learning (ML). Central to these technologies are algorithms, which institutions and governments increasingly use to manage vast amounts of information and enhance decision-making processes. The COVID-19 pandemic serves as a pivotal example, demonstrating the critical role of algorithms in modeling virus dynamics and supporting various medical applications. This reliance on algorithmic decision-making extends beyond public health to sectors such as surveillance and finance, emphasizing the need to understand the societal implications and reliability of these systems.

The study aims to investigate the factors influencing public trust in algorithms, focusing on the societal relevance of the algorithms, their declared reliability, and the data literacy of individuals. It raises essential questions regarding the nature of trust in algorithms, including whether it varies by context, the impact of understanding algorithmic processes, and the role of cognitive abilities. The introduction sets the stage for a comprehensive exploration of AI/ML, data, and algorithms, alongside a discussion on explainability and trust, culminating in the formulation of the study’s hypotheses.

Methods

In this study, data were collected from 3,260 participants across 20 countries; the analysis used responses from the 1,921 individuals who provided complete data. The sample had a mean age of 26.03 years (SD = 9.88), with a gender distribution of 59.5% women, 38.2% men, and 1.8% identifying as other. Ethical approval was obtained from local ethics committees prior to data collection. Participants were recruited via social media and gave informed consent through an online information page.
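The sample figures above can be checked with a little arithmetic. The sketch below uses only the numbers reported in this summary; the variable names are illustrative. It shows that roughly 59% of the collected responses were retained for analysis, and that the three reported gender shares do not quite sum to 100% (the summary does not say how the remaining share is classified).

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole

collected = 3260   # participants from whom data were collected
analyzed = 1921    # participants who provided complete data
gender_shares = {"women": 59.5, "men": 38.2, "other": 1.8}  # reported percentages

completion_rate = percent(analyzed, collected)
unaccounted = 100.0 - sum(gender_shares.values())

print(f"retained for analysis: {completion_rate:.1f}%")
print(f"gender share not accounted for: {unaccounted:.1f}%")
```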

The online survey comprised four main components: (1) a demographic questionnaire assessing participants’ first language, country of residence, age, gender, education level, and familiarity with algorithms, measured on a visual analog scale (VAS) from 0 (not very familiar) to 5 (very familiar); (2) a VAS-based six-item “propensity to trust” scale adapted from Merritt et al. (2013); (3) a selection of 14 items from the Basic Literacy In Statistics (BLIS) scale, covering various statistical concepts; and (4) 12 scenarios depicting algorithm use, categorized into low-stakes and high-stakes situations, each followed by two questions rated on a VAS from 0 (not at all likely) to 5 (very likely). The study’s phases were executed using Qualtrics, with expert judgments on the scenarios provided in supplementary materials.
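The survey structure described above could be represented roughly as follows. This is a minimal sketch, not the authors' data format: the class and field names are hypothetical, scoring BLIS as a count of correct answers is an assumption, and averaging the two per-scenario ratings is likewise an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence

VAS_MIN, VAS_MAX = 0.0, 5.0   # rating range reported for the VAS items
BLIS_ITEMS = 14               # number of BLIS items selected for the survey


def check_vas(value: float) -> float:
    """Reject ratings outside the 0-5 VAS range."""
    if not VAS_MIN <= value <= VAS_MAX:
        raise ValueError(f"VAS rating {value} is outside [{VAS_MIN}, {VAS_MAX}]")
    return value


@dataclass(frozen=True)
class ScenarioRating:
    """One participant's answers to one scenario (hypothetical layout)."""
    scenario_id: int      # 1..12
    high_stakes: bool     # True for high-stakes, False for low-stakes
    first_rating: float   # first follow-up question, 0-5
    second_rating: float  # second follow-up question, 0-5

    def __post_init__(self) -> None:
        check_vas(self.first_rating)
        check_vas(self.second_rating)

    def mean_rating(self) -> float:
        # Assumption: the two questions are combined by a simple average.
        return (self.first_rating + self.second_rating) / 2


def blis_score(item_correct: Sequence[bool]) -> int:
    """Count correct answers across the 14 selected BLIS items (assumed scoring)."""
    if len(item_correct) != BLIS_ITEMS:
        raise ValueError(f"expected {BLIS_ITEMS} items, got {len(item_correct)}")
    return sum(1 for correct in item_correct if correct)
```

For example, `ScenarioRating(1, True, 2.0, 3.0).mean_rating()` gives 2.5, and a rating of 6.0 raises a `ValueError` because it falls outside the scale.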

Results

The results indicate that the initial model was confirmed through stepwise backward evaluation, with detailed summaries provided in Tables 1 and 2 and an ANOVA-like table in Table 3. Assumptions of the linear model were assessed using the R package gvlma, which revealed some violations, although a QQ plot indicated no significant deviation from normality. A robust linear mixed model was therefore also fitted. Its estimates were consistent with those from the linear mixed model, suggesting that such models are robust to violations of distributional assumptions.
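The idea behind the QQ check can be illustrated without plotting. The sketch below is not the gvlma procedure and is not taken from the paper: it pairs sorted residuals with standard-normal quantiles (the points a QQ plot would draw) and reports their correlation, where values close to 1 suggest little departure from normality. The function names and the plotting positions (i - 0.5)/n are illustrative choices.

```python
from statistics import NormalDist, mean


def qq_points(residuals):
    """Pair each sorted residual with its theoretical standard-normal quantile."""
    n = len(residuals)
    ordered = sorted(residuals)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5) / n keep probabilities strictly inside (0, 1).
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


def qq_correlation(residuals):
    """Pearson correlation between theoretical and observed quantiles."""
    points = qq_points(residuals)
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in points)
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Made-up residuals: one set built from exact normal quantiles, one heavily skewed.
normal_like = [NormalDist(0, 2).inv_cdf((i - 0.5) / 50) for i in range(1, 51)]
skewed = [1, 1, 1, 1, 1, 1, 1, 1, 1, 100]

print(round(qq_correlation(normal_like), 3))
print(round(qq_correlation(skewed), 3))
```

The first set falls on a straight line in a QQ plot, so its correlation is essentially 1. The skewed set bends sharply away from the line and yields a much lower value.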

The intercept of the linear mixed model was 1.46, which corresponds to a 29.32% probability of trusting, recommending, or using algorithms in explainable, high-stakes scenarios for young women with lower BLIS and ADA scores. This probability was higher in low-stakes scenarios (34.2%) and with higher ADA scores (40.3%), and lower with higher BLIS scores (17.2%), older age (29.1%), or male participants (27.1%). Notably, countries such as Japan, the US, and the UK showed significantly lower trust in algorithms. Overall, familiarity with ADA was positively associated with trust in algorithms and age was negatively associated with it, while the association with statistical literacy depended on the stakes: it was negative in the high-stakes reference condition, as the 17.2% figure shows. Male participants were also less likely to trust algorithms than female or non-binary participants. The findings underscore the importance of context, statistical literacy, and demographic factors in shaping perceptions of algorithmic trust.
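The summary does not state how the intercept was converted into a percentage. One plausible reading, offered here only as an assumption, is that the predicted score is expressed as a share of the 5-point VAS maximum. Under that reading, the rounded intercept of 1.46 gives 29.2%, and the reported 29.32% would correspond to an unrounded intercept of about 1.466. The function names below are illustrative.

```python
VAS_MAX = 5.0  # upper end of the 0-5 rating scale described in the Methods


def score_to_percent(score, scale_max=VAS_MAX):
    """Express a predicted VAS score as a percentage of the scale maximum."""
    return 100.0 * score / scale_max


def percent_to_score(percent, scale_max=VAS_MAX):
    """Inverse conversion: recover the VAS score implied by a percentage."""
    return percent / 100.0 * scale_max


# Assumed conversion applied to the rounded intercept from the summary.
print(round(score_to_percent(1.46), 2))

# VAS scores implied by the reported percentages, under the same assumption.
for label, pct in [("reference", 29.32), ("low stakes", 34.2),
                   ("higher ADA", 40.3), ("higher BLIS", 17.2)]:
    print(label, round(percent_to_score(pct), 3))
```

If the authors used a different transformation, the implied scores would differ, so these values should be read as a consistency check on the summary and not as results from the paper.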

Discussion

The study aimed to explore how personal characteristics, such as statistical literacy and demographics, together with algorithmic features such as explainability and the stakes involved, affect trust in algorithms. The findings indicate a nuanced relationship: in high-stakes scenarios, higher statistical literacy was associated with lower trust in algorithms, while in low-stakes scenarios the opposite held. This suggests that statistically literate individuals may feel better equipped to question algorithms when the stakes are higher. Interestingly, explainability alone did not significantly affect trust, indicating that mere transparency about algorithmic processes may not suffice to alleviate concerns.
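The reversal described above is what an interaction term between statistical literacy and stakes produces in a linear model. A minimal sketch follows. All coefficient values are hypothetical, chosen only to reproduce the direction of the reported effects; they are not estimates from the study, and the function names are illustrative.

```python
# Hypothetical coefficients on the 0-5 trust scale (NOT taken from the paper).
INTERCEPT = 1.5       # high-stakes scenario, lowest literacy
B_LITERACY = -0.6     # effect of literacy when stakes are high
B_LOW_STAKES = 0.2    # shift for low-stakes scenarios
B_INTERACTION = 1.0   # extra literacy effect in low-stakes scenarios


def predicted_trust(literacy, low_stakes):
    """Linear predictor with a literacy x stakes interaction.

    literacy is scaled to [0, 1]; low_stakes is True or False.
    """
    s = 1.0 if low_stakes else 0.0
    return (INTERCEPT
            + B_LITERACY * literacy
            + B_LOW_STAKES * s
            + B_INTERACTION * literacy * s)


def literacy_slope(low_stakes):
    """Change in predicted trust per unit of literacy at a given stakes level."""
    return B_LITERACY + (B_INTERACTION if low_stakes else 0.0)


for low_stakes in (False, True):
    label = "low stakes" if low_stakes else "high stakes"
    print(label,
          "slope:", round(literacy_slope(low_stakes), 2),
          "trust at literacy 0 and 1:",
          round(predicted_trust(0.0, low_stakes), 2),
          round(predicted_trust(1.0, low_stakes), 2))
```

With these illustrative values the literacy slope is negative when stakes are high and positive when they are low, which is the crossover pattern the discussion describes.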

These results contribute to the ongoing discourse on trust in artificial intelligence (AI) and machine learning (ML), highlighting the complexity of human-algorithm interactions. The study’s limitations are acknowledged, and it calls for further research to deepen the understanding of the factors influencing trust in AI systems, particularly in contexts where algorithmic decisions have substantial consequences. This research underscores the importance of considering both individual and contextual factors when evaluating trust in algorithmic technologies.

Limitations

The section on limitations highlights the challenges associated with machine learning techniques that necessitate extensive data handling and human involvement, such as data generation, annotation, and algorithmic verification. These processes are often outsourced to business process outsourcing (BPO) companies or facilitated through labor platforms, which can lower production costs. However, research by Miceli and Posada (2022) indicates that the organizational structures within these settings often prioritize managerial control and algorithmic oversight, aiming to enhance productivity while minimizing perceived worker biases.

A significant limitation identified is the suppression of worker feedback, which undermines the integrity of the data production process. By treating client decisions as the definitive “ground truth,” the resulting datasets may inadvertently perpetuate existing biases, as algorithms trained on this data reflect the biases of the clients. The study emphasizes that the quality of the data is intrinsically linked to the active participation and voice of the workers, suggesting that improving working conditions and fostering engagement are crucial for enhancing data quality in machine learning applications.