Figurative Archive: an open dataset and web-based application for the study of metaphor

Journal: Scientific Data, Volume: 13, Issue: 1
DOI: https://doi.org/10.1038/s41597-025-06459-7
PMID: https://pubmed.ncbi.nlm.nih.gov/41501100
Publication Date: 2026-01-07
Author(s): Maddalena Bressler et al.
Primary Topic: Language, Metaphor, and Cognition

Overview

The paper presents the Figurative Archive, an open database of 996 Italian metaphors that addresses the growing need for rigorously constructed experimental materials in metaphor research. The Archive covers both everyday and literary metaphors and is enriched with rating-based and corpus-based measures such as familiarity, semantic distance, and preferred interpretations. The data were drawn from 11 studies and validated through correlations between familiarity and other measures.

Notably, the Figurative Archive offers several new features: it is larger than previous resources, includes a measure of metaphor inclusiveness to promote non-discriminatory language use, and is accessible through a web-based interface that supports customized consultation. The authors also provide guidelines for using the Archive in studies of metaphor processing and of how metaphor properties interact in human cognition and in computational models.
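
The correlation-based validation mentioned in this overview can be illustrated with a small sketch. The summary does not say which correlation coefficient the authors used, so Spearman's rank correlation is an assumption here, and the ratings below are invented for five hypothetical items rather than taken from the Archive.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = shared
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(ranks(x), ranks(y))


# Invented mean ratings for five hypothetical items (not Archive data).
familiarity = [6.1, 2.3, 4.8, 5.5, 1.9]
meaningfulness = [5.8, 3.0, 4.9, 6.2, 2.1]

rho = spearman(familiarity, meaningfulness)
print(round(rho, 2))  # 0.9
```

A strong positive value like this would match the expectation that more familiar metaphors are also judged more meaningful. The actual coefficients have to be taken from the paper itself.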

Introduction

The introduction traces the development and significance of metaphor studies across psycholinguistics, neurolinguistics, and cognitive neuroscience. It notes that interest in metaphor research has grown, particularly since the early 2010s, as experimental methods began to reveal the cognitive processes involved in metaphor comprehension. This work indicates that metaphors are complex items shaped by factors such as familiarity, sensorimotor reenactment, and semantic features, all of which affect how they are processed and the behavioral and neural responses they elicit.

To address practical challenges in metaphor research, such as the time needed to create test materials and the difficulty of ensuring reproducibility, the authors present the Figurative Archive, an open dataset of Italian metaphors. The resource has two modules: the Everyday Metaphors module, with 464 items and various ratings, including a novel inclusiveness dimension, and the Literary Metaphors module, with 532 original metaphors from Italian literature. Together these make up the 996 items of the Archive. The Archive aims to facilitate metaphor research by providing a comprehensive and accessible dataset, promoting reproducibility, and enabling systematic investigation of metaphor properties. It also supports the evaluation of figurative language abilities in Large Language Models (LLMs) and encourages cross-linguistic studies by offering English translations of the Italian metaphors. Future plans include expanding the dataset and fostering international collaboration in metaphor research.

Methods

The “Methods” section of the paper describes the experimental design and analytical techniques. The study took a quantitative approach, using statistical analyses to evaluate data collected across several experiments, including controlled experiments in which variables were systematically manipulated to observe their effects on the outcomes of interest.

Data collection involved both qualitative and quantitative measures. The analysis was carried out with standard statistical software, applying tests such as ANOVA and regression analysis to assess the significance of the results. The section also details the sampling methods, participant demographics, and the ethical considerations observed during the research, with the stated aim of ensuring the reliability and validity of the findings.

Discussion

The “Everyday Metaphors” module of the Figurative Archive consists of 464 unique metaphorical expressions in Italian, derived from nine studies conducted by NEPLab researchers. Each metaphor is assigned an alphanumeric ID and includes a literal English translation for accessibility. The dataset encompasses various metaphor types, such as nominal predicative metaphors (e.g., “That lawyer is a shark”), nominal word pairs (e.g., “language-bridge”), and predicate metaphors (e.g., “Alice shapes her future with Alberto”). Each expression is accompanied by psycholinguistic measures, including familiarity, meaningfulness, and semantic distance between topics and vehicles, with varying availability across items.
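
One way to picture an item of this module is as a record with optional measure fields, since not every measure is available for every item. The sketch below is an illustration only: the class, field names, ID format, and all numeric values are hypothetical and do not reflect the Archive's actual column names or data. The Italian text is left empty because this summary quotes only the English translations.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MetaphorItem:
    item_id: str                       # alphanumeric ID (format assumed)
    english: str                       # literal English translation
    structure: str                     # metaphor type
    italian: Optional[str] = None      # original text, not quoted in this summary
    familiarity: Optional[float] = None
    meaningfulness: Optional[float] = None
    semantic_distance: Optional[float] = None


def with_measures(items, *measures):
    """Keep only the items that have a value for every requested measure."""
    return [item for item in items
            if all(getattr(item, name) is not None for name in measures)]


# Example expressions come from the summary; IDs and values are invented.
items = [
    MetaphorItem("EM001", "That lawyer is a shark", "nominal predicative",
                 familiarity=5.2, meaningfulness=6.0),
    MetaphorItem("EM002", "language-bridge", "nominal word pair",
                 semantic_distance=0.71),
    MetaphorItem("EM003", "Alice shapes her future with Alberto", "predicate",
                 familiarity=3.4),
]

usable = with_measures(items, "familiarity", "meaningfulness")
print([item.item_id for item in usable])  # ['EM001']
```

Filtering on availability before an analysis avoids silently mixing items that were rated on different sets of dimensions.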

Ratings were collected from 630 native Italian speakers, in line with ethical guidelines and GDPR requirements. The metaphors were rated on several dimensions, and nominal structures predominate in the dataset (69.18%). The measures, such as familiarity and body-relatedness, show distinct distributions, which indicates that the items vary considerably in their properties. To improve consistency across the source studies, rating scales were standardized and corpus-based measures were recalculated using contemporary resources. Inclusiveness ratings were also collected to evaluate the metaphors’ respectfulness and sensitivity to diversity, which broadens the dataset’s usefulness for psycholinguistic research.
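
The summary does not state how the rating scales were standardized. As a minimal sketch, assuming ratings from studies with different scale ranges were mapped onto one common range, a linear rescaling could look like this. The function name and the 1-5 and 1-7 ranges are hypothetical.

```python
def rescale(rating, old_min, old_max, new_min, new_max):
    """Linearly map a rating from [old_min, old_max] onto [new_min, new_max]."""
    if old_max <= old_min:
        raise ValueError("old_max must be greater than old_min")
    if not old_min <= rating <= old_max:
        raise ValueError("rating lies outside the source scale")
    fraction = (rating - old_min) / (old_max - old_min)
    return new_min + fraction * (new_max - new_min)


# A midpoint rating on a hypothetical 1-5 scale mapped onto a 1-7 scale.
print(rescale(3, 1, 5, 1, 7))  # 4.0
```

A linear mapping preserves the endpoints and the relative spacing of ratings, but it does not correct for differences in how participants use scales of different lengths.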

Limitations

The limitations section identifies several directions for future work on the Figurative Archive. First, it calls for a wider variety of metaphor types, including nominal predicative, genitive, and predicate metaphors, because the current items may not capture the diversity found in contexts such as advertising and persuasive language. Future studies could examine how these metaphorical structures function across different settings.

Second, the paper acknowledges that the existing ratings come mainly from samples of young, educated adults. Demographic factors such as age and educational background may influence metaphor processing and should be considered in future studies. The Archive's modular design allows new datasets that address these gaps to be integrated. Finally, the authors note the difficulty of translating Italian metaphors into English: metaphors are deeply rooted in cultural and linguistic contexts and often resist direct translation. This underscores both the Archive’s role as a foundation for cross-linguistic metaphor research and the complexity of such work.