Redundancy-as-masking: formalizing the Artificial Age Score (AAS) to model memory aging in generative AI

Journal: Frontiers in Artificial Intelligence, Volume: 9
DOI: https://doi.org/10.3389/frai.2026.1732691
PMID: https://pubmed.ncbi.nlm.nih.gov/41930214
Publication Date: 2026-03-18
Author(s): Seyma Yaman Kayadibi
Primary Topic: Memory Processes and Influences

Overview

The paper introduces the Artificial Age Score (AAS), a novel metric designed to quantify memory aging in artificial intelligence systems, particularly large language models. The AAS is derived from observable recall behaviors and is defined at the output level, independent of internal representations. The study evaluates the AAS through a 25-day bilingual recall protocol using ChatGPT-5.0. Under persistent interaction conditions, the model maintains both semantic and episodic recall, yielding a low AAS indicative of “youth.” Conversely, when sessions are reset, the model retains semantic consistency but loses episodic continuity, leading to a significant increase in the AAS that reflects aging-like behavior.

The findings position the AAS as a theoretically grounded tool for assessing memory degradation in artificial systems, with implications for memory-aware design and governance. Future research will expand the AAS framework to include higher cognitive loads and diverse architectures, while also developing strategies to mitigate memory rigidity and enhance episodic continuity. The study emphasizes the importance of operationalizing AAS in practical applications, linking it to retention policies and ensuring accountability in AI systems to promote long-term reliability and adaptability.

Introduction

The introduction discusses the concept of “decline” in large-scale computational systems, emphasizing that it is not merely a function of time but is also shaped by memory organization, repetitive operations, and distortions in information flow. The author proposes a systems-level perspective that evaluates competence through observable performance rather than internal mechanisms, drawing on foundational theories from Turing and Shannon. The paper introduces the Artificial Age Score (AAS), a behavioral metric that quantifies memory age based solely on observable recall scores and redundancy values, independent of internal states. The AAS employs a logarithmic penalty kernel that grows as recall deteriorates, and the score is shown to be well-defined, globally bounded, and monotonic.
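To make the shape of such a score concrete, here is a minimal Python sketch. It is not the paper's formula, which this summary does not reproduce. The normalized kernel `penalty`, the smoothing constant `EPS`, the `(1 - redundancy)` masking factor, and the uniform default weights are all illustrative assumptions, chosen only so that the result is well-defined, bounded in [0, 1], and non-increasing in recall.

```python
import math

EPS = 1e-6  # assumed smoothing constant; keeps the kernel finite at x = 0


def penalty(x: float, eps: float = EPS) -> float:
    """Assumed log kernel: 0 at perfect recall (x = 1), 1 at no recall (x = 0)."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("recall score must lie in [0, 1]")
    return math.log((1.0 + eps) / (x + eps)) / math.log((1.0 + eps) / eps)


def aas(recall, redundancy=None, weights=None, eps: float = EPS) -> float:
    """Hypothetical AAS: weighted sum of redundancy-masked log penalties.

    recall     -- per-channel recall scores in [0, 1]
    redundancy -- per-channel redundancy in [0, 1]; defaults to 0 (no masking)
    weights    -- per-channel weights summing to 1; defaults to uniform
    """
    n = len(recall)
    if n == 0:
        raise ValueError("at least one channel is required")
    redundancy = [0.0] * n if redundancy is None else list(redundancy)
    weights = [1.0 / n] * n if weights is None else list(weights)
    if len(redundancy) != n or len(weights) != n:
        raise ValueError("recall, redundancy and weights must have equal length")
    if any(not 0.0 <= r <= 1.0 for r in redundancy):
        raise ValueError("redundancy must lie in [0, 1]")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # Assumption: redundancy masks (scales down) the penalty of its channel.
    return sum(w * (1.0 - r) * penalty(x, eps)
               for w, r, x in zip(weights, redundancy, recall))


# Hypothetical scores for two channels (semantic, episodic), not measured data.
persistent = aas([1.0, 1.0])  # both channels recalled
stateless = aas([1.0, 0.0])   # semantic kept, episodic lost
```

Under these assumptions, leaving `redundancy` at zero gives the largest score for a given set of recall values, which matches the summary's description of the reported AAS values as conservative upper bounds.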

The paper further explores the implications of the AAS for generative AI tools in education, highlighting its potential to monitor continuity and detect failures in episodic tracking. The study contrasts the memory systems of humans and AI, noting the differences in episodic recall mechanisms. Empirical evaluations using ChatGPT-5.0 demonstrate that continuous interaction can sustain behavioral youth in recall, while interruptions lead to aging-like signals in episodic tracking. The author outlines three contributions: a mathematically grounded metric for memory age, an empirical protocol for measuring aging trajectories, and a conceptual framework, Redundancy-as-Masking, which describes how information overlap affects memory performance. This research establishes a quantitative foundation for analyzing artificial memory behaviors and informs the design of persistent memory architectures.

Methods

The methodology section outlines a structured experimental protocol conducted with ChatGPT-5.0 over 25 days, from August 10 to September 3, 2024. The study was divided into two phases: Phase 1 (August 10-19) used stateless conditions, while Phase 2 (August 25-September 3) used persistent conditions. The phases were separated by a five-day intermission (August 20-24) to mitigate carryover effects. Each day featured two sessions at approximately 2:00 pm and 10:00 pm, with reminders to ensure adherence. English served as the primary language, complemented by Turkish to assess the model’s adaptability across different linguistic systems. Alternating the session language was intended to minimize bias and to prevent monolingual repetition from artificially stabilizing the Artificial Age Score (AAS).
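The timeline above can be laid out programmatically. The sketch below generates one record per session from the dates, phases, and session times given in the text. The function name `build_schedule`, the record fields, and in particular the rule that the language alternates on every session are assumptions made for illustration; the summary does not specify how the languages were alternated.

```python
from datetime import date, timedelta


def build_schedule(year: int = 2024):
    """Return one dict per session for the two-phase, 25-day protocol."""
    start = date(year, 8, 10)
    # (phase name, day offset from start, length in days, reset after session?)
    phases = [
        ("phase1_stateless", 0, 10, True),     # August 10-19
        ("phase2_persistent", 15, 10, False),  # August 25 - September 3
    ]  # offsets 10-14 (August 20-24) are the intermission: no sessions
    schedule = []
    session_index = 0
    for name, offset, length, reset in phases:
        for d in range(length):
            day = start + timedelta(days=offset + d)
            for slot in ("14:00", "22:00"):  # approximate session times
                schedule.append({
                    "date": day,
                    "time": slot,
                    "phase": name,
                    "reset_after_session": reset,
                    # Assumed alternation rule: switch language every session.
                    "language": "English" if session_index % 2 == 0 else "Turkish",
                })
                session_index += 1
    return schedule


sessions = build_schedule()
```

Each phase contributes 20 sessions (10 days with two sessions each), so the full protocol comprises 40 sessions.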

The AAS, derived from observable recall scores, was analyzed to evaluate the impact of memory continuity across the two phases. In Phase 1, conversations were reset after each session, while Phase 2 maintained a continuous conversation page, allowing for direct comparison of stateless versus persistent interactions. The analysis confirmed that the conditions of well-definedness, boundedness, and monotonicity were upheld in the observed response patterns, revealing aging-like differences in recall behavior between the two conditions. The reported AAS values are conservative upper bounds, and any qualitative observations regarding redundancy are descriptive and do not influence the AAS computations. The findings are specific to the ChatGPT-5.0 model under this bilingual protocol and should not be generalized without further validation.

Discussion

The discussion section elaborates on the theoretical underpinnings and implications of the Artificial Age Score (AAS), which is grounded in Shannon’s information theory. The AAS quantifies memory aging using the concepts of entropy and redundancy, where entropy measures uncertainty and redundancy reflects predictability in recall outcomes. The formulation of the AAS incorporates a log-scaled penalty kernel that accounts for both recall accuracy and redundancy. The score is well-defined and mathematically bounded, and it exhibits decomposability and monotonicity, which supports its use as a measure of observable recall behavior.
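The entropy and redundancy notions mentioned above can be illustrated with their standard Shannon definitions: entropy H = -Σ p log2 p over the empirical outcome frequencies, and redundancy R = 1 - H / H_max with H_max = log2 of the alphabet size. The sketch below applies them to a sequence of recalled labels. Whether the paper estimates redundancy in exactly this way is an assumption; the function names and the sample labels are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(outcomes) -> float:
    """Empirical Shannon entropy (in bits) of a sequence of discrete outcomes."""
    total = len(outcomes)
    if total == 0:
        raise ValueError("at least one outcome is required")
    counts = Counter(outcomes)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def redundancy(outcomes, alphabet_size: int) -> float:
    """Shannon redundancy R = 1 - H / H_max, with H_max = log2(alphabet_size)."""
    if alphabet_size < 2:
        raise ValueError("alphabet_size must be at least 2")
    return 1.0 - shannon_entropy(outcomes) / math.log2(alphabet_size)


# Hypothetical recall outputs over a seven-symbol alphabet (weekday names).
repetitive = ["Mon"] * 7                                       # fully predictable
varied = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]     # maximally varied
r_repetitive = redundancy(repetitive, alphabet_size=7)
r_varied = redundancy(varied, alphabet_size=7)
```

A model that returns the same answer every session produces zero entropy and therefore maximal redundancy, while outputs spread evenly over the alphabet give redundancy close to zero.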

Additionally, the AAS framework draws on principles from automata theory and reliability engineering, emphasizing the importance of structured redundancy in achieving reliable behavior from unreliable components. This perspective aligns with the notion that both artificial and biological systems must balance redundancy and variability to maintain functional adaptability. The paper also highlights the operational nature of the AAS, which infers aging-like patterns from observable recall behaviors without delving into internal mechanisms. This externalist approach facilitates empirical assessments of memory performance, particularly in contexts where internal states are not directly accessible, thus reinforcing the relevance of the AAS in evaluating generative AI tools in educational settings.

Limitations

Several limitations may affect the generalizability and applicability of the findings. First, the research focused on two specific channels (weekday and counter) and therefore does not fully capture the complexity of real-world memory demands. Second, the study lasted only 25 days, raising questions about the long-term stability of the observed effects. Third, the investigation was restricted to English and Turkish, and only one model, ChatGPT-5.0, was evaluated under a defined bilingual protocol, which constrains the ability to generalize results across different architectures and applications.

Furthermore, the Artificial Age Score (AAS), as a measure of aging-like patterns in recall behavior, is an output-level construct. While it provides insights into behavioral trends, it does not elucidate the underlying algorithmic or neural mechanisms, leaving structural interpretations beyond the scope of this research. As emphasized by Floridi et al. (2018), digital systems should be evaluated for resilience over time and across varying contexts, which suggests that further validation is necessary for broader applicability.