PropertyGPT: التحقق الرسمي المدفوع بواسطة LLM للعقود الذكية من خلال توليد الخصائص المعزز بالاسترجاع PropertyGPT: LLM-driven Formal Verification of Smart Contracts through Retrieval-Augmented Property Generation

المجلة: Proceedings 2025 Network and Distributed System Security Symposium
DOI: https://doi.org/10.14722/ndss.2025.241357
تاريخ النشر: 2025-01-01
المؤلف: Ye Liu وآخرون
الموضوع الرئيسي: تطبيقات تكنولوجيا البلوكشين والأمان

نظرة عامة

في هذه الورقة، قدم المؤلفون طريقة جديدة لتوليد الخصائص المعززة بالاسترجاع في العقود الذكية، مستفيدين من قدرات التعلم في السياق لنماذج اللغة الكبيرة (LLMs). تم تنفيذ هذه الطريقة في أداة تسمى PropertyGPT، التي تعالج التحديات الرئيسية لضمان أن الخصائص المولدة قابلة للتجميع، مناسبة، وقابلة للتحقق في وقت التشغيل.

أظهرت تقييمات PropertyGPT فعاليتها في تحديد العديد من الثغرات في العقود الذكية في العالم الحقيقي، لا سيما في المشاريع البارزة، مما أسفر عن إجمالي مكافآت بقيمة 8,256 دولار من بائعين مختلفين. مع النظر إلى المستقبل، يهدف المؤلفون إلى تحسين PropertyGPT من خلال دمج معلومات سياقية أكثر شمولاً من وثائق العقود وإثراء معرفتها بالخصائص من خلال مصادر متنوعة.

مقدمة

في مقدمة هذه الورقة البحثية، يناقش المؤلفون أهمية العقود الذكية، التي هي برامج مؤتمتة تُنفذ على منصات البلوكشين، لا سيما في التطبيقات اللامركزية مثل DeFi وNFTs. على الرغم من اعتمادها الواسع، فإن العقود الذكية معرضة لمجموعة متنوعة من الهجمات بسبب الأخطاء البرمجية والأخطاء المنطقية. يتم تسليط الضوء على التحقق الرسمي كطريقة حاسمة لتحديد هذه الثغرات، مما يتطلب توليد مواصفات رسمية مخصصة، والتي غالبًا ما تشمل خصائص المنطق الزمني ومنطق هوار. ومع ذلك، فإن المشهد الحالي يفتقر إلى طرق آلية لتوليد الخصائص الشاملة اللازمة للتحقق الرسمي الفعال.

تقترح الورقة الاستفادة من التقدمات الأخيرة في نماذج اللغة الكبيرة (LLMs) لأتمتة توليد خصائص العقود الذكية. من خلال تضمين الخصائص المكتوبة من قبل البشر في قاعدة بيانات متجهة، يهدف المؤلفون إلى تسهيل التعلم في السياق لتوليد خصائص جديدة مصممة لرمز العقد غير المعروف. يقدمون نظامًا يسمى PropertyGPT، الذي يستخدم تصاميم مبتكرة لضمان أن الخصائص المولدة قابلة للتجميع، مناسبة، وقابلة للتحقق. تشير التقييمات الأولية إلى أن PropertyGPT يمكنه بنجاح توليد خصائص تتماشى مع المعايير المكتوبة من قبل البشر، محققًا تغطية بنسبة 80% من الخصائص المعادلة ويظهر فعالية في اكتشاف الثغرات والأخطاء في العالم الحقيقي، مما يبرز إمكانيته في تعزيز أمان العقود الذكية.

طرق

في هذا القسم، يصف المؤلفون إعداد التجربة والمنهجية المستخدمة لتقييم PropertyGPT، وهو نموذج لغة كبير (LLM) يهدف إلى توليد خصائص للعقود الذكية واكتشاف الثغرات. النموذج المستخدم هو GPT-4-turbo، تم تكوينه مع معلمات محددة مثل درجة حرارة 0.8 وطول استجابة أقصى يبلغ 2000. كما استخدم المؤلفون نموذج text-embedding-ada-002 لحساب تشابهات التضمين. تم إجراء التجارب في بيئة Docker مع Ubuntu 20.04، باستخدام معالج Intel Core Xeon وذاكرة RAM سعة 2 جيجابايت.

للإجابة على أسئلة البحث (RQs)، قسم المؤلفون خصائص Certora إلى مجموعات بيانات تدريب واختبار. اختاروا تسعة مشاريع Certora، تتكون من 90 خاصية حقيقية، لتقييم دقة توليد الخصائص في PropertyGPT (RQ1). لاكتشاف الثغرات (RQ2)، تم مقارنة PropertyGPT مع Smart-Inv، أداة متزامنة، باستخدام معايير مختارة تضمنت 24 حادثة هجوم و13 ثغرة شائعة (CVEs). حدد المؤلفون مشكلات داخل معيار Smart-Inv، مثل الحالات المتكررة والرمز غير المكتمل، وضمان اختيار متنوع من CVEs لتقييم شامل. تهدف المنهجية إلى تقييم فعالية PropertyGPT في توليد الخصائص، واكتشاف الثغرات، وقابلية التعميم، والعوامل المؤثرة، وقدرته على تحديد الثغرات من نوع zero-day في المشاريع الواقعية.

نقاش

في هذا القسم، يقدم المؤلفون المساهمات والوظائف الخاصة بـ PropertyGPT، وهي أداة جديدة مصممة لتوليد الخصائص والتحقق الرسمي من العقود الذكية. تشمل المساهمات الرئيسية تطوير لغة مواصفات الخصائص (PSL) المصممة خصيصًا للعقود الذكية، ومثبت مخصص للتحقق من الخصائص، وآلية توليد الخصائص المعززة بالاسترجاع التي تستفيد من نماذج اللغة الكبيرة (LLMs) مثل GPT-4. أجرى المؤلفون تجارب واسعة للتحقق من فعالية PropertyGPT عبر سيناريوهات واقعية متنوعة، مع نتائج تشير إلى إمكانيته في تعزيز عمليات التحقق الرسمي.

كما يسلط النقاش الضوء على الاستخدام المبتكر للتعلم في السياق (ICL) لتسهيل توليد الخصائص، مما يسمح لـ PropertyGPT باستخدام الخصائص المكتوبة من قبل البشر كنقاط مرجعية. تستخدم الأداة خط أنابيب منظم يتضمن إنشاء قاعدة بيانات متجهة للخصائص المرجعية، واسترجاع خصائص مشابهة بناءً على رمز الإدخال، وتوليد خصائص مخصصة من خلال تحسين تكراري. بالإضافة إلى ذلك، تم تقديم خوارزمية تصنيف موزونة لاختيار الخصائص الأكثر صلة للتحقق، مما يضمن أن الخصائص المولدة ليست فقط قابلة للتجميع ولكن أيضًا ذات معنى في سياق العقود الذكية التي يتم تحليلها. يؤكد المؤلفون على توفر مجموعة البيانات والنموذج الأولي الخاص بهم، الذي يتم تسويقه بالتعاون مع مختبرات MetaTrust، مما يساهم في المجال الأوسع لأمان العقود الذكية والتحقق منها.

Journal: Proceedings 2025 Network and Distributed System Security Symposium
DOI: https://doi.org/10.14722/ndss.2025.241357
Publication Date: 2025-01-01
Author(s): Ye Liu et al.
Primary Topic: Blockchain Technology Applications and Security

Overview

In this paper, the authors introduced a novel method for retrieval-augmented property generation in smart contracts, leveraging the in-context learning capabilities of large language models (LLMs). This method was implemented in a tool named PropertyGPT, which addresses key challenges to ensure that the generated properties are compilable, appropriate, and verifiable at runtime.

The evaluation of PropertyGPT demonstrated its effectiveness in identifying numerous real-world vulnerabilities in smart contracts, particularly in high-profile projects, resulting in a total of $8,256 in bounty rewards from various vendors. Looking ahead, the authors aim to enhance PropertyGPT by incorporating more comprehensive contextual information from contract documentation and enriching its property knowledge through diverse sources.

Introduction

In the introduction of this research paper, the authors discuss the significance of smart contracts, which are automated programs executed on blockchain platforms, particularly in decentralized applications like DeFi and NFTs. Despite their widespread adoption, smart contracts are vulnerable to various attacks due to programming errors and logical bugs. Formal verification is highlighted as a crucial method for identifying these vulnerabilities, requiring the generation of customized formal specifications, which often include temporal logic and Hoare logic properties. However, the current landscape lacks automated methods for generating comprehensive properties necessary for effective formal verification.

The paper proposes leveraging recent advancements in large language models (LLMs) to automate the generation of smart contract properties. By embedding existing human-written properties into a vector database, the authors aim to facilitate in-context learning for generating new properties tailored to unknown contract code. They introduce a system called PropertyGPT, which employs innovative designs to ensure the generated properties are compilable, appropriate, and verifiable. Initial evaluations indicate that PropertyGPT can successfully generate properties that align with human-written standards, achieving an 80% coverage of equivalent properties and demonstrating effectiveness in detecting real-world vulnerabilities and bugs, thus showcasing its potential for enhancing the security of smart contracts.

Methods

In this section, the authors describe the experimental setup and methodology used to evaluate PropertyGPT, a large language model (LLM) aimed at generating properties for smart contracts and detecting vulnerabilities. The model utilized is GPT-4-turbo, configured with specific parameters such as a temperature of 0.8 and a maximum response length of 2000. The authors also employed the text-embedding-ada-002 model for calculating embedding similarities. The experiments were conducted on a Docker environment with Ubuntu 20.04, utilizing an Intel Core Xeon processor and 2GB of RAM.

To address the research questions (RQs), the authors divided Certora properties into training and testing datasets. They selected nine Certora projects, comprising 90 ground-truth properties, to evaluate PropertyGPT’s property generation accuracy (RQ1). For vulnerability detection (RQ2), PropertyGPT was compared against Smart-Inv, a concurrent tool, using curated benchmarks that included 24 attack incidents and 13 Common Vulnerabilities and Exposures (CVEs). The authors identified issues within the Smart-Inv benchmark, such as repeated cases and incomplete code, and ensured a diverse selection of CVEs for a comprehensive evaluation. The methodology aims to assess PropertyGPT’s effectiveness in generating properties, detecting vulnerabilities, generalizability, influencing factors, and its ability to identify zero-day vulnerabilities in real-world projects.

Discussion

In this section, the authors present the contributions and functionalities of PropertyGPT, a novel tool designed for property generation and formal verification of smart contracts. Key contributions include the development of a property specification language (PSL) tailored for smart contracts, a dedicated prover for property verification, and a retrieval-augmented property generation mechanism that leverages large language models (LLMs) like GPT-4. The authors conducted extensive experiments to validate PropertyGPT’s effectiveness across various real-world scenarios, with results indicating its potential to enhance formal verification processes.

The discussion also highlights the innovative use of in-context learning (ICL) to facilitate property generation, allowing PropertyGPT to utilize existing human-written properties as reference points. The tool employs a structured pipeline that includes creating a vector database for reference properties, retrieving similar properties based on input code, and generating customized properties through iterative refinement. Additionally, a weighted ranking algorithm is introduced to select the most relevant properties for verification, ensuring that the generated properties are not only compilable but also meaningful in the context of the smart contracts being analyzed. The authors emphasize the availability of their dataset and prototype, which is being commercialized in collaboration with MetaTrust Labs, thus contributing to the broader field of smart contract security and verification.