Stock Price Prediction with LLM-Guided Market Movement Signals and Transformer Model

Journal: FinTech and Sustainable Innovation
DOI: https://doi.org/10.47852/bonviewfsi52025703
Publication Date: 2025-01-01
Author(s): Qizhao Chen
Primary Topic: Stock Market Forecasting Methods

Overview

The paper presents a novel approach to stock price prediction that combines large language models (LLMs) with Transformer networks. Traditional prediction methods often struggle to capture the complexities of financial news and market trends. To address this, the author proposes a framework that uses a structured prompt to extract insights from financial news and stock features, generating predicted trends as one-hot vectors with associated probabilities. This LLM-generated output, combined with historical closing prices, serves as input to a Transformer model that forecasts the next day’s stock price.
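
To make the input representation concrete, the following is a minimal sketch of how a closing price, a one-hot trend vector, and the LLM's probability could be concatenated into one feature row per day. The label set `TREND_LABELS`, the function names, and the column order are assumptions made for illustration; the summary does not specify them.

```python
from typing import List, Sequence

# Hypothetical label set; the summary does not list the trend classes.
TREND_LABELS = ("down", "flat", "up")


def one_hot(label: str, labels: Sequence[str] = TREND_LABELS) -> List[float]:
    """Return a one-hot vector for `label` over `labels`."""
    if label not in labels:
        raise ValueError(f"unknown trend label: {label!r}")
    return [1.0 if name == label else 0.0 for name in labels]


def build_feature_row(close: float, trend: str, probability: float) -> List[float]:
    """Concatenate closing price, one-hot trend, and the LLM's probability."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    return [close] + one_hot(trend) + [probability]


def build_sequence(
    closes: Sequence[float],
    trends: Sequence[str],
    probabilities: Sequence[float],
) -> List[List[float]]:
    """Build one feature row per day; all inputs must be aligned by day."""
    if not (len(closes) == len(trends) == len(probabilities)):
        raise ValueError("inputs must have the same length")
    return [
        build_feature_row(c, t, p)
        for c, t, p in zip(closes, trends, probabilities)
    ]
```

A sequence built this way has one row per day and five columns under the assumed three-class label set, which is the shape a sequence model would consume.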

The framework is evaluated against various baseline models, including Long Short-Term Memory (LSTM), Temporal Convolutional Network (TCN), Convolutional Neural Network (CNN), and others. The results indicate that the proposed method consistently outperforms these baselines, achieving the lowest mean squared error (MSE) across multiple stocks. The ablation study further confirms that incorporating LLM-generated features enhances prediction performance. The conclusion emphasizes the potential of LLMs for improved feature extraction and predictive accuracy, and suggests avenues for future research, such as fine-tuning LLMs for financial applications and exploring the impact of different input configurations on model performance.

Introduction

The introduction of this research paper addresses the complexities of stock price prediction, highlighting its significance in finance and the challenges posed by various influencing factors, including market sentiment and macroeconomic conditions. Traditional predictive models like Long Short-Term Memory (LSTM), support vector machines (SVM), and Random Forest (RF) have limitations in capturing the nuanced relationships between stock prices and market dynamics, particularly during sudden shifts in sentiment. The paper emphasizes the potential of integrating unstructured textual data, such as financial news and social media posts, into predictive models, leveraging advancements in natural language processing (NLP) and large language models (LLMs) to enhance prediction accuracy.

The author proposes a novel framework that combines LLM-generated features with a Transformer model for stock price forecasting. Unlike recurrent neural networks (RNNs), which process data sequentially, the Transformer model employs self-attention mechanisms to analyze all input features simultaneously, enabling it to capture complex relationships in stock movements while incorporating external sentiment. The study aims to determine whether LLM-generated features can enhance the predictive accuracy of Transformer-based models. Key contributions include a new approach that integrates sentiment-driven insights with historical data, and an ablation study demonstrating the superiority of the proposed model over traditional baselines in terms of mean squared error (MSE). The paper then outlines its structure, covering related work, methodology, experimental results, and conclusions in subsequent sections.
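
The contrast with sequential processing can be illustrated with a small sketch of scaled dot-product self-attention, in which every time step is compared with every other time step in one pass. This is a simplified illustration, not the paper's model: it assumes queries, keys, and values are all equal to the input (no learned projection matrices, a single head), and the function names are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x: Matrix) -> Matrix:
    """Scaled dot-product self-attention with Q = K = V = x.

    Assumption for illustration: no learned weights and one head.
    Each output row is a weighted average of all input rows.
    """
    if not x:
        return []
    d = len(x[0])
    scale = math.sqrt(d)
    output: Matrix = []
    for query in x:
        # Similarity of this time step to every time step, including itself.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in x]
        weights = softmax(scores)
        output.append(
            [sum(w * row[j] for w, row in zip(weights, x)) for j in range(d)]
        )
    return output
```

Because the attention weights for each time step are computed against the whole sequence, no step has to wait for a hidden state from the previous one, which is the property the paragraph above attributes to the Transformer.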

Methods

The methodology introduces a framework, depicted in Figure 1 of the paper, that integrates financial news and stock features to analyze market trends. It applies a sliding window of five days to collect the data points that capture temporal dynamics in stock performance. Each prompt is constructed from the data in one window, enabling an examination of the interplay between news sentiment and stock movements.
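
The windowing step can be sketched as follows. Only the five-day window size comes from the summary; the record fields (`date`, `close`, `headline`), the function names, and the prompt wording are hypothetical, since the paper's actual prompt template is not reproduced here.

```python
from typing import Dict, Iterator, List

WINDOW_SIZE = 5  # five-day window, as stated in the summary


def sliding_windows(days: List[Dict], size: int = WINDOW_SIZE) -> Iterator[List[Dict]]:
    """Yield every run of `size` consecutive days, oldest window first."""
    for start in range(len(days) - size + 1):
        yield days[start:start + size]


def build_prompt(window: List[Dict]) -> str:
    """Render one window as a prompt (hypothetical template, not the paper's)."""
    lines = ["Recent trading days, oldest first:"]
    for day in window:
        lines.append(
            f"{day['date']}: close={day['close']:.2f}; news={day['headline']}"
        )
    lines.append("Predict the next day's trend and give a probability.")
    return "\n".join(lines)
```

With seven days of records this yields three overlapping windows, and a series shorter than five days yields none.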

The methodology emphasizes the importance of combining qualitative and quantitative data to enhance predictive accuracy. By leveraging a structured framework that incorporates both financial news and stock features, the study aims to uncover patterns that may inform investment strategies and decision-making processes in the financial markets. This dual approach is expected to yield insights into the influence of news on stock behavior over the specified time frame.

Results

The results of the experiments, detailed in Table 1, evaluate various models for stock price prediction using Mean Squared Error (MSE) as the primary metric. The study compares traditional machine learning algorithms, such as Random Forest (RF) and Support Vector Regression (SVR), with deep learning approaches including Long Short-Term Memory (LSTM), Temporal Convolutional Networks (TCN), Convolutional Neural Networks (CNN), and hybrid models like CNN-LSTM, as well as Transformer-based architectures. The findings indicate that the proposed Transformer model, enhanced with features generated by large language models (LLMs), consistently achieves the lowest MSE across all target stocks, demonstrating a robust capacity to capture market dynamics and sentiment-driven signals.
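
For reference, MSE is the average of the squared differences between actual and predicted values, MSE = (1/n) * sum((y_i - yhat_i)^2). A minimal sketch of the computation follows; the function name is illustrative, and the numbers in the usage note are toy values, not results from the paper.

```python
from typing import Sequence


def mean_squared_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average squared difference between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
```

For example, `mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])` averages the squared errors 0, 0, and 4. Because errors are squared, a few large misses weigh more heavily than many small ones, and a lower value means predictions sit closer to the actual prices.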

Moreover, the proposed model significantly outperforms traditional deep learning models and the hybrid CNN-LSTM model, suggesting that the integration of LLM-generated features effectively enhances the model’s ability to identify intricate patterns within financial data. While the vanilla Transformer already surpasses traditional machine learning models, the addition of LLM features further improves its predictive capabilities by leveraging sentiment information extracted from textual data. The results also reveal that traditional models like RF and SVR can outperform certain deep learning models, indicating that simpler models can sometimes yield satisfactory performance. Overall, the consistent efficacy of the proposed Transformer with LLM features underscores the value of incorporating alternative data sources, such as financial news and market sentiment, into stock price prediction models. This paves the way for future research on models that merge LLMs with deep learning architectures for improved financial forecasting.

Discussion

The discussion highlights the evolution of stock price prediction methodologies, contrasting traditional machine learning models like Support Vector Machines (SVM), Random Forests (RF), and gradient boosting with advanced deep learning techniques, particularly Long Short-Term Memory (LSTM) and Transformer architectures. While traditional models excel in identifying complex patterns, they often struggle with long-term dependencies in time series data. In contrast, LSTMs are adept at sequential data processing but face challenges related to computational efficiency. Transformers, leveraging self-attention mechanisms, have shown superior performance in capturing both local and global dependencies, yet most existing models focus solely on historical price data and technical indicators, neglecting sentiment-driven signals.

Recent research has begun integrating sentiment features generated by Large Language Models (LLMs) into stock prediction frameworks. While fine-tuned models like BERT and GPT have been employed to classify sentiment from financial news, they often require extensive domain-specific training, which can be resource-intensive. The proposed approach utilizes a prompt-based LLM to generate structured outputs that incorporate both historical stock data and sentiment analysis, thereby enhancing prediction accuracy. The integration of LLM-generated predictions and their associated probabilities into a Transformer model has demonstrated significant improvements in forecasting performance, as evidenced by lower mean squared error (MSE) values compared to baseline models. The findings underscore the importance of combining categorical sentiment predictions with probabilistic estimates to capture market uncertainty, suggesting that this hybrid approach could lead to more robust stock price forecasting methodologies. Future research directions include fine-tuning LLMs for financial applications and exploring varying input window sizes to optimize predictive accuracy.
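
A prompt-based pipeline of this kind needs the LLM's reply in a form that downstream code can check before use. The sketch below assumes, purely for illustration, that the reply is a JSON object with `trend` and `probability` fields and that the allowed labels are `down`, `flat`, and `up`; none of these details are given in the summary.

```python
import json
from typing import Tuple

# Hypothetical label set and reply schema, chosen for illustration only.
ALLOWED_TRENDS = ("down", "flat", "up")


def parse_llm_signal(raw: str) -> Tuple[str, float]:
    """Parse and validate a reply such as {"trend": "up", "probability": 0.7}."""
    data = json.loads(raw)
    trend = str(data["trend"]).strip().lower()
    probability = float(data["probability"])
    if trend not in ALLOWED_TRENDS:
        raise ValueError(f"unexpected trend label: {trend!r}")
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    return trend, probability
```

Rejecting malformed replies at this point keeps invalid labels or out-of-range probabilities from reaching the forecasting model as features.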