Adaptive stock price volatility forecasting using enhanced kMedoids clustering with attention based LSTM for day trading

Journal: Discover Computing, Volume: 29, Issue: 1
DOI: https://doi.org/10.1007/s10791-026-10079-z
Publication Date: 2026-03-30
Author(s): Zhenyun Du et al.
Primary Topic: Stock Market Forecasting Methods

Overview

The prediction of stock price movements remains a significant challenge for traders and researchers due to the complex, non-linear behavior of the stock market. Despite extensive research, short-term stock price forecasting has proven to be particularly difficult, as market reactions are often unpredictable. This paper proposes a novel framework that enhances traditional forecasting methods by integrating silhouette-based clustering with a dual-output temporal learning model, which accounts for both expected returns and short-term volatilities.

The proposed framework consists of three main components: kMedoids clustering utilizing a Silhouette criterion for optimal cluster determination, an LSTM component for identifying sequential market patterns, and an attention mechanism that highlights the most relevant time steps for forecasting. Empirical tests using historical and live data from the BSE/NSE stock exchanges demonstrate that this integrated approach significantly improves forecasting accuracy compared to existing methods such as LSTM, GRU, and ARIMA. Notably, the model generates actionable BUY, SELL, and HOLD recommendations, enabling real-time trading decisions. The findings suggest that combining dynamic clustering with attention mechanisms in deep learning models enhances accuracy and consistency, providing valuable insights for both end-users and developers. Future enhancements could include the integration of sentiment analysis and global stock indices to further improve real-time effectiveness.
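The attention component described above can be illustrated with a minimal sketch. This summary does not state which scoring function the paper uses, so plain dot-product attention is assumed here; the function name `attention_pool`, the `query` vector, and the toy hidden states are hypothetical and stand in for the outputs of a trained LSTM.

```python
import math

def attention_pool(hidden_states, query):
    """Score each time step against a query, softmax the scores,
    and return the weights plus the weighted sum of hidden states."""
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    max_score = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return weights, context

# Toy hidden states for four time steps (hypothetical values, not from the paper).
hidden_states = [[0.1, 0.0], [0.2, 0.1], [0.9, 0.8], [0.3, 0.2]]
query = [1.0, 1.0]  # assumed query vector; in practice this would be learned

weights, context = attention_pool(hidden_states, query)
for step, weight in enumerate(weights):
    print(f"time step {step}: attention weight {weight:.3f}")
```

The weights sum to one, so they can be read as the relative importance the model assigns to each time step, which is the property the summary credits for improved interpretability.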

Methods

The proposed methodology introduces a novel layered architecture for stock price prediction that integrates dynamic clustering with a dual-output Long Short-Term Memory (LSTM) model. This approach first organizes stocks based on behavioral similarities through a clustering stage, utilizing an enhanced k-Medoids algorithm to extract medoid series that represent cluster behavior. By focusing on these medoids rather than individual stock time series, the model reduces idiosyncratic noise and enhances pattern recognition, leading to improved predictive accuracy and reduced overfitting. Experimental results demonstrate that this architecture yields lower Normalized Root Mean Square Error (NRMSE) and variance, with a notable average accuracy improvement of approximately 9% compared to traditional methods.
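The clustering stage can be sketched as follows. The details of the paper's enhanced k-Medoids variant are not given in this summary, so the sketch assumes a plain k-medoids objective solved by exhaustive search (workable only for small universes such as a handful of stocks), Euclidean distance between return series, and the mean silhouette coefficient for choosing k. All function names and the toy series are hypothetical.

```python
import math
from itertools import combinations

def distance(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign(series, medoids):
    """Label each series with the position of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda j: distance(s, series[medoids[j]]))
            for s in series]

def total_cost(series, medoids):
    """Sum of distances from every series to its nearest medoid."""
    return sum(min(distance(s, series[m]) for m in medoids) for s in series)

def k_medoids(series, k):
    """Brute-force k-medoids: try every medoid set and keep the cheapest."""
    best = min(combinations(range(len(series)), k),
               key=lambda meds: total_cost(series, meds))
    return list(best), assign(series, best)

def silhouette(series, labels):
    """Mean silhouette coefficient; singleton clusters score 0 by convention."""
    n = len(series)
    scores = []
    for i in range(n):
        same = [j for j in range(n) if labels[j] == labels[i] and j != i]
        if not same:
            scores.append(0.0)
            continue
        a = sum(distance(series[i], series[j]) for j in same) / len(same)
        b = min(
            sum(distance(series[i], series[j]) for j in range(n) if labels[j] == c)
            / labels.count(c)
            for c in set(labels) if c != labels[i]
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / n

def select_k(series, k_values):
    """Pick the k whose clustering has the highest mean silhouette."""
    results = {}
    for k in k_values:
        medoids, labels = k_medoids(series, k)
        results[k] = (silhouette(series, labels), medoids, labels)
    best_k = max(results, key=lambda k: results[k][0])
    return best_k, results[best_k]

# Toy return series for six hypothetical stocks forming two behavioral groups.
series = [
    [0.010, 0.020, 0.010], [0.012, 0.018, 0.011], [0.009, 0.021, 0.012],
    [-0.030, 0.050, -0.040], [-0.028, 0.052, -0.041], [-0.031, 0.049, -0.038],
]
best_k, (score, medoids, labels) = select_k(series, [2, 3])
print(best_k, round(score, 3), medoids, labels)
```

The series at the returned medoid indices are the representative "medoid series" that would be passed to the forecasting model in place of each individual stock.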

The empirical study employs a dataset from the Bombay Stock Exchange (BSE) and the National Stock Exchange (NSE), encompassing eleven actively traded stocks over a period from 2018 to 2025. The model’s performance is rigorously evaluated through various metrics, including NRMSE, Normalized Mean Absolute Error (NMAE), and the Sharpe Ratio, confirming its robustness and practical applicability. A paired t-test indicates statistically significant improvements over baseline models, with the proposed framework achieving an NRMSE of 0.102. The results underscore the effectiveness of the clustering, attention, and dual-output components, each contributing uniquely to the model’s performance. Future research directions include testing the framework on global indices and enhancing methodological transparency through advanced statistical tools.
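For readers unfamiliar with the evaluation metrics named above, the following is a minimal sketch of how they are commonly computed. The summary does not specify the normalization used, so dividing by the range of the actual values is an assumption (dividing by the mean is another common choice), as are the zero risk-free rate and the 252-period annualization in the Sharpe ratio. The sample prices and returns are made up for illustration.

```python
import math
import statistics

def nrmse(actual, predicted):
    """RMSE divided by the range of the actual values (assumed convention)."""
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse) / (max(actual) - min(actual))

def nmae(actual, predicted):
    """MAE divided by the range of the actual values (assumed convention)."""
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
    return mae / (max(actual) - min(actual))

def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized mean excess return per unit of sample standard deviation."""
    excess = [r - risk_free for r in returns]
    return (statistics.mean(excess) / statistics.stdev(excess)
            * math.sqrt(periods_per_year))

# Hypothetical prices and per-period strategy returns, not data from the paper.
actual = [100.0, 102.0, 101.0, 105.0, 110.0]
predicted = [101.0, 101.0, 102.0, 104.0, 108.0]
returns = [0.010, -0.005, 0.007, 0.002, -0.001]

nrmse_value = nrmse(actual, predicted)
nmae_value = nmae(actual, predicted)
sharpe_value = sharpe_ratio(returns)
print(f"NRMSE={nrmse_value:.4f}  NMAE={nmae_value:.4f}  Sharpe={sharpe_value:.2f}")
```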

Results

The results of the study demonstrate the effectiveness of a novel framework that integrates adaptive clustering, an attention-driven Long Short-Term Memory (LSTM) architecture, and a dual-output prediction model for short-term stock price forecasting. This model adapts to varying market conditions and accurately identifies key factors influencing price fluctuations. Live testing within the Bombay Stock Exchange (BSE) and National Stock Exchange (NSE) frameworks revealed consistent reliability, indicating its applicability for intraday and daily trading. While the model shows promise, it could be further enhanced by incorporating external parameters, news sentiment, and macroeconomic variables.

Statistical analyses, including paired t-tests and Wilcoxon signed-rank tests, confirmed the model’s superior performance compared to baseline LSTM and kMedoids+LSTM models at a significance level of \( p < 0.05 \). The hybrid ekMedoids+LSTM model exhibited lower variance in prediction errors and supported better decisions in trading strategies, achieving a higher cumulative return variance of \( 0.0004 \), a Sharpe ratio of \( 1.09 \) (an increase of 12 to 18%), and a maximum drawdown of \(-9.3\%\). Additionally, the model achieved a mean absolute percentage error (MAPE) of \( 4.93\% \) and an \( R^2 \) value of \( 0.92 \), indicating a close fit between predicted and actual price movements. Overall, the findings suggest that the enhanced performance of the ekMedoids-Attention LSTM model translates into tangible financial benefits, affirming its potential as a robust tool for short-term stock prediction.
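The forecast-quality and risk figures reported above follow standard definitions, sketched below under stated assumptions: MAPE is expressed in percent, \( R^2 \) is the coefficient of determination, and maximum drawdown is the largest peak-to-trough decline of an equity curve as a fraction of the peak. The inputs are hypothetical and do not reproduce the paper's numbers.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be nonzero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def max_drawdown(equity):
    """Most negative fractional drop from a running peak (0.0 if none)."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = min(worst, (value - peak) / peak)
    return worst

# Hypothetical prices and equity curve, for illustration only.
actual = [100.0, 102.0, 101.0, 105.0, 110.0]
predicted = [101.0, 101.0, 102.0, 104.0, 108.0]
equity = [100.0, 110.0, 99.0, 105.0, 120.0, 108.0]

mape_value = mape(actual, predicted)
r2_value = r_squared(actual, predicted)
drawdown_value = max_drawdown(equity)
print(f"MAPE={mape_value:.2f}%  R^2={r2_value:.3f}  max drawdown={drawdown_value:.1%}")
```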

Discussion

In this section, the authors discuss their innovative approach to financial time series forecasting, which integrates a silhouette-guided k-Medoids clustering technique with a dual-output LSTM model enhanced by an attention mechanism. This methodology distinguishes itself from existing models by simultaneously predicting both stock prices and volatility, thereby providing traders with critical risk assessment tools. The literature review highlights the growing importance of deep learning techniques, such as LSTMs and hybrid models like CNN-LSTM and Transformers, in stock prediction, while emphasizing the significance of incorporating sentiment analysis and adaptive clustering to improve forecasting accuracy.

The proposed model consists of four layers: data preprocessing, adaptive clustering, dual-output LSTM with attention, and trading decision support. The preprocessing stage normalizes and prepares historical stock data, while the adaptive clustering layer employs an enhanced k-Medoids method to identify structurally similar stocks, optimizing the input for the LSTM. The dual-output LSTM forecasts both price and volatility, utilizing an attention mechanism to focus on relevant time steps, which enhances interpretability and predictive performance. Finally, the trading decision support layer translates predictions into actionable signals (BUY, SELL, HOLD) based on a volatility-adjusted threshold, ensuring robust trading strategies. The model demonstrates superior performance across various time scales, achieving an average reduction in prediction error and improving trading signal accuracy compared to baseline models, thus validating the effectiveness of the proposed framework.
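The decision layer described above can be sketched as a simple rule. The summary does not give the paper's exact threshold formula, so this sketch assumes the threshold is a multiple `k` of the predicted volatility and that the signal compares the predicted return against it. The function name `trading_signal` and the default `k=0.5` are hypothetical.

```python
def trading_signal(last_price, predicted_price, predicted_volatility, k=0.5):
    """Map a price forecast and a volatility forecast to BUY, SELL, or HOLD.

    The expected return must exceed k times the predicted volatility
    (an assumed rule) before a directional signal is emitted.
    """
    expected_return = (predicted_price - last_price) / last_price
    threshold = k * predicted_volatility
    if expected_return > threshold:
        return "BUY"
    if expected_return < -threshold:
        return "SELL"
    return "HOLD"

# Hypothetical forecasts: same volatility, three different predicted prices.
signals = [
    trading_signal(100.0, 102.0, 0.02),
    trading_signal(100.0, 100.5, 0.02),
    trading_signal(100.0, 98.0, 0.02),
]
print(signals)
```

Scaling the threshold with predicted volatility widens the HOLD band in turbulent periods, so small forecast moves that are likely noise do not trigger trades.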