RCSAN residual enhanced channel spatial attention network for stock price forecasting

Journal: Scientific Reports, Volume: 15, Issue: 1
DOI: https://doi.org/10.1038/s41598-025-06885-y
PMID: https://pubmed.ncbi.nlm.nih.gov/40596213
Publication Date: 2025-07-01
Author(s): Wenjie Sun et al.
Primary Topic: Stock Market Forecasting Methods

Overview

This study introduces the Residual-Enhanced Channel-Spatial Attention Network (R-CSAN) for stock price prediction. The model combines adaptive channel-spatial attention with residual connections to capture complex patterns in financial time series. It uses an encoder-decoder structure: the encoder stacks several attention modules to extract feature correlations from historical data, while the decoder applies a masking mechanism to prevent leakage of future information and a cross-attention mechanism to capture inter-market correlations. In experiments on Amazon, Moutai, Ping An, and Vanke stock datasets, R-CSAN outperforms traditional models (ARIMA, LSTM, CNN-LSTM) and recent Transformer-based models (Informer, Autoformer, iTransformer), with RMSE reductions of 17.3% to 49.3% and a maximum R² of 93.17%.
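The decoder's masking step can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `causal_attention_weights` and the toy score matrix are assumptions made for illustration. It shows only the standard causal-mask idea, in which the attention weights for time step i are computed over steps 0..i, so later steps receive zero weight.

```python
import math

def causal_attention_weights(scores):
    """Row-wise softmax over scores[i][j] with future positions (j > i) masked out.

    `scores` is a square list of lists; the name and layout are illustrative
    assumptions, not taken from the paper.
    """
    n = len(scores)
    weights = []
    for i in range(n):
        # Only positions 0..i are visible to time step i.
        visible = scores[i][: i + 1]
        peak = max(visible)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in visible]
        total = sum(exps)
        # Masked (future) positions get exactly zero weight.
        row = [e / total for e in exps] + [0.0] * (n - i - 1)
        weights.append(row)
    return weights

# Toy 3-step score matrix (made-up numbers).
scores = [[0.0, 1.0, 2.0],
          [1.0, 1.0, 5.0],
          [0.5, 0.5, 0.5]]
w = causal_attention_weights(scores)
for row in w:
    print([round(v, 3) for v in row])
```

Even though step 1 has a large raw score for step 2 (5.0), the mask removes it, so the prediction for step 1 cannot draw on information from the future.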

The model’s design addresses limitations of existing forecasting approaches by integrating both temporal and feature interactions within a unified framework. Ablation studies confirm the importance of each component, revealing that the removal of the temporal module increases RMSE by 38.6%, while the absence of channel-spatial attention results in a 21.3% increase. R-CSAN also provides interpretability through attention weight visualization, which aids in understanding feature significance and critical time periods for predictions. While R-CSAN shows promise for practical applications in trading strategies, it has limitations, including a lack of optimization for high-frequency trading and sensitivity to unexpected macroeconomic events. Future research directions include the incorporation of Graph Neural Networks, sentiment analysis, and reinforcement learning frameworks to enhance the model’s adaptability and responsiveness.

Methods

This section reviews traditional methods for financial time series prediction. Statistical models such as AutoRegressive Integrated Moving Average (ARIMA) and Generalized AutoRegressive Conditional Heteroskedasticity (GARCH) struggle with high-dimensional, non-stationary data in volatile markets. Classic machine learning algorithms, including Support Vector Machines (SVM), Random Forests, and Gradient Boosting Trees, also face challenges in feature engineering and in capturing long-term dependencies. These limitations motivate deep learning methods, which can learn features automatically and better model complex relationships in financial data.

The experimental results show that the proposed R-CSAN model outperforms the baseline models across all four stock datasets. R-CSAN consistently achieves the lowest Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE), along with the highest R² scores. For instance, on the Amazon dataset, R-CSAN’s RMSE of 18.73 is well below that of ARIMA (35.48) and the other models. The model also exhibits strong cross-market generalization, performing well on both U.S. and Chinese stocks, and demonstrates substantial investment returns, with a maximum return of 482.64%. These findings support the effectiveness of integrating residual connections with channel-spatial attention mechanisms in capturing temporal dynamics and feature interactions in financial time series.
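For reference, the four error metrics named above can be computed as in the following sketch, which uses their standard textbook definitions. The function names and the toy price series are assumptions made for illustration and do not come from the paper. The last line applies the same arithmetic to the two Amazon RMSE values quoted above: going from 35.48 (ARIMA) to 18.73 (R-CSAN) is a relative reduction of about 47.2%.

```python
import math

def forecast_metrics(actual, predicted):
    """Return RMSE, MAE, MAPE (in percent) and R^2 for two equal-length series."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        # MAPE is undefined if any actual value is zero; prices are assumed positive.
        "mape": 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }

def relative_reduction(baseline, model):
    """Percentage by which `model` is lower than `baseline`."""
    return 100.0 * (baseline - model) / baseline

# Toy data (made up for illustration, not from the paper).
actual = [100.0, 102.0, 104.0, 106.0]
predicted = [101.0, 101.0, 105.0, 105.0]
metrics = forecast_metrics(actual, predicted)
print(metrics)

# RMSE values quoted in the text for the Amazon dataset.
print(round(relative_reduction(35.48, 18.73), 1))
```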

Results

Visualization of R-CSAN’s predictions on the four stock datasets (Amazon, Moutai, Ping An, and Vanke) demonstrates its effectiveness in financial time series forecasting. As illustrated in Figure 3, the model’s predicted values (green line) closely follow the actual stock prices (blue line), indicating high prediction accuracy and robustness. This alignment suggests that the channel-spatial adaptive attention mechanism successfully captures the complex patterns inherent in stock price fluctuations.

Key findings from the visualization include the model’s adaptability to varying market conditions and industry characteristics, as evidenced by its performance on both the volatile Moutai stock and the more stable Amazon stock. The R-CSAN model effectively tracks price trends across different environments, showcasing its cross-market generalization ability. Furthermore, the model demonstrates sensitivity to both micro fluctuations and overall trends, particularly in the Vanke and Ping An datasets, where it accurately captures subtle local price changes. Despite variations in amplitude and frequency of price fluctuations among the datasets, the model maintains a high level of prediction consistency, validating the effectiveness of its hybrid attention mechanism and residual enhancement architecture. Overall, the results indicate that the R-CSAN model holds significant promise for real-world financial forecasting applications, providing valuable insights for investment decision support systems.

Discussion

The paper’s discussion highlights the transformative impact of deep learning techniques on financial forecasting, particularly Long Short-Term Memory networks (LSTMs) and hybrid models such as CNN-LSTM. LSTMs have been shown to outperform traditional methods in stock market prediction by addressing vanishing gradients and capturing long-term dependencies. Attention mechanisms, particularly in Transformer architectures, have further improved model performance by allowing dynamic focus on critical information within time series data. However, existing attention mechanisms often fall short in modeling the complex interdependencies among financial indicators, which are crucial for accurate forecasting.

To address these limitations, the paper introduces the Residual-Enhanced Channel-Spatial Attention Network (R-CSAN), which employs a novel channel-spatial adaptive attention mechanism. This model simultaneously captures both temporal and feature dimensions, enhancing the understanding of intricate financial data dynamics. The architecture incorporates residual connections to mitigate gradient vanishing issues, allowing for deeper network structures that can learn complex patterns effectively. The R-CSAN model’s innovative design not only improves prediction accuracy but also provides a more nuanced understanding of the relationships among various financial indicators, thereby offering valuable tools for investors and researchers in the field of financial forecasting.
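The idea of gating both the feature (channel) and time (spatial) dimensions and then adding a residual connection can be sketched as follows. This is a deliberately simplified illustration and not the paper's formulation: the function `channel_spatial_residual`, the parameter-free sigmoid gates, and the toy input are all assumptions. A trained network would compute the gates with learned weights.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def channel_spatial_residual(x):
    """Apply a channel gate, then a temporal gate, then a residual connection.

    x: list of T time steps, each a list of F feature values.
    The gates here have no learned parameters (illustrative assumption).
    """
    t_len, f_len = len(x), len(x[0])
    # Channel gate: one weight per feature, from that feature's mean over time.
    channel = [sigmoid(sum(row[f] for row in x) / t_len) for f in range(f_len)]
    scaled = [[row[f] * channel[f] for f in range(f_len)] for row in x]
    # Spatial gate (here the time axis): one weight per time step,
    # from the mean of the channel-scaled features at that step.
    spatial = [sigmoid(sum(row) / f_len) for row in scaled]
    attended = [[scaled[t][f] * spatial[t] for f in range(f_len)]
                for t in range(t_len)]
    # Residual connection: the input is added back to the attended signal,
    # so the block can only refine the input rather than replace it.
    out = [[x[t][f] + attended[t][f] for f in range(f_len)]
           for t in range(t_len)]
    return out, channel, spatial

# Toy input: 3 time steps, 2 normalized features (made-up numbers).
x = [[0.2, -0.1],
     [0.4, 0.0],
     [0.6, 0.1]]
out, channel, spatial = channel_spatial_residual(x)
print(out)
```

Because the input is added back unchanged, gradients can flow through the identity path even when the gates saturate, which is the usual motivation for residual connections in deeper stacks.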