Sentiment trading with large language models

Journal: Finance Research Letters, Volume: 62
DOI: https://doi.org/10.1016/j.frl.2024.105227
Publication Date: 2024-03-15
Author(s): Kemal Kirtac et al.
Primary Topic: Stock Market Forecasting Methods

Overview

In this study, we evaluate large language models (LLMs), namely OPT, BERT, and FinBERT, against the traditional Loughran-McDonald dictionary in sentiment analysis of 965,375 U.S. financial news articles from 2010 to 2023. Our results indicate that OPT, an open GPT-3-style model, significantly outperforms the alternatives, predicting stock returns with an accuracy of 74.4%. A long-short investment strategy based on OPT, after a transaction cost of 10 basis points, yields a Sharpe ratio of 3.05 and a 355% return from August 2021 to July 2023, far exceeding the performance of traditional strategies and market portfolios.

These findings matter for the financial industry because they point to a substantial role for LLMs in market prediction and investment decision-making. Traditional sentiment analysis methods, such as those based on the Loughran-McDonald dictionary, show limited forecasting ability, whereas advanced models like OPT provide superior predictive power. The research informs asset managers and institutional investors about the advantages of LLMs in forecasting stock trends, and it raises considerations for regulators regarding the integration of AI in financial markets. We advocate further exploration and development of models tailored to the demands of the finance sector.

Introduction

The introduction of this research paper highlights the emerging integration of text mining techniques into financial models, emphasizing the potential of various text data sources, such as financial news and social media, to enhance economic predictions. Despite the promising landscape, the field remains nascent, with most studies relying on simplistic sentiment analysis methods, primarily using finance-specific dictionaries like the Loughran-McDonald master dictionary. The authors argue that the inherent complexity of language and the unstructured nature of text data necessitate more sophisticated models, particularly large language models (LLMs), to extract deeper insights from financial texts.

The paper aims to leverage LLMs, specifically Bidirectional Encoder Representations from Transformers (BERT) and Open Pre-trained Transformers (OPT), to improve sentiment analysis and stock return predictions. By employing these advanced models, the authors seek to address the limitations of traditional dictionary-based approaches, which often oversimplify text and overlook contextual relationships. The research methodology involves a two-step process: converting text into numerical representations and modeling economic patterns, with a focus on the predictive power of LLM-derived sentiment scores. The study also explores practical applications through the development of trading strategies based on these sentiment scores, thereby contributing to the broader discourse on the intersection of text processing, machine learning, and financial research.

Methods

In this study, we fine-tune pre-trained language models, specifically BERT and OPT, sourced from Hugging Face, so that they can generate a sentiment index from financial news for forecasting stock returns. Fine-tuning recalibrates the models' parameters for this task. FinBERT, a variant of BERT pre-trained on financial texts, and the Loughran-McDonald dictionary are used without further fine-tuning because they were already built for financial text. Our methodology employs a probing technique for feature extraction. Sentiment labels come from the aggregated three-day excess return of a stock following news publication: a positive return yields a label of ‘1’, and a non-positive return yields ‘0’.
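The labeling rule above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the excess return is the stock return minus a market return and that "aggregated" means a simple sum over the three days, neither of which is spelled out in this summary. The function names and the toy numbers are hypothetical.

```python
from typing import Sequence


def three_day_excess_return(stock_returns: Sequence[float],
                            market_returns: Sequence[float]) -> float:
    """Sum of daily (stock - market) returns over the three days after the news.

    Assumption: excess return is measured against a market return and
    aggregated by summation.
    """
    if len(stock_returns) != 3 or len(market_returns) != 3:
        raise ValueError("expected exactly three daily returns for each series")
    return sum(s - m for s, m in zip(stock_returns, market_returns))


def sentiment_label(excess_return: float) -> int:
    """Return 1 for a positive excess return, 0 otherwise (zero is non-positive)."""
    return 1 if excess_return > 0 else 0


# Toy numbers for illustration only; they are not taken from the study.
good_news = three_day_excess_return([0.012, 0.004, -0.001], [0.003, 0.001, 0.002])
bad_news = three_day_excess_return([-0.010, 0.002, 0.001], [0.001, 0.000, 0.002])
labels = [sentiment_label(good_news), sentiment_label(bad_news), sentiment_label(0.0)]
print(labels)
```

Treating a zero excess return as ‘0’ follows the "non-positive" wording in the text.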

We evaluate the predictive accuracy of the models using statistical measures such as accuracy, precision, recall, specificity, and the F1 score, alongside a regression analysis to investigate the influence of language model scores on subsequent stock returns. The regression model is structured as \( r_{i,n+1} = a_i + b_n + \gamma \cdot x_{i,n} + \epsilon_{i,n} \), where \( r_{i,n+1} \) represents the return of stock \( i \) on the next trading day, and \( x_{i,n} \) is a vector of language model scores. To assess practical outcomes, we implement trading strategies based on sentiment scores, constructing long, short, and long-short portfolios that are dynamically updated daily. Each portfolio is adjusted according to the latest sentiment data, with trades executed in alignment with news release timings and transaction costs factored in to simulate real-world trading conditions.
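To make the regression concrete, the sketch below estimates \( \gamma \) for a single scalar score. It reads \( a_i \) and \( b_n \) as stock and day fixed effects, which is an interpretation rather than something this summary states, and it assumes a balanced panel so that the two-way within transformation is exact. The panel values are synthetic and the function names are hypothetical.

```python
from collections import defaultdict


def _mean(values):
    values = list(values)
    return sum(values) / len(values)


def two_way_demean(panel):
    """Within-transform a balanced panel given as {(stock, day): value}.

    Subtracting the stock mean and the day mean and adding back the grand
    mean removes additive stock and day effects.
    """
    by_stock, by_day = defaultdict(list), defaultdict(list)
    for (stock, day), value in panel.items():
        by_stock[stock].append(value)
        by_day[day].append(value)
    grand = _mean(panel.values())
    stock_mean = {s: _mean(v) for s, v in by_stock.items()}
    day_mean = {d: _mean(v) for d, v in by_day.items()}
    return {(s, d): v - stock_mean[s] - day_mean[d] + grand
            for (s, d), v in panel.items()}


def estimate_gamma(scores, next_day_returns):
    """OLS slope of demeaned next-day returns on demeaned scores."""
    x = two_way_demean(scores)
    r = two_way_demean(next_day_returns)
    return sum(x[k] * r[k] for k in x) / sum(x[k] ** 2 for k in x)


# Synthetic panel: returns are built as a_i + b_n + 0.5 * score, with no noise.
scores = {("A", 0): 0.9, ("A", 1): 0.2, ("A", 2): 0.5,
          ("B", 0): 0.1, ("B", 1): 0.8, ("B", 2): 0.4,
          ("C", 0): 0.6, ("C", 1): 0.3, ("C", 2): 0.7}
a = {"A": 0.01, "B": -0.02, "C": 0.0}
b = {0: 0.005, 1: -0.003, 2: 0.001}
returns = {(s, d): a[s] + b[d] + 0.5 * x for (s, d), x in scores.items()}

gamma = estimate_gamma(scores, returns)
print(round(gamma, 6))
```

Because the synthetic returns contain no noise, the estimate recovers the slope used to build them up to floating-point error. With a vector of scores or an unbalanced panel, a dedicated panel-regression routine would be the more appropriate tool.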

Results

Among the models compared, OPT attains the highest accuracy in predicting stock returns, 74.4%, ahead of BERT, FinBERT, and the Loughran-McDonald dictionary. In the regression analysis, OPT scores are strongly associated with next-day stock returns, and FinBERT and BERT scores also show significant predictive power, whereas the dictionary-based measure shows limited forecasting ability.

In the portfolio tests, the long-short strategy built on OPT scores yields a 355% return and a Sharpe ratio of 3.05 from August 2021 to July 2023 after a transaction cost of 10 basis points. The corresponding Loughran-McDonald strategy returns 0.91%.

Discussion

In this research, we used two primary datasets: daily stock returns from the Center for Research in Security Prices (CRSP) and global news articles from Refinitiv, focusing on U.S.-based companies. The CRSP dataset provided stock prices, trading volumes, and market capitalizations, while the Refinitiv dataset included 2,732,845 articles, filtered down to 965,375 unique articles relevant to individual stocks. Our analysis spanned January 1, 2010, to June 30, 2023, and explored the relationship between stock market returns and sentiment scores derived from large language models (LLMs), including OPT, BERT, and FinBERT. We found that the OPT model exhibited the highest accuracy in predicting stock returns based on sentiment, outperforming traditional methods like the Loughran-McDonald dictionary.
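Accuracy comparisons of this kind rest on the confusion-matrix measures named in Methods (accuracy, precision, recall, specificity, and F1). The sketch below computes them for binary labels; the label vectors are made up for illustration and the function name is hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Compute standard binary classification metrics from 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(numerator, denominator):
        # Return 0.0 instead of dividing by zero when a class is absent.
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Made-up labels: 1 = positive three-day excess return, 0 = non-positive.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Reporting specificity alongside recall shows whether a model is accurate on both positive and non-positive return days, not only on one class.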

The predictive capabilities of the LLMs were further assessed through regression analysis, which showed that OPT scores were strongly associated with next-day stock returns, while FinBERT and BERT also demonstrated significant predictive power. Our findings indicated that the performance of these models is influenced by factors such as model design and training data specificity. Additionally, we constructed sentiment-based portfolios, where the long-short strategy based on OPT scores yielded a 355% return, significantly outperforming traditional market portfolios and the Loughran-McDonald dictionary strategy, which achieved a return of only 0.91%. These results underscore the advantages of employing advanced LLMs in financial sentiment analysis and portfolio management, and they suggest a shift in investment strategies within the financial industry.
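A daily long-short strategy of the kind discussed above can be sketched as follows. This is an illustrative simplification, not the authors' implementation: it assumes equal weights, goes long the k highest-scored and short the k lowest-scored stocks, deducts a flat 10 basis points from each day's return, and annualizes the Sharpe ratio with 252 trading days and a zero risk-free rate. Tickers, scores, and returns are invented.

```python
import math
import statistics

COST = 10 / 10_000  # 10 basis points, the cost level quoted in the text


def long_short_return(scores, next_returns, k=1, cost=COST):
    """One day's return: long the k highest scores, short the k lowest.

    Assumes equal weights within each leg, at least 2 * k stocks, and a
    flat daily cost deduction (a simplification of real cost accounting).
    """
    ranked = sorted(scores, key=scores.get)  # tickers, ascending by score
    shorts, longs = ranked[:k], ranked[-k:]
    long_leg = statistics.mean(next_returns[t] for t in longs)
    short_leg = statistics.mean(next_returns[t] for t in shorts)
    return long_leg - short_leg - cost


def cumulative_return(daily_returns):
    """Compound a sequence of daily returns into a total return."""
    total = 1.0
    for r in daily_returns:
        total *= 1.0 + r
    return total - 1.0


def annualized_sharpe(daily_returns, periods=252):
    """Mean over sample standard deviation, scaled by sqrt(periods)."""
    return (statistics.mean(daily_returns)
            / statistics.stdev(daily_returns) * math.sqrt(periods))


# Invented data: (sentiment scores, next-day returns) for three days.
days = [
    ({"AAA": 0.9, "BBB": 0.4, "CCC": 0.1}, {"AAA": 0.020, "BBB": 0.000, "CCC": -0.010}),
    ({"AAA": 0.2, "BBB": 0.7, "CCC": 0.5}, {"AAA": -0.005, "BBB": 0.010, "CCC": 0.002}),
    ({"AAA": 0.6, "BBB": 0.3, "CCC": 0.8}, {"AAA": 0.004, "BBB": 0.006, "CCC": 0.001}),
]
daily = [long_short_return(scores, rets) for scores, rets in days]
print(daily, cumulative_return(daily), annualized_sharpe(daily))
```

Three days of data are far too few for a meaningful Sharpe ratio; the sample only shows how the pieces fit together.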