Advancing Fake News Detection: Hybrid Deep Learning With FastText and Explainable AI

Journal: IEEE Access, Volume: 12
DOI: https://doi.org/10.1109/access.2024.3381038
Publication Date: 2024-01-01
Author(s): Ehtesham Hashmi et al.
Primary Topic: Misinformation and Its Impacts

Overview

The research addresses misinformation on social media by proposing a framework for fake news detection. Recognizing the trust individuals place in social networks, the study emphasizes the need for effective detection methods that leverage multimedia elements. Using three publicly available datasets (WELFake, FakeNewsNet, and FakeNewsPrediction), the authors integrated FastText word embeddings with various Machine Learning (ML) and Deep Learning (DL) techniques. Their findings show that a hybrid model combining Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks, enhanced with FastText embeddings, achieved the best classification performance, with accuracy and F1-scores of 0.99, 0.97, and 0.99 on the three datasets, respectively. Additionally, transformer-based models such as BERT, XLNet, and RoBERTa handled syntactic complexity well, which improved semantic interpretation.

In conclusion, the study not only establishes a comprehensive framework for fake news detection but also highlights the importance of explainable AI through techniques such as Local Interpretable Model-Agnostic Explanations (LIME) and Latent Dirichlet Allocation (LDA) for transparency in decision-making. Future work will focus on expanding the detection framework to multiple languages, particularly those with limited resources, by exploring multilingual transformers like mBERT and mT5. The authors also plan to incorporate adversarial training methods to enhance the robustness of their model against the pervasive challenge of fake news dissemination globally.

Introduction

In the introduction, the authors discuss the significant shift from traditional media to digital platforms, particularly social media, as primary sources of information. This transition has facilitated the rapid spread of fake news, defined as intentionally false information, which threatens democratic systems by undermining public trust in institutions and influencing critical societal issues such as elections and public opinion. The authors highlight the prevalence of fake news during the 2016 U.S. presidential election, when, by their account, approximately 19 million bot accounts were created to disseminate misinformation about candidates, illustrating the urgent need for effective detection methods.

To combat the challenges posed by fake news, the paper emphasizes the importance of employing Artificial Intelligence (AI) and Natural Language Processing (NLP) techniques for automated detection. The authors propose using Machine Learning (ML) and Deep Learning (DL) methods, including transformer-based models and FastText word embeddings, to analyze and classify misinformation across various datasets. Additionally, the integration of Explainable AI (XAI) methods, such as LIME, is presented as a means to enhance transparency and accountability in AI-driven solutions, allowing for a better understanding of the factors influencing model predictions. The introduction sets the stage for the subsequent sections, which outline the contributions of the research and the organization of the paper.
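
As background on the FastText embeddings mentioned above: FastText's published design can represent a word through its character n-grams, which helps with misspelled or unseen words in social media text. The sketch below shows only that subword extraction step. The function name `char_ngrams` is hypothetical, the n-gram range is an illustrative choice, and this summary does not state whether the authors enabled subword n-grams.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of `word`, FastText style.

    The word is wrapped in '<' and '>' boundary markers so that prefixes
    and suffixes can be told apart from word-internal n-grams.
    """
    wrapped = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        # Slide a window of width n across the wrapped word.
        for start in range(len(wrapped) - n + 1):
            grams.append(wrapped[start:start + n])
    return grams


# Trigrams only, to keep the output short.
print(char_ngrams("fake", min_n=3, max_n=3))  # ['<fa', 'fak', 'ake', 'ke>']
```

In the subword model, a word's vector is built by summing the vectors of n-grams like these, so two words that share many n-grams end up with related representations.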

Methods

The study follows a systematic, multi-step methodology, illustrated in Figure 1 of the paper, with each step elaborated in the paper's subsequent sections. The structured approach is intended to support rigorous research and reliable, valid findings, and the step-by-step breakdown aims to make the work clear and reproducible in future studies.

Results

In this section, the authors present a comprehensive evaluation of their proposed models for fake news detection, using standard performance metrics: accuracy, precision, recall, and F1-score. Among the classical machine learning (ML) classifiers, the Support Vector Machine (SVM) consistently performed best across the three datasets (WELFake, FakeNewsNet, and FakeNewsPrediction), achieving accuracy scores of 0.92, 0.97, and 0.91, respectively. The standalone deep learning (DL) models performed reliably but, as reported here, did not surpass the SVM. The hybrid CNN-LSTM model was the exception and emerged as the top performer overall, particularly when enhanced with supervised FastText embeddings, achieving accuracy and F1-scores of up to 0.99 across the datasets. This reflects its ability to capture both local features and long-term dependencies in text data.
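
For reference, the four metrics named above can be computed directly from the counts of a binary confusion matrix. The following is a minimal sketch with made-up labels; the function name `binary_metrics` and the sample data are hypothetical and are not taken from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels: 1 = fake, 0 = real (illustrative only).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(binary_metrics(y_true, y_pred))
```

Reporting F1 alongside accuracy matters when the fake and real classes are imbalanced, because accuracy alone can look high for a model that rarely predicts the minority class.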

The authors also report the computational cost of their models: execution time for the ML and DL models averaged around 5 minutes per epoch, while transformer-based models required approximately 15 minutes due to their complexity. The authors regard the CNN-LSTM model's superior performance as justifying its computational cost. Furthermore, the study highlights the significant improvements achieved with supervised FastText embeddings compared to unsupervised embeddings, with accuracy scores for various models increasing markedly. The results indicate that the proposed methods are robust and adaptable: they surpassed baseline studies and show potential for broader applicability across diverse datasets.

Discussion

In this study, we advance the detection of fake news by refining existing methodologies through regularization, optimization techniques, and hyperparameter tuning. Our approach employs three publicly available datasets (WELFake and two others from Kaggle), focusing on binary classification to distinguish between factual and fabricated information. We used a combination of supervised and unsupervised FastText embeddings integrated into various machine learning (ML) models, including Support Vector Machine (SVM), Decision Tree (DT), and ensemble classifiers such as Extreme Gradient Boosting (XGBoost). To enhance model performance and prevent overfitting, we rigorously tuned hyperparameters and applied regularization. Additionally, we implemented deep learning (DL) models such as Long Short-Term Memory (LSTM) and Convolutional Neural Networks (CNN), alongside state-of-the-art transformer models like BERT and RoBERTa, to capture complex contextual information and sequential dependencies in text data.
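
The hyperparameter tuning described above can be illustrated with an exhaustive grid search. This is a sketch under stated assumptions: the summary does not say which search strategy was used, the hyperparameter names and values (`dropout`, `l2`) are hypothetical, and `toy_validation_score` merely stands in for training a model and measuring its validation score.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every combination in `param_grid`; return (best_params, best_score)."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_score(params):
    # Hypothetical stand-in for "train with these settings, return validation F1".
    # It peaks at dropout=0.3 and l2=0.01 purely for illustration.
    return 1.0 - abs(params["dropout"] - 0.3) - 10 * abs(params["l2"] - 0.01)


grid = {"dropout": [0.1, 0.3, 0.5], "l2": [0.001, 0.01, 0.1]}
best_params, best_score = grid_search(grid, toy_validation_score)
print(best_params, best_score)
```

Scoring each combination on held-out validation data, rather than on the training set, is what lets the search favor regularization settings that reduce overfitting.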

To improve interpretability, particularly after identifying the CNN-LSTM model as the most effective, we applied Explainable AI (XAI) techniques, including Local Interpretable Model-Agnostic Explanations (LIME) and topic modeling via Latent Dirichlet Allocation (LDA). This comprehensive methodology not only aims to enhance the accuracy and generalizability of fake news detection but also addresses the limited exploration of XAI in this domain, thereby contributing to a clearer understanding of model decisions. The paper is structured to review existing research, detail our methodology, present results, and discuss interpretability, ultimately paving the way for future advancements in fake news detection.
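
To give a feel for perturbation-based explanation, the sketch below scores each word by how much the predicted "fake" probability drops when that word is removed. This is a simplified leave-one-word-out variant, not LIME itself: LIME samples many random perturbations and fits a weighted linear surrogate model to them. The names `word_importances` and `toy_predict_fake_proba`, and the keyword weights, are hypothetical; the toy scorer stands in for a trained classifier such as the CNN-LSTM.

```python
def word_importances(text, predict_fake_proba):
    """Rank words by the change in predicted probability when each is removed."""
    words = text.split()
    base = predict_fake_proba(" ".join(words))
    scores = []
    for i, word in enumerate(words):
        # Rebuild the text without the i-th word and re-score it.
        reduced = " ".join(words[:i] + words[i + 1:])
        scores.append((word, base - predict_fake_proba(reduced)))
    # Largest absolute effect first.
    return sorted(scores, key=lambda pair: abs(pair[1]), reverse=True)


def toy_predict_fake_proba(text):
    # Hypothetical keyword scorer standing in for a real model's probability output.
    cues = {"shocking": 0.4, "miracle": 0.3}
    tokens = text.lower().split()
    score = 0.1 + sum(weight for cue, weight in cues.items() if cue in tokens)
    return min(score, 1.0)


ranked = word_importances("Shocking miracle cure found by doctors",
                          toy_predict_fake_proba)
for word, delta in ranked:
    print(f"{word:10s} {delta:+.2f}")
```

A positive score means the word pushed the prediction toward "fake". With a real model, the same loop needs one forward pass per word, which is the cost that sampling-based methods like LIME trade against a more faithful local approximation.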