Variational Bayesian Personalized Ranking

Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
DOI: https://doi.org/10.1109/tpami.2026.3672705
PMID: https://pubmed.ncbi.nlm.nih.gov/41805520
Publication Date: 2026-01-01
Author(s): Bin Liu et al.
Primary Topic: Recommender Systems and Techniques

Overview

The paper introduces Variational Bayesian Personalized Ranking (VarBPR), a novel framework for implicit-feedback pairwise learning that targets three problems in collaborative filtering: sparse supervision, noisy interactions, and exposure bias. VarBPR reformulates the learning process as variational inference over discrete latent indexing variables, which allows noise and indexing uncertainty to be modeled explicitly. Training is divided into two stages: variational inference, which derives closed-form posteriors from a unified Evidence Lower Bound (ELBO) and regularization objective, and variational learning, which employs a posterior-compression objective to reduce the computational cost from polynomial to linear complexity.
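To make the notion of a closed-form posterior over a discrete latent variable concrete, here is a minimal generic sketch. It is not the paper's objective: the prior, the log-likelihood values, and the function names are hypothetical, chosen only for illustration. For any discrete latent index z, the ELBO is maximized by q*(z) proportional to p(z) p(x|z), and at that maximizer the ELBO equals the log evidence.

```python
import math

def optimal_posterior(prior, log_lik):
    """Closed-form ELBO maximizer for a discrete latent variable.

    q*(z) is proportional to prior[z] * exp(log_lik[z]), i.e. a softmax
    over log prior + log likelihood.
    """
    logits = [math.log(p) + ll for p, ll in zip(prior, log_lik)]
    m = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - m) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]

def elbo(q, prior, log_lik):
    """ELBO = E_q[log p(x|z)] - KL(q || prior) for a discrete z."""
    value = 0.0
    for qz, pz, ll in zip(q, prior, log_lik):
        if qz > 0.0:
            value += qz * (ll - math.log(qz / pz))
    return value

def log_evidence(prior, log_lik):
    """log p(x) = log sum_z prior[z] * exp(log_lik[z])."""
    return math.log(sum(p * math.exp(ll) for p, ll in zip(prior, log_lik)))

# Hypothetical toy values: three candidate latent indices.
prior = [0.5, 0.3, 0.2]
log_lik = [-1.0, -0.2, -2.5]

q_star = optimal_posterior(prior, log_lik)
uniform = [1.0 / 3.0] * 3
```

In this toy setting, `elbo(q_star, prior, log_lik)` matches `log_evidence(prior, log_lik)`, while any other distribution such as `uniform` yields a strictly smaller ELBO.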

The findings demonstrate that VarBPR not only improves ranking accuracy but also facilitates controlled exposure to long-tail items while maintaining the linear-time complexity characteristic of Bayesian Personalized Ranking (BPR). The theoretical contributions include interpretable generalization guarantees and a structural error analysis that highlights the opportunity cost associated with prioritizing specific exposure patterns. This framework provides a principled approach to designing recommender systems that are effective, efficient, and adaptable to various exposure policies, paving the way for future research on data-adaptive prior design and exposure-policy control.
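One simple way to quantify long-tail exposure is the share of recommendation slots filled by items outside the popularity head. The sketch below is an illustrative assumption, not a metric taken from the paper: the 20% head cutoff and the function names `long_tail_items` and `tail_exposure_share` are hypothetical.

```python
from collections import Counter

def long_tail_items(interactions, head_fraction=0.2):
    """Return the items outside the most popular `head_fraction` of items.

    `interactions` is an iterable of (user, item) pairs.
    """
    counts = Counter(item for _, item in interactions)
    ranked = [item for item, _ in counts.most_common()]
    head_size = max(1, int(len(ranked) * head_fraction))
    return set(ranked[head_size:])

def tail_exposure_share(recommendations, tail):
    """Fraction of recommended slots filled by long-tail items.

    `recommendations` maps each user to a list of recommended items.
    """
    slots = [item for rec_list in recommendations.values() for item in rec_list]
    if not slots:
        return 0.0
    return sum(item in tail for item in slots) / len(slots)

# Hypothetical toy log: item "a" is the popular head, the rest form the tail.
interactions = [
    ("u1", "a"), ("u2", "a"), ("u3", "a"), ("u4", "a"),
    ("u1", "b"), ("u2", "b"), ("u3", "b"),
    ("u1", "c"), ("u2", "c"),
    ("u4", "d"),
    ("u3", "e"),
]
tail = long_tail_items(interactions)
share = tail_exposure_share({"u1": ["a", "b"], "u2": ["a", "d"]}, tail)
```

Tracking such a share alongside a ranking-accuracy metric is one way to observe the trade-off between accuracy and exposure that the summary describes.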

Introduction

The introduction of the paper discusses the challenges and advancements in pairwise learning for implicit collaborative filtering, particularly in contexts where user feedback is indirect, such as e-commerce and social media. Traditional supervised learning methods are inadequate because explicit user preferences are unavailable, which prompted the development of Bayesian Personalized Ranking (BPR). BPR shifts the focus from rating prediction to ranking optimization by maximizing the likelihood of correctly ordered item pairs, thus aligning the training objective with ranking-based evaluation. However, BPR faces significant challenges, including parameter estimation difficulties due to incomplete data, noise from false positives and false negatives, and exposure bias that skews interaction data towards popular items.
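The pairwise likelihood that BPR maximizes can be sketched for a single (user, positive item, negative item) triple. This is a minimal illustration of the standard BPR loss, assuming dot-product scores over hypothetical embedding vectors; regularization and negative sampling are omitted.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def bpr_loss(user_vec, pos_item_vec, neg_item_vec):
    """Negative log-likelihood that the positive item outranks the negative one.

    loss = -log sigmoid(s_pos - s_neg), where s is the user-item dot product.
    """
    margin = dot(user_vec, pos_item_vec) - dot(user_vec, neg_item_vec)
    # Numerically stable softplus(-margin) = log(1 + exp(-margin)).
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

# Hypothetical 2-dimensional embeddings.
user = [1.0, 0.0]
liked = [2.0, 0.0]
unseen = [0.0, 1.0]

correct_order = bpr_loss(user, liked, unseen)   # positive scored higher
wrong_order = bpr_loss(user, unseen, liked)     # positive scored lower
```

The loss is small when the interacted item is scored above the non-interacted one and grows roughly linearly with the margin when the order is reversed, which is why a mislabeled pair (a false positive or false negative) can dominate the gradient.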

To address these challenges, the authors propose Variational Bayesian Personalized Ranking (VarBPR), which reformulates pairwise learning as variational inference over latent indexing variables. This approach integrates preference alignment, noise reduction, and popularity debiasing under a unified evidence lower bound (ELBO) framework. VarBPR offers closed-form optimal posteriors and interpretable control over exposure patterns, allowing for enhanced long-tail exposure. The paper highlights the theoretical foundations of VarBPR, including generalization guarantees and the management of exposure-related opportunity costs, while also demonstrating its empirical effectiveness across various models, leading to improved ranking accuracy and scalability. Key contributions include the development of a unified variational framework, analytical inference mechanisms, and insights into exposure control, establishing VarBPR as a robust solution for contemporary recommender systems.

Methods

In this section, the authors outline the methodology employed to evaluate Variational BPR (VarBPR) through a series of comparative experiments. The evaluation framework is structured into six distinct categories of baseline methods, ensuring a comprehensive assessment of VarBPR’s performance in ranking optimization. The categories include: (i) Foundational Ranking Losses, featuring classic methods such as Bayesian Personalized Ranking (BPR) and the contrastive-learning-based InfoNCE; (ii) Improved Pairwise Losses, which encompass group-behavior-aware GBPR and cross-domain contrastive CPR; (iii) Noise-Reduction Methods, including denoising-oriented ADT and self-guiding SGDL; (iv) Statistical Debiasing Techniques, which involve causality-disentangling DCL, unbiased UBPR, and practical-debiasing PUPL; (v) Popularity Debiasing Approaches, represented by popularity-disentangled PopDCL and bias-combining BC Loss; and (vi) Hard Negative Mining Strategies, exemplified by hard contrastive loss HCL and adversarially-enhanced AdvInfoNCE.

This multi-faceted comparison framework is designed to elucidate the strengths and weaknesses of VarBPR in relation to existing methodologies, thereby providing a clear context for its performance in ranking tasks. The systematic evaluation aims to contribute to the understanding of how VarBPR fits within the broader landscape of ranking optimization techniques.

Discussion

The discussion section of the paper reviews advancements in recommender systems, focusing on pairwise learning and self-supervised learning (SSL). Early recommender systems primarily addressed rating prediction using explicit feedback, which proved inadequate for implicit-feedback scenarios, where user interactions do not convey satisfaction levels. The introduction of Bayesian Personalized Ranking (BPR) marked a significant shift towards ranking prediction, modeling user preferences by comparing interacted and non-interacted items. Despite various enhancements to BPR, such as Group Bayesian Personalized Ranking (GBPR) and Multi-Positive Ranking (MPR), challenges remain in managing noise from false positives and false negatives, which can undermine recommendation accuracy.

The paper also highlights the emergence of Self-Supervised Recommendation (SSR), which leverages SSL techniques, including generative and contrastive models, to enhance recommendation performance. BPR’s alignment with contrastive learning principles is emphasized, particularly its ability to automatically label items based on user interactions. However, the effectiveness of BPR and its extensions is constrained by issues like sparse supervision and exposure bias. The authors propose a variational approach to BPR (VarBPR) that incorporates latent variables to better capture noise and uncertainty in implicit feedback, thereby improving the robustness and accuracy of recommendations. This approach not only facilitates a more nuanced understanding of user preferences but also introduces mechanisms for exposure control, allowing for tailored recommendations based on user behavior and item characteristics.
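The link between BPR and contrastive learning can be illustrated with a small check: with a single negative and a temperature of 1, the InfoNCE loss reduces to the BPR pair loss. The sketch below works on raw scores and is an illustration of this general identity, not code from the paper; the function names are hypothetical.

```python
import math

def info_nce(pos_score, neg_scores, temperature=1.0):
    """InfoNCE loss: -log softmax probability of the positive among all candidates."""
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    m = max(logits)  # log-sum-exp with max subtraction for stability
    log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_norm - logits[0]

def bpr_pair_loss(pos_score, neg_score):
    """BPR loss for one (positive, negative) pair: -log sigmoid(pos - neg)."""
    return math.log1p(math.exp(-(pos_score - neg_score)))

# Hypothetical scores for one positive and two negatives.
single_negative = info_nce(1.3, [0.4])
two_negatives = info_nce(1.3, [0.4, 0.9])
pairwise = bpr_pair_loss(1.3, 0.4)
```

Here `single_negative` equals `pairwise` up to floating-point error, while adding further negatives only increases the InfoNCE loss for the same positive score.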