Semi-tensor product-based convolutional neural network

Journal: Science China Information Sciences, Volume: 69, Issue: 4
DOI: https://doi.org/10.1007/s11432-025-4734-4
Publication Date: 2026-03-10
Author(s): Daizhan Cheng et al.
Primary Topic: Tensor decomposition and applications

Overview

The paper introduces a convolutional neural network (CNN) framework that replaces the conventional inner product with the semi-tensor product (STP) of vectors, so that operations are defined between vectors of different dimensions. The framework incorporates a domain-based convolutional product (CP) that enables padding-free convolution, avoiding the zero or artificial padding that often causes boundary artifacts in traditional CNNs. The method is applied to image processing and third-order signal identification, showing that it can handle irregular, incomplete, and high-dimensional data without the distortions typically associated with padding.

In conclusion, the study establishes a mathematically rigorous foundation for STP-based convolution, which directly addresses the challenges posed by irregular data. The framework is intended to improve feature extraction in real-world scenarios and opens avenues for future research, including implementing backpropagation for training, large-scale empirical validation against existing models, and optimization for practical applications. These next steps aim to explore the training dynamics and efficiency of STP-CNNs, particularly in fields such as medical imaging and remote sensing, where complex data structures are common.

Introduction

The paper's introduction highlights the pivotal role of convolutional neural networks (CNNs) in advancing artificial intelligence, particularly in computer vision and natural language processing. Traditional CNNs excel at feature extraction through local weighted aggregation, but they struggle with irregular data, boundaries, and missing regions. Conventional methods often rely on explicit padding or masking, which can introduce artifacts and biases and ultimately degrade performance. Enhancements such as dilated and deformable convolutions have been proposed to address these limitations, yet they still operate within fixed-dimensional tensor spaces.
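The boundary effect of zero padding mentioned above can be seen in a small, self-contained sketch. The function name `conv2d_zero_pad` and the 4×4 test image are illustrative choices, not taken from the paper; the code assumes an odd-sized kernel and stride 1.

```python
import numpy as np

def conv2d_zero_pad(image, kernel):
    """Same-size 2D cross-correlation with zero padding (odd kernel, stride 1)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    # Surround the image with zeros so every pixel has a full kh x kw window.
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="constant")
    out = np.empty(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

# A constant image filtered with a 3x3 averaging kernel.
image = np.ones((4, 4))
kernel = np.full((3, 3), 1.0 / 9.0)
result = conv2d_zero_pad(image, kernel)
```

For this constant image, interior outputs stay at 1, while the corner value drops to 4/9 and the other edge values to 6/9, purely because padded zeros enter the average.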

The paper introduces a convolutional framework based on the semi-tensor product (STP) of vectors, which allows algebraically consistent operations across varying dimensions without padding or masking. The proposed STP-CNN framework operates in \( \mathbb{R}^\infty \), accommodating variable receptive field sizes and improving robustness to irregular or incomplete data. Key contributions include a padding-free convolution operator that computes cross-dimensional inner products over valid data, a unified linear-algebraic formulation compatible with existing deep learning libraries, and a scalable generalization to three-dimensional signals. The approach aims to preserve the mathematical integrity of convolution while improving adaptability to high-dimensional, irregular datasets, setting the stage for future empirical evaluation and integration into mainstream computational frameworks.
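A cross-dimensional inner product of this kind can be sketched in a few lines. The definition below is an assumption: it follows the form commonly used in Cheng's dimension-free framework (stretch both vectors to the least common multiple of their lengths with a Kronecker product against an all-ones vector, then take a normalized dot product). The paper's exact normalization may differ, and the name `cross_dim_inner` is hypothetical.

```python
import math
import numpy as np

def cross_dim_inner(x, y):
    """Assumed cross-dimensional inner product of x in R^m and y in R^n."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    m, n = x.size, y.size
    t = m * n // math.gcd(m, n)          # least common multiple of m and n
    # Repeat each entry so both vectors have length t.
    x_ext = np.kron(x, np.ones(t // m))
    y_ext = np.kron(y, np.ones(t // n))
    # Assumed normalization: divide by the common length t.
    return float(x_ext @ y_ext) / t

mixed = cross_dim_inner([1, 2], [1, 2, 3, 4])   # lengths 2 and 4
same = cross_dim_inner([1, 2], [3, 4])          # equal lengths
```

Here `mixed` evaluates to 4.25 and `same` to 5.5. Under this assumed normalization, the equal-dimension case is the ordinary dot product divided by the common dimension, so any comparison with a conventional inner product needs that scale factor taken into account.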

Discussion

In this section, the authors establish the algebraic foundation for the CNN framework using the semi-tensor product (STP) of vectors and a cross-dimensional inner product defined on the space \( \mathbb{R}^\infty \). The STP supports operations between vectors of different dimensions, which conventional inner products do not. Key definitions are provided, including addition, inner product, norm, and distance, which collectively form a topological vector space structure. The authors show that \( \mathbb{R}^\infty \) can be equipped with a topology induced by this distance, which leads to an equivalence relation and equivalence classes that further support the mathematical framework.
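For orientation, one set of definitions consistent with this description is sketched below. These formulas are an assumption drawn from the general dimension-free framework associated with Cheng's earlier work, not quoted from the paper, and the paper's notation or normalization may differ.

```latex
% Assumed definitions for x \in \mathbb{R}^m, y \in \mathbb{R}^n, with t = \operatorname{lcm}(m, n)
% and \mathbf{1}_k the all-ones column vector of length k.
\[
\begin{aligned}
x \pm y &:= (x \otimes \mathbf{1}_{t/m}) \pm (y \otimes \mathbf{1}_{t/n}) \in \mathbb{R}^{t}, \\
\langle x, y \rangle_{\mathcal{V}} &:= \tfrac{1}{t}\, \langle x \otimes \mathbf{1}_{t/m},\; y \otimes \mathbf{1}_{t/n} \rangle, \\
\|x\|_{\mathcal{V}} &:= \sqrt{\langle x, x \rangle_{\mathcal{V}}} = \tfrac{1}{\sqrt{m}}\, \|x\|_2, \\
d(x, y) &:= \|x - y\|_{\mathcal{V}}, \\
x \sim y &\iff d(x, y) = 0 .
\end{aligned}
\]
```

Under these assumed definitions, \( x = (1, 2) \) and \( y = (1, 1, 2, 2) \) satisfy \( x \otimes \mathbf{1}_2 = y \), so \( d(x, y) = 0 \) even though the two vectors are distinct. This is the kind of case that motivates passing to equivalence classes.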

The paper also introduces convolution products (CP) in both continuous and discrete settings, highlighting properties such as stability, commutativity, and associativity. The authors extend these concepts to image processing, detailing a matrix-based representation of convolution in CNNs that incorporates zero-padding and introduces the receptive field matrix (RFM) for efficient computation. In contrast, the proposed STP-based convolutional product (STP-CP) applies convolution directly to valid data entries, handling irregular and partially damaged images without zero-padding. This approach improves robustness to data irregularities and extends to higher-dimensional data, laying groundwork for future research on robust feature extraction and CNN training dynamics.
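The idea of convolving only over valid entries can be sketched as follows. This is not the paper's STP-CP definition. It is a minimal illustration under several assumptions: the receptive field is clipped to the image and to a validity mask, the surviving entries are flattened in row-major order, and they are combined with the flattened kernel through an assumed cross-dimensional inner product (normalized by the least common multiple of the two lengths). The names `cross_dim_inner` and `padding_free_conv2d`, the rescaling by the kernel size, and the odd-kernel restriction are all illustrative choices.

```python
import math
import numpy as np

def cross_dim_inner(x, y):
    """Assumed cross-dimensional inner product for 1-D arrays of any lengths."""
    m, n = x.size, y.size
    t = m * n // math.gcd(m, n)
    return float(np.kron(x, np.ones(t // m)) @ np.kron(y, np.ones(t // n))) / t

def padding_free_conv2d(image, kernel, valid=None):
    """Convolve using only in-domain, valid pixels (odd kernel, stride 1)."""
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    if valid is None:
        valid = np.ones(image.shape, dtype=bool)
    ph, pw = kernel.shape[0] // 2, kernel.shape[1] // 2
    k_vec = kernel.ravel()
    rows, cols = image.shape
    out = np.full((rows, cols), np.nan)   # NaN where no valid pixel is in reach
    for i in range(rows):
        for j in range(cols):
            # Clip the window to the image domain instead of padding.
            r0, r1 = max(i - ph, 0), min(i + ph + 1, rows)
            c0, c1 = max(j - pw, 0), min(j + pw + 1, cols)
            keep = valid[r0:r1, c0:c1]
            field = image[r0:r1, c0:c1][keep]   # valid entries only, row-major
            if field.size:
                # Rescale so full windows match the ordinary weighted sum.
                out[i, j] = k_vec.size * cross_dim_inner(k_vec, field)
    return out

# Constant image with one missing pixel, filtered with a 3x3 averaging kernel.
image = np.ones((4, 4))
image[1, 1] = np.nan
valid = ~np.isnan(image)
smoothed = padding_free_conv2d(image, np.full((3, 3), 1.0 / 9.0), valid)
```

With these choices the output stays at 1 everywhere, including corners and the neighbourhood of the missing pixel, whereas zero padding would pull the boundary values down. For full interior windows the result coincides with the ordinary weighted sum of the window. Note that stretching a truncated field against the kernel aligns entries by vector index rather than by spatial position, which is a property of this sketch and not a claim about the paper's operator.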