Architecture

Sparse Model

Definition

A model architecture that activates only a subset of its parameters for any given input, rather than the full network. Sparse models, most commonly built with Mixture of Experts designs, reach a larger total capacity while keeping the compute per input proportional to the parameters that are actually activated rather than to the total parameter count.
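To make the gap between total and active parameters concrete, here is a minimal sketch. The function name and all the numbers are hypothetical and chosen only for illustration; they do not describe any real model.

```python
def moe_param_counts(shared_params, params_per_expert, num_experts, experts_per_token):
    """Return (total, active) parameter counts for a hypothetical sparse model.

    shared_params: parameters every input uses (e.g. attention, embeddings).
    params_per_expert: parameters in one expert sub-network.
    experts_per_token: how many experts are activated for a single input.
    """
    if not 1 <= experts_per_token <= num_experts:
        raise ValueError("experts_per_token must be between 1 and num_experts")
    total = shared_params + num_experts * params_per_expert
    active = shared_params + experts_per_token * params_per_expert
    return total, active


# Illustrative numbers only (assumed, not taken from a real model).
total, active = moe_param_counts(
    shared_params=2_000_000_000,
    params_per_expert=5_000_000_000,
    num_experts=8,
    experts_per_token=2,
)
print(f"total={total:,} active={active:,} active fraction={active / total:.2%}")
```

With these assumed values the model stores 42 billion parameters but uses 12 billion of them for any single input.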

Related Terms

Mixture of Experts (MoE)

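A model architecture in which different sub-networks ("experts") specialise in different types of inputs, and a gating network routes each token to the most relevant experts. Because only the selected experts run for each token, MoE offers very large model capacity at a lower inference cost than a dense model of the same total size. Mixtral is a published example, and GPT-4 is widely reported, though not officially confirmed, to use this approach.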
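The routing step can be sketched in a few lines. This is a simplified illustration of top-k gating, assuming the gate logits are already computed and that each expert is a plain function; production MoE layers add batching, load balancing, and learned gate weights. All names here are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def route_token(gate_logits, k=2):
    """Pick the top-k experts for one token and renormalise their gate weights."""
    probs = softmax(gate_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_layer(token, gate_logits, experts, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    return sum(w * experts[i](token) for i, w in route_token(gate_logits, k))


# Toy experts: each one just scales its input by a different factor.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
gate_logits = [0.1, 2.0, -1.0, 1.0]  # assumed output of a gating network

routes = route_token(gate_logits, k=2)
output = moe_layer(3.0, gate_logits, experts, k=2)
print(routes, output)
```

Only two of the four toy experts are evaluated for this token, which is the source of the inference savings described above.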

Model Compression

A set of techniques, including quantization, distillation, pruning, and low-rank factorisation, that reduce model size and computational requirements while preserving most of the original model's performance. Model compression is essential for deploying powerful models on edge hardware or within cost budgets.
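As one example of these techniques, the sketch below shows symmetric 8-bit quantization of a small weight list. It is a simplified illustration with hypothetical function names and made-up weights; real toolkits typically quantize per channel or per group and calibrate on data.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using a single shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the stored integers."""
    return [v * scale for v in q]


weights = [0.42, -1.27, 0.05, 0.9]  # made-up values for illustration
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight is stored in one byte instead of four, at the price of a rounding error of at most half the scale per weight.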

Inference

The process of running a trained model on new data to produce predictions or generated outputs. Inference cost and latency are the dominant operational concerns in production AI, particularly for large generative models, where a single request can cost cents and the total grows quickly with request volume.
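A back-of-the-envelope calculation shows how per-request cost accumulates. The token counts, prices, and request volume below are assumptions picked for illustration, not any provider's actual pricing.

```python
def request_cost_usd(input_tokens, output_tokens, usd_per_1k_input, usd_per_1k_output):
    """Estimate the cost of one request under simple per-token pricing."""
    return (input_tokens / 1000) * usd_per_1k_input + (output_tokens / 1000) * usd_per_1k_output


# Hypothetical workload and prices (assumed values).
cost = request_cost_usd(
    input_tokens=1500,
    output_tokens=500,
    usd_per_1k_input=0.01,
    usd_per_1k_output=0.03,
)
monthly_requests = 1_000_000
monthly_cost = cost * monthly_requests
print(f"per request: ${cost:.4f}, per month: ${monthly_cost:,.2f}")
```

Under these assumptions a request costs about 3 cents, which comes to roughly 30,000 USD for one million requests per month.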

Need help understanding AI?

Book a Physical AI fit-assessment call to discuss how these AI concepts apply to your industry and your challenges.