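A model architecture where different sub-networks ("experts") specialise in different types of inputs, and a gating network routes each token to the most relevant experts. Because only a few experts run for any given token, MoE enables very large total model capacity at a lower inference cost per token than a dense model of the same size. Mixtral is a published example, and GPT-4 is widely reported, though not officially confirmed, to use this approach.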
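The routing step can be illustrated with a toy sketch. This is a minimal illustration under stated assumptions, not how any particular model implements it: the function names (`route_token`, `moe_layer`), the scalar "experts", and the hard-coded gate scores are all hypothetical, and renormalising with a softmax over only the top-k scores is one common choice among several.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_token(gate_scores, k=2):
    """Pick the k highest-scoring experts and return (expert_index, weight) pairs.

    Weights are a softmax over the selected scores only, so they sum to 1.
    """
    top = sorted(range(len(gate_scores)),
                 key=lambda i: gate_scores[i], reverse=True)[:k]
    weights = softmax([gate_scores[i] for i in top])
    return list(zip(top, weights))

def moe_layer(x, gate_scores, experts, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    return sum(w * experts[i](x) for i, w in route_token(gate_scores, k))

# Hypothetical toy experts: real experts are feed-forward sub-networks.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: -x, lambda x: x * x]

# Assumed gate scores for one token; a real gating network computes these.
gate_scores = [0.1, 2.0, -1.0, 2.0]

routing = route_token(gate_scores, k=2)
output = moe_layer(3.0, gate_scores, experts, k=2)
```

Here four experts exist but only two are evaluated for the token, which is where the inference savings come from: parameter count grows with the number of experts, while per-token compute grows only with k.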