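A decoding strategy, also called nucleus sampling, that restricts the model's next-token choices to the smallest set of tokens whose cumulative probability reaches or exceeds a threshold p, then renormalizes the probabilities in that set and samples from it. A lower p narrows the set and makes output more predictable; a p of 1.0 leaves the full distribution unchanged. Used alongside temperature, top-p sampling balances output diversity and coherence in production LLM deployments.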
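As a rough illustration of the definition above, here is a minimal sketch in plain Python. The function names `top_p_filter` and `sample_top_p` are hypothetical, and the sketch assumes the input is already a normalized probability distribution (for example, a softmax over temperature-scaled logits). It stops at "reaches or exceeds p"; some implementations handle the boundary slightly differently.

```python
import random


def top_p_filter(probs, p):
    """Return {token_index: renormalized_prob} for the smallest set of
    tokens whose cumulative probability reaches the threshold p.

    Hypothetical helper for illustration; assumes probs sums to 1.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    # Rank token indices from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:  # threshold reached: the set is complete
            break
    # Renormalize so the kept probabilities sum to 1 again.
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


def sample_top_p(probs, p, rng=random):
    """Draw one token index from the top-p set (hypothetical helper)."""
    nucleus = top_p_filter(probs, p)
    tokens = list(nucleus)
    weights = [nucleus[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


if __name__ == "__main__":
    probs = [0.5, 0.3, 0.15, 0.05]  # toy distribution over four tokens
    print(top_p_filter(probs, 0.9))  # keeps tokens 0, 1 and 2
    print(sample_top_p(probs, 0.9))
```

With the toy distribution, a threshold of 0.9 keeps the three most probable tokens (cumulative 0.95) and drops the last one, so the lowest-probability token can never be sampled.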