Basic Concepts

Reinforcement Learning from Human Feedback (RLHF)

Definition

A training technique that refines language model behaviour by learning from human preferences rather than fixed labels. RLHF is a primary method used to align LLMs like ChatGPT and Claude with desired values and reduce harmful outputs.
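As a rough illustration of what "learning from human preferences" can mean in practice, the sketch below scores a single human comparison with a Bradley-Terry style pairwise loss, a common choice for training reward models. It is a minimal sketch under stated assumptions, not a description of how any particular product is trained: the function name `preference_loss` and the scalar reward inputs are hypothetical, and a real system would compute the rewards with a neural network.

```python
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise loss -log(sigmoid(r_chosen - r_rejected)).

    Hypothetical helper: the two rewards stand in for a reward model's
    scores on the response a human preferred and the one they rejected.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The loss is small when the preferred response already scores higher,
# and large when the reward model ranks the pair the wrong way round.
agrees = preference_loss(2.0, 0.0)
disagrees = preference_loss(0.0, 2.0)
print(agrees < disagrees)
```

Minimising this loss over many human comparisons pushes the reward model to rank responses the way people do; that learned reward can then serve as the signal for a reinforcement learning step.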

Related Terms

Reinforcement Learning (RL)

A machine learning paradigm where an agent learns by interacting with an environment and receiving reward or penalty signals. RL is the foundation of game-playing systems like AlphaGo and increasingly powers robotics, logistics optimisation, and dynamic pricing.
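The agent, environment, and reward loop described above can be sketched with a multi-armed bandit, one of the simplest reinforcement learning settings. This is an illustrative sketch only: the function `run_bandit`, its parameters, and the Gaussian reward model are assumptions made for the example, and epsilon-greedy is just one of many exploration strategies.

```python
import random


def run_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a bandit whose arms pay noisy rewards.

    Hypothetical setup: each arm pays a Gaussian reward with unit
    standard deviation around its entry in true_means.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how often each arm was pulled
    estimates = [0.0] * n_arms   # running mean reward per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick a random arm
        else:
            # exploit: pick the arm with the best estimate so far
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = rng.gauss(true_means[arm], 1.0)  # environment responds
        counts[arm] += 1
        # incremental update of the running mean for the pulled arm
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts


estimates, counts = run_bandit([0.0, 2.0, 0.5], steps=5000)
print(counts)
```

The agent is never told which arm is best. It finds out by trying actions and observing rewards, which is the core idea that larger systems build on with richer states and longer time horizons.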

Constitutional AI

A training methodology developed by Anthropic in which an AI model is guided by a written set of principles (a "constitution") to self-critique and revise its outputs. Constitutional AI is one approach to building safer, more controllable AI systems at scale.

AI Alignment

The research and engineering discipline of ensuring that AI systems pursue goals and exhibit behaviours that match human intentions and values. Misalignment risks range from models following instructions too literally to more speculative long-term risks discussed in AI safety literature.

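Understanding the terms is only the first step; putting them into practice is the second.

Book a Physical AI fit consultation to explore how these AI concepts translate to your specific industry and business challenges.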
