The process of splitting raw text into smaller units called tokens (typically sub-words or word pieces) that serve as the basic input units for language models. Because token count is measured against a model's context limit and drives API pricing, tokenization is an important operational consideration.
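To make this concrete, here is a minimal sketch using OpenAI's open-source tiktoken library, one tokenizer among many (the library choice and the sample sentence are illustrative assumptions; other model families use different tokenizers, such as SentencePiece):

```python
# Minimal tokenization sketch with tiktoken (assumes: pip install tiktoken).
import tiktoken

# cl100k_base is the encoding used by GPT-4 and GPT-3.5-turbo era models.
enc = tiktoken.get_encoding("cl100k_base")

text = "Tokenization splits raw text into sub-word units."
tokens = enc.encode(text)

print(len(tokens))                        # token count: what context limits and pricing are measured in
print(tokens)                             # the integer token IDs the model actually consumes
print([enc.decode([t]) for t in tokens])  # the sub-word strings each ID maps to
```

Running a snippet like this shows why token count, not character or word count, is the number that matters: a short English sentence may compress to a handful of tokens, while rare words, code, or non-English text often split into many more.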