What Is a Token (LLM)?

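A token is the unit of text a language model actually processes: a whole word, a piece of a word, a punctuation mark, or whitespace. A tokenizer splits the input into these units rather than into plain words, so one word can become multiple tokens, and punctuation, numbers, and spaces are handled differently depending on the tokenizer.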
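
To make the splitting concrete, here is a minimal sketch of a greedy longest-match tokenizer. It is a toy, not the algorithm of any particular model: the vocabulary `TOY_VOCAB` and the function `toy_tokenize` are hypothetical names invented for this illustration, and real tokenizers use learned vocabularies with tens of thousands of entries.

```python
# Hypothetical toy vocabulary, chosen only for illustration.
TOY_VOCAB = {"token", "iz", "ation", "un", "believ", "able", " ", ",", "."}


def toy_tokenize(text: str, vocab: set[str] = TOY_VOCAB) -> list[str]:
    """Greedy longest-match split; unknown characters become one-character tokens."""
    tokens = []
    max_len = max(len(piece) for piece in vocab)
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink down to a single character.
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in vocab or length == 1:
                tokens.append(piece)
                i += length
                break
    return tokens


print(toy_tokenize("tokenization"))          # ['token', 'iz', 'ation']
print(toy_tokenize("unbelievable, token."))  # ['un', 'believ', 'able', ',', ' ', 'token', '.']
```

One word yields three tokens here, and the comma, space, and period each count as a token of their own. A different vocabulary would split the same text differently.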

In LLM services, pricing, context limits, and speed are typically measured in tokens. The user input, system instruction, retrieved documents, and model answer all count against the same token budget.
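
The shared budget can be sketched as simple arithmetic. The example below assumes a service that bills input and output tokens at separate rates, which is an assumption and not something every provider does; the class `RequestTokens`, the helper functions, and all prices and limits are hypothetical values for illustration only.

```python
from dataclasses import dataclass


@dataclass
class RequestTokens:
    """Token counts for the parts of one request (hypothetical structure)."""
    system: int
    user: int
    retrieved: int
    answer: int

    @property
    def input_total(self) -> int:
        return self.system + self.user + self.retrieved

    @property
    def total(self) -> int:
        return self.input_total + self.answer


def estimate_cost(req: RequestTokens, input_price_per_1k: float,
                  output_price_per_1k: float) -> float:
    """Cost under assumed per-1,000-token prices for input and output."""
    return (req.input_total / 1000 * input_price_per_1k
            + req.answer / 1000 * output_price_per_1k)


def fits_context(req: RequestTokens, context_limit: int) -> bool:
    """True when every part of the request fits in the context window together."""
    return req.total <= context_limit


req = RequestTokens(system=200, user=50, retrieved=1500, answer=250)
print(req.total)                                   # 2000
print(fits_context(req, context_limit=4096))       # True
print(estimate_cost(req, input_price_per_1k=0.5, output_price_per_1k=1.5))
```

In this made-up request, the retrieved documents account for 1,500 of the 2,000 tokens, which is why document insertion is usually the first place to look when a budget is tight.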

Why It Matters

A context window has a fixed token capacity. Sending a very long conversation history or unnecessary documents can leave no room for the evidence that matters. Short, well-structured prompts can reduce both cost and latency.
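
One common way to keep history from crowding out evidence is to keep only the most recent messages that fit a budget. The following is a minimal sketch under stated assumptions: `trim_history` is a hypothetical helper, and `approx_tokens` counts whitespace-separated words as a crude stand-in for a real tokenizer, so its numbers will not match any actual model.

```python
from typing import Callable


def approx_tokens(text: str) -> int:
    """Crude stand-in: counts whitespace-separated words, not real tokens."""
    return len(text.split())


def trim_history(messages: list[str], budget: int,
                 count: Callable[[str], int] = approx_tokens) -> list[str]:
    """Keep the most recent messages whose combined count fits within budget."""
    kept = []
    used = 0
    # Walk from newest to oldest and stop at the first message that does not fit.
    for message in reversed(messages):
        cost = count(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = ["first question here please", "short reply", "latest user message"]
print(trim_history(history, budget=5))  # ['short reply', 'latest user message']
```

In practice the `count` argument would be replaced with the tokenizer of the model actually in use, since word counts can differ substantially from token counts.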

Token accounting matters most in RAG, bulk document summarization, customer support assistants, and API-based content generation. Because tokenizers differ between models and split languages differently, the same task may use a different number of tokens with another model or in another language.

Business Use

Production systems should monitor token usage. Unexpected cost increases can come from overly long system prompts, repeated conversation history, or uncontrolled document insertion.
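
A monitoring hook can be as simple as comparing each request to a rolling average. This is a sketch, not a production design: `TokenUsageMonitor`, its window size, and the `spike_factor` threshold are all hypothetical choices made for illustration.

```python
from collections import deque


class TokenUsageMonitor:
    """Flags requests whose token count far exceeds the recent average."""

    def __init__(self, window: int = 100, spike_factor: float = 2.0):
        self.recent = deque(maxlen=window)  # keeps only the last `window` counts
        self.spike_factor = spike_factor

    def record(self, tokens: int) -> bool:
        """Record one request; return True if it looks like a spike."""
        is_spike = False
        if self.recent:
            average = sum(self.recent) / len(self.recent)
            is_spike = tokens > average * self.spike_factor
        self.recent.append(tokens)
        return is_spike


monitor = TokenUsageMonitor(window=3)
for used in (100, 120, 110, 500):
    if monitor.record(used):
        print(f"spike: {used} tokens")  # prints once, for the 500-token request
```

A flagged request is a prompt to inspect what was sent: a system prompt that grew, history that was never trimmed, or a retrieval step that inserted too many documents.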

Good design weighs summarization, chunk selection, answer length limits, and model choice together, since each one trades token cost against answer quality.
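
Chunk selection and answer length limits interact directly: tokens reserved for the answer are tokens not available for documents. The sketch below assumes chunks arrive with a relevance score and a precomputed token count; `select_chunks` and all numbers are hypothetical, and the greedy strategy is one simple option among several.

```python
def select_chunks(chunks: list[tuple[float, int, str]], context_limit: int,
                  prompt_tokens: int, max_answer_tokens: int) -> list[str]:
    """Pick the highest-scoring chunks that fit after reserving room for the answer.

    Each chunk is (relevance_score, token_count, text).
    """
    budget = context_limit - prompt_tokens - max_answer_tokens
    selected = []
    # Greedy: consider chunks from most to least relevant, skipping any that do not fit.
    for score, token_count, text in sorted(chunks, key=lambda c: c[0], reverse=True):
        if token_count <= budget:
            selected.append(text)
            budget -= token_count
    return selected


chunks = [(0.9, 300, "A"), (0.8, 500, "B"), (0.5, 150, "C")]
print(select_chunks(chunks, context_limit=1000, prompt_tokens=200,
                    max_answer_tokens=300))  # ['A', 'C']
```

Here chunk B is more relevant than C but too large for what remains after A, so the smaller chunk is used instead. Raising `max_answer_tokens` shrinks the document budget further, which is why these settings are best tuned together.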
