What Is a Context Window?

A context window is the maximum amount of text a model can process in a single request. The system instruction, user message, conversation history, retrieved documents, and generated answer all share this capacity.

This capacity is measured in tokens, the units of text a model reads and writes, which are often shorter than a word. A larger context window leaves room for more documents or conversation history, but it does not automatically produce a better answer. Too much irrelevant content can cause the model to overlook the evidence that matters.
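
Because every part of the request draws on the same capacity, the space left for documents or history is simple subtraction. The sketch below is illustrative only: rough_token_count is a hypothetical helper that counts whitespace-separated words as a stand-in for a real tokenizer, and the window size and reserved answer space are made-up numbers.

```python
def rough_token_count(text: str) -> int:
    # Assumption: whitespace-separated words stand in for tokens.
    # Real tokenizers split text differently, so treat this as an estimate.
    return len(text.split())


def remaining_budget(context_window: int, reserved_for_answer: int, parts: list[str]) -> int:
    """Tokens still free after the prompt parts and the reserved answer space."""
    used = sum(rough_token_count(part) for part in parts)
    return context_window - reserved_for_answer - used


parts = [
    "You are a support assistant.",  # system instruction
    "How do I reset my password?",   # user message
]

# Illustrative numbers only: a 100-token window with 30 tokens kept for the answer.
free = remaining_budget(context_window=100, reserved_for_answer=30, parts=parts)
print(free)  # prints 59
```

Reserving answer space up front is the key design choice here: if the prompt fills the whole window, the model has no room left to respond.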

How It Is Used

In RAG systems, the most relevant document passages are placed into the context window. That makes semantic search, chunking, and reranking quality important. If wrong or unnecessary sources fill the window, the model may struggle to answer correctly.
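
One simple way to fill the window in a RAG system is to sort passages by relevance and add them until the token budget runs out. The following is a minimal sketch under stated assumptions, not a reference implementation: Passage, its score field, select_passages, and the min_score threshold are all hypothetical names, the scores would come from whatever retriever or reranker is in use, and token counts are approximated by word counts.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    text: str
    score: float  # hypothetical relevance score from a retriever or reranker


def rough_token_count(text: str) -> int:
    # Assumption: word count approximates token count.
    return len(text.split())


def select_passages(passages: list[Passage], budget: int, min_score: float = 0.0) -> list[Passage]:
    """Greedily pick the highest-scoring passages that fit in the budget."""
    chosen: list[Passage] = []
    used = 0
    for passage in sorted(passages, key=lambda p: p.score, reverse=True):
        if passage.score < min_score:
            break  # list is sorted, so everything after this is also too weak
        cost = rough_token_count(passage.text)
        if used + cost > budget:
            continue  # too long for the remaining space; a shorter one may still fit
        chosen.append(passage)
        used += cost
    return chosen


candidates = [
    Passage("Refunds are issued within 14 days of purchase.", 0.92),
    Passage("Our office is closed on public holidays.", 0.15),
    Passage("To request a refund, open a ticket with your order number.", 0.88),
]

selected = select_passages(candidates, budget=20, min_score=0.5)
for passage in selected:
    print(passage.score, passage.text)
```

With these sample values, the two refund passages are kept and the unrelated office-hours passage is dropped by the score threshold, so it never takes space from relevant evidence.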

In chat applications, teams often summarize older messages, keep only important events, or rely on a short-term memory of recent turns instead of sending the full conversation every time.
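
The recent-turns approach can be sketched as a function that keeps the newest messages that fit a budget and hands the rest back for summarization. This is an assumed, simplified design: trim_history is a hypothetical helper, word counts stand in for token counts, and the summarization step itself is left out because it depends on the model being used.

```python
def rough_token_count(text: str) -> int:
    # Assumption: word count approximates token count.
    return len(text.split())


def trim_history(messages: list[str], budget: int) -> tuple[list[str], list[str]]:
    """Split history into (older messages to summarize, recent messages to send)."""
    kept: list[str] = []
    used = 0
    # Walk backwards so the newest messages claim the budget first.
    for message in reversed(messages):
        cost = rough_token_count(message)
        if used + cost > budget:
            break  # stop here so the kept messages stay contiguous
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    older = messages[: len(messages) - len(kept)]
    return older, kept


history = [
    "user: My invoice shows the wrong address.",
    "assistant: I can update it. What is the correct address?",
    "user: 12 Harbor Street, Lisbon.",
    "assistant: Done. Anything else?",
]

older, recent = trim_history(history, budget=10)
print(recent)  # the last two messages
print(older)   # the first two messages, candidates for a summary
```

Stopping at the first message that does not fit, rather than skipping it, keeps the retained turns contiguous so the model never sees a reply without the message it answers.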

Business Use

The context window affects cost, latency, and quality together. In legal document analysis, technical support assistants, or quote preparation, fitting the right sources into the available space becomes critical.

A large window is not a reason to dump unplanned data into the prompt. Production systems need a clear policy for what enters the context, what stays out, and how sources are prioritized.
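
Such a policy can be as small as an ordered list of sources with a cap on each. The sketch below is one possible shape, not a standard: the source names, caps, and the allocate function are all hypothetical, and the token figures are illustrative.

```python
# Hypothetical policy: sources listed from highest to lowest priority,
# each with a cap on how many tokens it may take.
POLICY = [
    ("system_instruction", 200),
    ("user_message", 500),
    ("retrieved_passages", 2000),
    ("conversation_summary", 300),
]


def allocate(policy: list[tuple[str, int]], requested: dict[str, int], budget: int) -> dict[str, int]:
    """Grant each source min(requested, cap, what is left), in priority order."""
    granted: dict[str, int] = {}
    remaining = budget
    for source, cap in policy:
        grant = min(requested.get(source, 0), cap, remaining)
        granted[source] = grant
        remaining -= grant
    return granted  # sources not named in the policy never enter the context


requested = {
    "system_instruction": 150,
    "user_message": 80,
    "retrieved_passages": 3500,
    "conversation_summary": 400,
}

grants = allocate(POLICY, requested, budget=2400)
print(grants)
```

With these sample numbers, the retrieved passages are cut to their 2000-token cap and the conversation summary receives only the 170 tokens that remain, which makes the trade-off between sources explicit instead of accidental.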