Tech Abstractions
Agentic AI·Medium

Design a Context Management System for LLM Agents

Asked at Anthropic, OpenAI, LangChain

Design a context management system for an LLM-powered agent that must operate within a limited context window while drawing on a large and diverse set of information sources: system prompts, user conversation history, retrieved documents, tool outputs, agent scratchpad notes, and task instructions. For each interaction, the system must decide dynamically what to include, what to compress, and what to exclude.

Constraints

  • LLM context window: 100K tokens (the model uses information in the middle of the window less reliably than information near the start or end)
  • System prompt: 2K tokens (fixed, always included)
  • Conversation history: can grow to 200K+ tokens in long sessions
  • Retrieved documents: 10-50 documents per query, average 1K tokens each
  • Tool outputs: can range from 50 tokens to 10K tokens per call
  • Target: assemble context in under 200ms per agent step
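
The constraints above imply a fixed arithmetic: the 2K system prompt is always present, and everything else competes for what remains. The following is a minimal sketch of one possible split, not a prescribed answer. The output reserve and the percentage shares are illustrative assumptions that do not come from the problem statement; only the 100K window and the 2K system prompt do.

```python
# Figures taken from the constraints.
WINDOW_TOKENS = 100_000
SYSTEM_PROMPT_TOKENS = 2_000

# Assumption: space held back for the model's reply (not given in the problem).
OUTPUT_RESERVE_TOKENS = 4_000

# Assumption: illustrative percentage shares of the remaining window.
SHARES_PCT = {
    "history": 35,
    "documents": 40,
    "tool_outputs": 20,
    "scratchpad": 5,
}


def allocate_budget(window=WINDOW_TOKENS,
                    system_prompt=SYSTEM_PROMPT_TOKENS,
                    output_reserve=OUTPUT_RESERVE_TOKENS,
                    shares_pct=SHARES_PCT):
    """Split the flexible part of the window across sources by percentage."""
    if sum(shares_pct.values()) != 100:
        raise ValueError("shares must sum to 100 percent")
    flexible = window - system_prompt - output_reserve
    # Integer arithmetic avoids floating-point rounding surprises.
    return {name: flexible * pct // 100 for name, pct in shares_pct.items()}


budget = allocate_budget()
for source, tokens in budget.items():
    print(f"{source:>12}: {tokens:,} tokens")
```

Under these assumed shares, a 200K-token conversation history would need to shrink by roughly a factor of six to fit its slice, which is why history compression gets its own design requirement.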

Design Requirements

  1. Design the context budget allocation: how much of the window goes to each source?
  2. Design the dynamic context assembly pipeline that runs before each LLM call.
  3. Explain your strategy for compressing conversation history without losing critical information.
  4. Design the tool output integration: how do you decide which tool outputs are relevant enough to include?
  5. Address what happens when the context overflows, including graceful degradation strategies.
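
To make requirements 2, 4, and 5 concrete, here is a minimal sketch of one possible assembly step: pinned items are always kept, the rest are packed greedily by relevance, an item that does not fit is truncated if enough room remains, and anything else is dropped. Every name here (`Item`, `count_tokens`, `truncate`, `assemble`, `MIN_USEFUL_TOKENS`) is hypothetical. The four-characters-per-token estimate and head truncation are crude stand-ins for a real tokenizer and a real summarizer, and relevance scores are assumed to come from an upstream scorer.

```python
from dataclasses import dataclass

# Assumption: below this size a truncated fragment is not worth including.
MIN_USEFUL_TOKENS = 20


@dataclass
class Item:
    source: str           # e.g. "system", "history", "document", "tool"
    text: str
    relevance: float      # 0..1, assumed to be supplied by an upstream scorer
    pinned: bool = False  # pinned items (system prompt, task) are never dropped


def count_tokens(text: str) -> int:
    # Rough heuristic (about 4 characters per token), not a real tokenizer.
    return max(1, len(text) // 4)


def truncate(text: str, max_tokens: int) -> str:
    # Crude stand-in for summarization: keep only the head of the text.
    return text[: max_tokens * 4]


def assemble(items, budget):
    """Return (ordered context items, tokens used) within the token budget."""
    pinned = [i for i in items if i.pinned]
    rest = sorted((i for i in items if not i.pinned),
                  key=lambda i: i.relevance, reverse=True)

    used = sum(count_tokens(i.text) for i in pinned)
    if used > budget:
        raise ValueError("pinned items alone exceed the budget")

    chosen = []
    for item in rest:
        remaining = budget - used
        cost = count_tokens(item.text)
        if cost <= remaining:
            chosen.append(item)
            used += cost
        elif remaining >= MIN_USEFUL_TOKENS:
            # Degrade gracefully: include a shortened version instead of nothing.
            short = truncate(item.text, remaining)
            chosen.append(Item(item.source, short, item.relevance))
            used += count_tokens(short)
        # Otherwise the item is dropped for this step.

    # Place the most relevant items last, so the strongest material sits
    # at the edges of the window rather than in the middle.
    chosen.sort(key=lambda i: i.relevance)
    return pinned + chosen, used
```

A real design would likely replace the single global budget with per-source budgets and swap truncation for summarization, but the shape of the loop (pin, rank, pack, degrade, order) stays the same.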
