Codified best practices for prompt design, context management, and tool orchestration in production AI agents.
Engineering blog post. Defined context engineering as managing the entire context state (system instructions, tools, MCP servers, external data, message history) for multi-turn agents. Detailed Claude Code's hybrid approach: CLAUDE.md loaded upfront plus just-in-time retrieval via glob and grep.
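The hybrid approach above can be sketched minimally: one small always-loaded file, plus tools that fetch only matching paths or lines on demand instead of preloading file contents. The function names and hit format here are illustrative assumptions, not Claude Code's actual implementation.

```python
from pathlib import Path
import re

def load_upfront_context(repo_root: str) -> str:
    """Load the small, always-relevant file (CLAUDE.md) at session start."""
    claude_md = Path(repo_root) / "CLAUDE.md"
    return claude_md.read_text() if claude_md.exists() else ""

def glob_tool(repo_root: str, pattern: str) -> list[str]:
    """Just-in-time: return matching file paths, not their contents."""
    return [str(p) for p in Path(repo_root).rglob(pattern)]

def grep_tool(repo_root: str, pattern: str, file_glob: str = "*.py") -> list[str]:
    """Just-in-time: return only the lines that match, not whole files."""
    rx = re.compile(pattern)
    hits = []
    for path in Path(repo_root).rglob(file_glob):
        for i, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
            if rx.search(line):
                hits.append(f"{path}:{i}: {line.strip()}")
    return hits
```

The point of the split: the upfront file costs a fixed, small number of tokens every turn, while glob/grep results cost tokens only when the agent actually asks for them.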
A paradigm shift from "prompt engineering" (writing better individual prompts) to "context engineering" (managing the entire information ecosystem around an agent). Prompt engineering optimizes single interactions. Context engineering optimizes what an agent knows across multi-hour sessions, what tools it can access, what external data it can query, and how information flows through the system.
Strategically allocating limited tokens across system instructions, task descriptions, tool definitions, few-shot examples, and conversation history. Too much context adds noise and distraction (the model attends to irrelevant information); too little omits critical information. Effective agents optimize the signal-to-noise ratio of their context window.
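One simple way to operationalize this is a greedy budget fill: include components in priority order and skip anything that would overflow the window. The priority ordering and the whitespace word count (standing in for a real tokenizer) are both assumptions for illustration.

```python
def assemble_context(components: dict[str, str], budget: int) -> list[str]:
    """Greedily pick context components, highest priority first,
    skipping any whose cost would exceed the remaining budget.
    Token counts are approximated by word counts (an assumption)."""
    priority = ["system", "tools", "task", "examples", "history"]
    chosen, used = [], 0
    for name in priority:
        text = components.get(name, "")
        cost = len(text.split())
        if text and used + cost <= budget:
            chosen.append(name)
            used += cost
    return chosen
```

A real system would truncate the history component (keeping the most recent turns) rather than drop it outright, but the budget discipline is the same.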
Crafting the foundational instructions that guide agent behavior: defining goals, role, constraints, and decision-making principles. A good system prompt provides guidance without micromanaging, sets appropriate risk tolerance, and establishes the agent's values. System prompts for agents differ from chat prompts because agents make autonomous decisions with real consequences.
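A hypothetical agent system prompt illustrating that structure: goals, constraints with explicit risk tolerance, and decision principles, without step-by-step micromanagement. The service name and rules are invented for the example.

```python
# Illustrative only: a system prompt for a hypothetical code-review agent.
AGENT_SYSTEM_PROMPT = """\
You are a code-review agent for the `payments` service.

Goals:
- Find correctness and security issues; prefer small, safe suggestions.

Constraints (risk tolerance):
- Never push to main; open a draft PR instead.
- Any change touching billing logic requires human sign-off.

Decision principles:
- When uncertain, gather more context with your search tools before acting.
- State your confidence alongside each finding.
"""
```

Note what is absent: no prescribed sequence of steps. The prompt sets boundaries and values, then leaves tactics to the agent.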
Intelligently choosing which examples to provide to the agent based on the current task. Rather than providing all possible examples (which wastes tokens), effective agents dynamically select examples relevant to the current problem, improving performance while conserving context budget.
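A minimal sketch of dynamic selection: score every candidate example against the current task and keep only the top k. Jaccard overlap on word sets is a deliberately crude stand-in for the embedding similarity a production system would likely use.

```python
def select_examples(task: str, examples: list[str], k: int = 2) -> list[str]:
    """Return the k examples most lexically similar to the task,
    using Jaccard overlap of lowercase word sets as the score."""
    task_words = set(task.lower().split())

    def score(ex: str) -> float:
        ex_words = set(ex.lower().split())
        if not task_words or not ex_words:
            return 0.0
        return len(task_words & ex_words) / len(task_words | ex_words)

    return sorted(examples, key=score, reverse=True)[:k]
```

Swapping the scorer for cosine similarity over embeddings changes nothing structurally: the token savings come from sending k relevant examples instead of the full library.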
Understanding and optimizing the financial cost of context. Extended context is expensive: each token in your context costs money on every request. Effective context engineering balances quality (having the right information) against cost (the tokens required to include it), treating tokens as a scarce resource to be optimized.
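The "costs money on every request" point is worth making concrete: a static context prefix is re-sent on every turn, so its cost scales with session length. The $3 per million input tokens price below is an illustrative assumption, not a quoted rate.

```python
def session_context_cost(context_tokens: int, turns: int,
                         usd_per_mtok: float = 3.0) -> float:
    """Dollar cost of re-sending a static context prefix of
    `context_tokens` tokens on each of `turns` requests, at an
    assumed input price of `usd_per_mtok` dollars per million tokens."""
    return context_tokens * turns * usd_per_mtok / 1_000_000
```

At the assumed rate, a 10k-token prefix over a 50-turn session costs $1.50; trimming that prefix to 2k tokens cuts the same session to $0.30, which is why token budgets compound into real savings.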