What is Context Engineering?
The discipline of designing, building, and maintaining the structured context that AI agents need to make reliable decisions in production.
Context engineering is to AI agents what data engineering is to analytics: the infrastructure that determines whether outputs are reliable or random.
The problem context engineering solves
Every AI agent operates within a context window — the information available when it makes a decision. Most teams treat this window as a prompt. They optimize the instruction and hope the model figures out the rest.
In demos, this works. In production, it collapses. The agent encounters stale data, contradictory rules, expired permissions, and decisions it made three steps ago that it can no longer recall. The prompt was fine. The context was broken.
Context engineering addresses the root cause: the information supply chain feeding the model is unmanaged. No freshness guarantees. No structural validation. No trace of why a particular piece of information was selected over another.
Where the term comes from
The term gained traction in mid-2025 when Shopify CEO Tobi Lutke and former OpenAI researcher Andrej Karpathy both endorsed it publicly. Karpathy described context engineering as the art and science of filling the context window with just the right information for the next step. Lutke called it a more accurate label for what production teams actually do when building AI systems.
The distinction matters. As Simon Willison noted, unlike "prompt engineering" — which people dismiss as typing things into a chatbot — "context engineering" has an inferred definition much closer to its intended meaning. It signals systems work, not wordsmithing.
The mental model
Think of the LLM as a CPU and its context window as RAM. The context engineer acts as the operating system — loading working memory with just the right code and data for each task. The quality of the output depends entirely on what is loaded into that window.
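The analogy can be made concrete with a toy loader. Everything below (the item fields, the greedy strategy, the function name) is an illustrative assumption, not a real API: the "operating system" packs the highest-relevance items into a fixed token budget, and anything that does not fit stays out of working memory.

```python
# Hypothetical sketch of the OS analogy: pack the highest-relevance
# context items into a fixed token budget ("RAM") for one step.

def load_window(items, budget):
    """Greedily load the highest-relevance items that fit the budget."""
    window, used = [], 0
    for item in sorted(items, key=lambda i: i["relevance"], reverse=True):
        if used + item["tokens"] <= budget:
            window.append(item["text"])
            used += item["tokens"]
    return window

items = [
    {"text": "task instruction",     "tokens": 50,   "relevance": 1.0},
    {"text": "pricing policy v3",    "tokens": 400,  "relevance": 0.8},
    {"text": "full product catalog", "tokens": 5000, "relevance": 0.3},
]
print(load_window(items, budget=1000))  # the catalog does not make the cut
```

A real loader would score relevance with embeddings and count tokens with the model's tokenizer; the budget discipline is the point of the analogy.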
Context engineering vs prompt engineering
Prompt engineering is one layer of context engineering. It focuses on the instruction — the "what to do" part of the context window. Context engineering manages everything else: what information reaches the model, how it is structured, whether it is still valid, and how decisions are traced back to their inputs.
| Dimension | Prompt Engineering | Context Engineering |
|---|---|---|
| Scope | Single LLM call | Full information supply chain |
| Focus | Instruction quality | Information architecture |
| Temporal awareness | None — static text | Freshness, validity windows, expiry |
| State management | Stateless | Cross-step, cross-session state |
| Traceability | None | Decision traces, provenance |
| Failure mode | Bad output | Silent degradation at scale |
The five pillars of context engineering
Context engineering is not a single technique. It is a discipline built on five interconnected practices:
01. Context Selection
Determining what information reaches the model for each decision. Not everything in the knowledge base is relevant. Not everything relevant is valid right now. Context selection is the gatekeeper — choosing what goes in and what stays out based on the current task, the agent's state, and the decision being made.
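A selection gate can be sketched in a few lines. The field names and filter rules here are hypothetical; the point is that relevance to the current task and validity right now are both checked before anything enters the window.

```python
# Illustrative selection gate: only items that match the current task
# AND are valid at decision time pass into the context window.

def select_context(candidates, task, now):
    return [
        c for c in candidates
        if task in c["tasks"]                       # relevant to this task
        and c["valid_from"] <= now < c["valid_to"]  # valid right now
    ]

candidates = [
    {"id": "refund-policy",     "tasks": {"refund"},   "valid_from": 0, "valid_to": 100},
    {"id": "old-refund-policy", "tasks": {"refund"},   "valid_from": 0, "valid_to": 10},
    {"id": "shipping-rates",    "tasks": {"shipping"}, "valid_from": 0, "valid_to": 100},
]
selected = select_context(candidates, task="refund", now=50)
print([c["id"] for c in selected])  # the expired and off-task items stay out
```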
02. Context Structuring
Organizing selected information so the model can reason over it effectively. Raw documents dumped into a prompt are not structured context. Structuring means providing explicit relationships, hierarchies, constraints, and metadata that reduce ambiguity and enable deterministic reasoning.
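The difference between a raw dump and structured context can be sketched as a rendering step. The schema below (fact types, constraint lists) is an assumption for illustration: each fact carries an explicit type and its constraints, rather than being buried in prose.

```python
# Illustrative sketch: render each fact with explicit type, subject,
# and constraints so relationships are unambiguous to the model.

def render_structured(facts):
    lines = []
    for f in facts:
        lines.append(f"[{f['type']}] {f['subject']} -> {f['value']}")
        for c in f.get("constraints", []):
            lines.append(f"  constraint: {c}")
    return "\n".join(lines)

facts = [
    {"type": "rule", "subject": "discount.max", "value": "15%",
     "constraints": ["applies: EU customers only",
                     "requires: manager approval above 10%"]},
    {"type": "fact", "subject": "customer.region", "value": "EU"},
]
print(render_structured(facts))
```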
03. Temporal Management
Ensuring every piece of context carries validity windows, freshness guarantees, and expiration rules. A pricing rule from last quarter should not silently influence today's decision. Temporal management makes time a first-class property of every context element.
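Making time a first-class property can be as simple as attaching an expiry to every element. The class and field names below are assumptions for illustration, not a standard API.

```python
# Hedged sketch: every context element carries its own validity window,
# and expired elements are rejected before they reach the model.
from datetime import datetime, timedelta

class ContextElement:
    def __init__(self, content, fetched_at, ttl):
        self.content = content
        self.expires_at = fetched_at + ttl

    def is_fresh(self, now):
        return now < self.expires_at

price_rule = ContextElement("Q4 pricing: 10% discount",
                            fetched_at=datetime(2025, 10, 1),
                            ttl=timedelta(days=90))
now = datetime(2026, 1, 15)
print(price_rule.is_fresh(now))  # the Q4 rule has expired by mid-January
```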
04. State Continuity
Maintaining coherent agent state across steps, tool calls, and sessions. When an agent approves a request in step 3, that decision must persist through steps 4 through 12. State continuity prevents the drift that causes agents to contradict themselves or lose track of their own actions.
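One minimal implementation pattern, sketched with hypothetical names, is an append-only decision log that is re-rendered into every later step's context:

```python
# Illustrative sketch: an append-only state log that carries earlier
# decisions into every later step's context window.

class AgentState:
    def __init__(self):
        self.decisions = []

    def record(self, step, decision):
        self.decisions.append({"step": step, "decision": decision})

    def as_context(self):
        return "\n".join(f"step {d['step']}: {d['decision']}"
                         for d in self.decisions)

state = AgentState()
state.record(3, "approved refund REQ-881")
state.record(5, "notified customer")
# At step 12 the agent still sees its own step-3 decision:
print(state.as_context())
```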
05. Decision Traceability
Recording not just what the agent decided, but why — what context was available, what was selected, what was excluded, and what rules applied. Traceability transforms AI agents from black boxes into auditable systems.
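A trace record can capture exactly those four things. The schema below is a hypothetical example, not a standard format:

```python
# Hypothetical trace record: what was available, what was selected,
# what was excluded, which rules applied, and the resulting decision.
import json

def make_trace(available, selected, rules_applied, decision):
    return {
        "available": [a["id"] for a in available],
        "selected":  [s["id"] for s in selected],
        "excluded":  [a["id"] for a in available if a not in selected],
        "rules_applied": rules_applied,
        "decision": decision,
    }

available = [{"id": "policy-v3"}, {"id": "policy-v2"}]
selected = [available[0]]
trace = make_trace(available, selected, ["latest-version-wins"], "approve")
print(json.dumps(trace, indent=2))
```

Persisting one such record per decision is what turns "why did the agent do that?" from archaeology into a lookup.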
Context engineering in production: emerging frameworks
Several teams building production agents have published their context engineering approaches. Three stand out:
Anthropic's Three Categories
Anthropic's Applied AI team categorizes context engineering into static context (system prompts, tool definitions, few-shot examples), dynamic context retrieval (just-in-time loading, progressive disclosure), and long-horizon task management (compaction, structured note-taking, sub-agent architectures). Their guiding principle: find the smallest set of high-signal tokens that maximize the likelihood of a desired outcome.
Manus's Production Patterns
The Manus team, after rebuilding their agent framework four times, published six production-tested patterns:
- Design around KV-cache hit rates (stable prefixes, append-only contexts)
- Mask tools instead of removing them, to preserve the cache
- Externalize context to the file system as unlimited persistent storage
- Manipulate attention through recitation (agents maintain todo files to prevent goal drift across 50+ tool calls)
- Preserve error evidence so models learn implicitly
- Avoid few-shot brittleness through structured variation
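Two of these, file-system externalization and recitation, can be sketched together. This is a loose illustration under assumed names, not Manus's actual code: notes live on disk rather than in the window, and the todo list is re-appended at the tail of the context where recent-token attention is strongest.

```python
# Illustrative sketch of externalization + recitation (not Manus code).
import os
import tempfile

workdir = tempfile.mkdtemp()

def write_note(name, text):
    """Externalize: the file system as persistent context storage."""
    with open(os.path.join(workdir, name), "w") as f:
        f.write(text)

def read_note(name):
    with open(os.path.join(workdir, name)) as f:
        return f.read()

write_note("todo.md", "- [x] fetch order\n- [ ] issue refund\n- [ ] email customer")

def build_context(history):
    # Recitation: append the current plan AFTER the long history,
    # so the goal sits in the attention-fresh tail of the context.
    return history + "\n\n## Current plan\n" + read_note("todo.md")

ctx = build_context("...many tool calls of history...")
print(ctx.endswith(read_note("todo.md")))  # plan is the last thing the model reads
```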
LangChain's Four Operations
LangChain identifies four core context operations: write (persist information), select (retrieve the right context), compress (summarize to fit the window), and isolate (give sub-agents clean context windows). They introduced Middleware in LangChain 1.0 specifically for programmatic context control.
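The four operations can be mapped onto toy functions. This is an illustrative sketch, not LangChain's API; in particular, `compress` here is a truncation placeholder where a real system would summarize with a model.

```python
# Illustrative mapping of the four operations (not the LangChain API).

store = {}

def write(key, value):             # write: persist information
    store[key] = value

def select(keys):                  # select: retrieve the right context
    return {k: store[k] for k in keys if k in store}

def compress(text, limit):         # compress: fit the window (placeholder)
    return text if len(text) <= limit else text[:limit] + "...[compacted]"

def isolate(task, context):        # isolate: a clean window for a sub-agent
    return {"task": task, "context": dict(context)}

write("user_prefs", "prefers email contact")
sub = isolate("draft reply", select(["user_prefs"]))
print(sub["context"])
```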
The industry signal
Gartner declared 2026 the year of context, positioning context engineering as critical infrastructure for enterprise AI. Their finding: 4 out of 5 organizations increased AI investments in 2026, yet only 1 in 5 shows measurable ROI — a gap they attribute to fragmented context across documentation, tribal knowledge, and disconnected tools.
The Model Context Protocol (MCP) — now governed by the Agentic AI Foundation under the Linux Foundation — has become the universal connector standard with adoption from Anthropic, OpenAI, Google, and Microsoft. Martin Fowler's team has published detailed analyses of context engineering for coding agents, signaling the concept has reached mainstream software engineering.
The context graph: context engineering in practice
A context graph is the primary data structure for implementing context engineering. It captures not just facts and relationships (like a knowledge graph), but also applicability rules, temporal validity, exceptions, provenance, and decision traces.
Where prompt engineering optimizes the instruction and RAG optimizes the retrieval, context engineering via context graphs optimizes the entire decision substrate — the structured environment within which AI agents reason and act.
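A single node of such a graph might look like the following. The field names are assumptions for illustration, not a standard schema; the point is what a plain knowledge-graph triple omits: applicability rules, exceptions, a validity window, and provenance.

```python
# Illustrative context-graph node: a fact plus the metadata that makes
# it safe to use in a decision. Field names are hypothetical.

node = {
    "fact": ("enterprise_plan", "max_seats", 500),
    "applies_when": {"region": "EU", "contract_type": "annual"},
    "exceptions": ["legacy-2023 contracts keep 300 seats"],
    "valid": {"from": "2026-01-01", "to": "2026-12-31"},
    "provenance": {"source": "pricing-db", "fetched": "2026-03-02"},
}

def applicable(node, situation):
    """A node applies only when every applicability condition matches."""
    return all(situation.get(k) == v for k, v in node["applies_when"].items())

print(applicable(node, {"region": "EU", "contract_type": "annual"}))  # True
print(applicable(node, {"region": "US", "contract_type": "annual"}))  # False
```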
Who needs context engineering
Context engineering becomes critical when AI agents move from single-turn assistants to multi-step, tool-using, decision-making systems deployed in production:
- Teams running multi-step agents that use tools and modify state
- Organizations requiring audit trails for AI-driven decisions
- Products where agent decisions affect revenue, compliance, or user safety
- Systems where context changes over time (pricing, permissions, policies)
- Platforms orchestrating multiple agents that share state
Frequently asked questions
Is context engineering just a new name for prompt engineering?
No. Prompt engineering is a subset of context engineering. It focuses on the instruction within a single LLM call. Context engineering manages the entire information supply chain across the full agent lifecycle — selection, structuring, temporal validity, state continuity, and decision traceability.
Do I need a context graph to do context engineering?
Not necessarily, but a context graph is the most effective implementation pattern. You can practice elements of context engineering with structured prompts, metadata enrichment, and state management systems. A context graph formalizes these practices into a coherent data structure.
How is context engineering related to RAG?
RAG (Retrieval-Augmented Generation) is a retrieval technique. It answers "what documents are relevant?" Context engineering asks a broader set of questions: "Is this information still valid? Does this rule apply to this specific situation? What did the agent decide three steps ago?" RAG is one input to context engineering, not a replacement for it.
When should I start investing in context engineering?
When your agents move from demos to production. The moment agents make decisions that affect real users, real money, or real compliance requirements, unmanaged context becomes a liability. Most teams discover this through silent failures — agents that produce plausible but wrong outputs because their context was stale, incomplete, or contradictory.