A Context Window Is Not Context

June 26, 2026 · Patrick Joubert · 4 min read

context-engineering, context-window, context-graph, agent-reliability, production-infrastructure

Every serious agent team spent two years chasing the same number: how much context can we fit in the window.

Bigger windows. More tokens. Longer sessions. The unofficial sport even had a name, tokenmaxxing, and the logic felt obvious. If the model can see more, it will decide better.

This week the market changed its mind. Reporting in late June described a clear shift among OpenAI and Anthropic customers away from tokenmaxxing and toward efficiency: spend fewer tokens, send less context, get more done.

The pivot is being told as a cost story. It is actually a correction to a category error.

Capacity was never the constraint. Structure was.

A context window is room. Context is structure. They are not the same thing, and the gap between them is exactly where production agents fail.

What tokenmaxxing assumed

The bet was simple: comprehension scales with capacity. Give the model more room and it will hold more of the truth, so its decisions will improve.

That bet paid off for a while, because early windows were genuinely too small to hold a real task. Expanding them removed a hard ceiling.

But somewhere past the point where the relevant facts already fit, teams kept adding. More retrieved documents. More history. More "just in case." The window became a place to dump everything and hope the model would sort it out.

Hope is not an architecture.

A window is room, not understanding

Put a million tokens in front of an agent and ask it to approve a refund, update an account, or escalate an incident. The window can hold the customer record, the policy, the contract, three months of history, and every edge case.

It still cannot tell the agent which of those facts is current, which policy version is active, which clause a private amendment superseded this morning, or which of two contradictory sources wins.

The information is present. The structure that makes it usable is not.
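To make the missing structure concrete, here is a minimal sketch of one piece of it: recording which clause supersedes which, and from what date. The `Clause` type, the `active_clauses` function, and the clause ids are all hypothetical names invented for illustration, and real precedence rules are usually richer than a single `supersedes` pointer.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Clause:
    clause_id: str
    text: str
    effective: date                     # when the clause takes effect
    supersedes: Optional[str] = None    # id of the clause this one replaces


def active_clauses(clauses: list[Clause], as_of: date) -> list[Clause]:
    """Return clauses in effect on `as_of` that no in-effect clause has superseded."""
    in_effect = [c for c in clauses if c.effective <= as_of]
    replaced = {c.supersedes for c in in_effect if c.supersedes}
    return [c for c in in_effect if c.clause_id not in replaced]


# Illustrative data only: an amendment that takes effect "this morning".
clauses = [
    Clause("refund-v1", "Refunds allowed within 30 days.", date(2025, 1, 1)),
    Clause("refund-v2", "Refunds allowed within 14 days.", date(2026, 6, 26),
           supersedes="refund-v1"),
]

print([c.clause_id for c in active_clauses(clauses, date(2026, 6, 25))])  # ['refund-v1']
print([c.clause_id for c in active_clauses(clauses, date(2026, 6, 26))])  # ['refund-v2']
```

Both clauses would sit in a large window side by side. Only the `effective` and `supersedes` fields say which one an agent may act on today.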

| Bigger context window | Context graph |
| --- | --- |
| Holds more tokens | Structures the few facts that matter |
| No notion of freshness | Marks what is current versus stale |
| No notion of scope | Enforces the decision scope |
| No notion of precedence | Encodes what supersedes what |
| Recall degrades as it fills | Retrieval stays targeted |
| Answers "what did we put in?" | Answers "what applies here, now?" |

A window measures how much an agent can see. A context graph decides what an agent should act on. Only one of those is context.

Efficiency is a structure problem in disguise

The efficiency shift works for a reason worth saying out loud: most of what teams were stuffing into windows was noise.

When a leaner prompt with the right structured context outperforms a bloated one, that is not a compression trick. It is evidence that the extra tokens were never load-bearing. The model was doing well despite them, not because of them.

Efficiency, done properly, is not sending less of the same soup. It is sending the specific facts that bear on the decision, already scoped, already dated, already reconciled. That requires knowing which facts matter before the model runs, which is a structure problem, not a capacity problem.
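A rough sketch of what "already scoped, already dated, already reconciled" can look like before a prompt is assembled. Everything here is an assumption made for illustration: the `Fact` fields, the scope labels, and in particular the reconciliation rule (the most recently confirmed value for a key wins), which is only one of several reasonable policies.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Fact:
    key: str          # what the fact is about, e.g. "refund_window_days"
    value: str
    scope: str        # decision scope the fact belongs to (hypothetical labels)
    as_of: date       # when the fact was last confirmed


def select_context(facts: list[Fact], scope: str, not_before: date) -> list[Fact]:
    """Keep in-scope, sufficiently recent facts; on a key collision the newest wins."""
    newest: dict[str, Fact] = {}
    for f in facts:
        if f.scope != scope or f.as_of < not_before:
            continue  # out of scope or too stale to send
        if f.key not in newest or f.as_of > newest[f.key].as_of:
            newest[f.key] = f
    return sorted(newest.values(), key=lambda f: f.key)


def render(facts: list[Fact]) -> str:
    """Format the selected facts as compact prompt lines."""
    return "\n".join(f"{f.key} = {f.value} (as of {f.as_of.isoformat()})" for f in facts)


# Illustrative pile of facts, most of which do not bear on a refund decision.
pile = [
    Fact("refund_window_days", "30", "refunds", date(2025, 1, 1)),
    Fact("refund_window_days", "14", "refunds", date(2026, 6, 1)),
    Fact("order_total_eur", "120", "refunds", date(2026, 6, 20)),
    Fact("shipping_carrier", "DHL", "logistics", date(2026, 6, 20)),
    Fact("loyalty_tier", "gold", "marketing", date(2024, 3, 5)),
]

prompt_context = render(select_context(pile, scope="refunds", not_before=date(2025, 1, 1)))
print(prompt_context)
# order_total_eur = 120 (as of 2026-06-20)
# refund_window_days = 14 (as of 2026-06-01)
```

Five facts go in and two come out. The saving comes from the selection rules rather than from compressing text.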

This is the same argument The Context Graph made in "Context Engineering in 2026." The difference is that the market has now confirmed it with its own budget. And it echoes "Why RAG Is Not Enough for Production AI Agents": retrieving more passages was never the same as retrieving the ones that govern the decision.

What context actually is

Context is not a volume of text. It is a structured account of a situation: the entities involved, the relations between them, the scope the decision lives in, whether each fact is current, and which rule applies when rules conflict.
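As a sketch of "entities and the relations between them", the snippet below stores a handful of typed edges and walks outward from the entity a decision is about. The node ids, relation names, and the `FOLLOW` set are hypothetical. A production graph would also carry the scope, freshness, and precedence attributes described above.

```python
from collections import defaultdict, deque

# Edges are (source, relation, target); all names are illustrative.
edges = [
    ("order:1001", "placed_by", "customer:42"),
    ("order:1001", "governed_by", "policy:refund-v2"),
    ("customer:42", "party_to", "contract:A7"),
    ("contract:A7", "amended_by", "amendment:A7-1"),
    ("policy:refund-v2", "supersedes", "policy:refund-v1"),
    ("customer:99", "party_to", "contract:B2"),
]

# Relations that pull a neighbour into the decision. "supersedes" is
# deliberately left out so that replaced policies are not collected.
FOLLOW = {"placed_by", "governed_by", "party_to", "amended_by"}


def decision_neighbourhood(start: str) -> list[str]:
    """Breadth-first walk from `start` along the relations in FOLLOW."""
    out = defaultdict(list)
    for src, rel, dst in edges:
        if rel in FOLLOW:
            out[src].append(dst)
    seen, queue = [start], deque([start])
    while queue:
        node = queue.popleft()
        for nxt in out[node]:
            if nxt not in seen:
                seen.append(nxt)
                queue.append(nxt)
    return seen


print(decision_neighbourhood("order:1001"))
# ['order:1001', 'customer:42', 'policy:refund-v2', 'contract:A7', 'amendment:A7-1']
```

The unrelated customer and the superseded policy never enter the result, because the walk follows relations and does not rely on text similarity.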

That structure is a decision context graph. It is what lets an agent answer the only question that matters at the moment of action: given everything true right now, is this specific action valid?

A window cannot answer that question no matter how large it grows, because size does not encode applicability, freshness, or precedence. Those are properties of structure, and you either model them or you gamble that they happened to land in the prompt.

Rippletide is one reference implementation of that structured layer, evaluating an agent's proposed action against current, scoped context before it executes. The architectural point stands on its own: the reliability of an agent is set by the structure of its context, not the size of its window.
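The general shape of a pre-execution check can be sketched in a few lines. This is not Rippletide's API or any vendor's. The `RefundPolicy`, `ProposedRefund`, and `check_refund` names, the policy fields, and the numbers are all assumptions chosen to illustrate the idea of validating a proposed action against current, scoped context before anything runs.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RefundPolicy:
    version: str
    window_days: int
    max_amount: float


@dataclass(frozen=True)
class ProposedRefund:
    order_date: date
    amount: float


def check_refund(action: ProposedRefund, policy: RefundPolicy,
                 today: date) -> tuple[bool, list[str]]:
    """Return (allowed, reasons) without executing anything."""
    reasons = []
    age_days = (today - action.order_date).days
    if age_days > policy.window_days:
        reasons.append(
            f"order is {age_days} days old; {policy.version} allows {policy.window_days}")
    if action.amount > policy.max_amount:
        reasons.append(f"amount {action.amount} exceeds limit {policy.max_amount}")
    return (not reasons, reasons)


# Assumed to be the currently active policy, already resolved upstream.
policy = RefundPolicy(version="refund-v2", window_days=14, max_amount=200.0)
today = date(2026, 6, 26)

ok, why = check_refund(ProposedRefund(date(2026, 6, 20), 120.0), policy, today)
print(ok, why)   # True []

ok, why = check_refund(ProposedRefund(date(2026, 6, 1), 120.0), policy, today)
print(ok, why)   # False ['order is 25 days old; refund-v2 allows 14']
```

The check takes the active policy as an input. Deciding which policy is active is the structural work, and the comparison itself is trivial once that is settled.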

The test

Ask one question of any agent your team is scaling:

If you doubled the context window tomorrow, would the agent make better decisions?

If the honest answer is "only when the right facts happen to be in there," you do not have a problem that more capacity solves. You have a structure problem wearing a capacity costume.

Tokenmaxxing is ending because the industry is relearning something it half-knew. More room to hold information is not the same as knowing what to do with it.

A window is where context sits. It is not context. Context is the structure that tells an agent which part of that window is true, relevant, and safe to act on right now.

The Context Graph is a weekly newsletter for AI engineers building production agents. Read the glossary of agent decision infrastructure for the vocabulary behind context graphs, pre-execution enforcement, accountable agents, and causal decision traces.

Cite this memo

Patrick Joubert. (2026). "A Context Window Is Not Context." The Context Graph. https://thecontextgraph.co/memos/a-context-window-is-not-context
