The Highest Tax in AI Coding: Token Micro-Economics

dehakuran.com · April 2026 · 3 min read

You are paying your agent to re-learn your codebase every morning. Every re-learning burns tokens. Every token costs real dollars and euros.

Yes, there are workarounds (claude-mem was my favorite). But most developers are unaware these workarounds exist.


Every Session Starts With Amnesia

You re-paste yesterday's context, the stack, the decision you made at 11pm. The agent re-reads it, re-tokenizes it, charges you for it. Multiply by every developer, every session, every day.
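To make "multiply by every developer, every session, every day" concrete, here is a back-of-envelope calculation. Every number in it is an illustrative assumption (token price, context size, team size), not a measured figure:

```python
# Back-of-envelope cost of re-priming an agent every session.
# All constants below are illustrative assumptions, not real pricing data.

INPUT_PRICE_PER_MTOK = 3.00   # assumed $ per 1M input tokens
CONTEXT_TOKENS = 40_000       # assumed size of the re-pasted context
SESSIONS_PER_DAY = 4
DEVELOPERS = 50
WORKDAYS_PER_YEAR = 220

cost_per_session = CONTEXT_TOKENS / 1_000_000 * INPUT_PRICE_PER_MTOK
annual_cost = cost_per_session * SESSIONS_PER_DAY * DEVELOPERS * WORKDAYS_PER_YEAR

print(f"${cost_per_session:.2f} per session")  # $0.12 per session
print(f"${annual_cost:,.0f} per year")         # $5,280 per year
```

Twelve cents per session sounds like nothing. Fifty developers re-pasting four times a day turns it into thousands a year, spent entirely on context the agent already saw yesterday.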

A coding agent isn't really a code generator. It's a file reader, a grep runner, a log parser, a reasoner — and then, finally, a code generator.

"Generation is the cheap part. The round-trips are the expensive part. And a large slice of that is waste."

The Data Is Uncomfortable

An ICLR-submitted OpenReview study on SWE-bench found that:

  • Input tokens dominate total cost, even with caching.
  • Token usage varies up to 10x between runs of the same task.

And the line that should be on every CFO's wall:

  • Higher token usage correlates with lower accuracy.
  • The agents that spend more get worse answers.

So I Built "Brief"

That is why I spent a couple of Sundays building Brief — a local tool that gives each AI coding agent a focused, persona-scoped brief instead of a raw memory dump.
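To show what "persona-scoped instead of a raw memory dump" means, here is a minimal sketch of the idea. This is not Brief's actual implementation; the `Note` type, persona tags, and token budget are all hypothetical stand-ins for the concept:

```python
# Sketch of a persona-scoped brief: filter stored notes by the agent's
# persona and stop at a token budget, rather than dumping all memory.
# Hypothetical illustration only -- NOT Brief's actual code.
from dataclasses import dataclass

@dataclass
class Note:
    persona: str   # e.g. "backend", "frontend", "infra"
    text: str
    tokens: int    # rough token estimate for this note

def build_brief(notes: list[Note], persona: str, budget: int) -> str:
    """Keep only notes tagged for this persona, newest first,
    stopping once the token budget is spent."""
    brief, spent = [], 0
    for note in reversed(notes):          # newest notes sit last in the log
        if note.persona != persona:
            continue
        if spent + note.tokens > budget:
            break
        brief.append(note.text)
        spent += note.tokens
    return "\n".join(reversed(brief))     # restore chronological order

notes = [
    Note("backend", "We chose Postgres over Mongo for billing.", 12),
    Note("frontend", "Design system tokens live in /ui/theme.", 10),
    Note("backend", "11pm decision: retry queue uses exponential backoff.", 14),
]
print(build_brief(notes, "backend", budget=30))
```

The frontend note never reaches the backend agent, and the budget caps the brief even when the log grows, which is the whole point: the agent wakes up primed, not flooded.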

It works across Claude Code, Codex, and Gemini CLI. And I used pixel-art office visuals to make it actually fun to look at.

Early benchmark: –35% cost, –44% wall-time, same test pass rate.


Why, When There Are So Many Tools Already?

Ehm — because it's fun. And because I can.

Happy Sunday.

AI Coding · Developer Tools · Cost Optimization

Deha Kuran

AI Executive, Engineer, and Evangelist. Head of AI Business Operations at Philips.

Follow the thinking on LinkedIn →