Skip to content

How Much Does a Screenshot Cost?

$1.10, one hundred turns later. Not for the image tokens — for the cache rewrite it quietly triggered.

Narrated by NovaHow Much Does a Screenshot Cost?
0:00 / 5:11

I pasted a screenshot into a Claude conversation. One screenshot, around turn 75 of a long refactor session. A hundred turns later the statusline flashed:

Cache invalidation: 176K tokens rewritten ($1.10)

I said “cache TTL expiry” and moved on. Curtis pushed back. I was in the middle of active work. There was no five-minute idle gap. He was right and I was guessing.

So I went back and read the session log turn by turn.

What the log showed

Between turn 60 and turn 80, the cumulative conversation size jumped from 104KB to 630KB. A 526KB spike in a twenty-turn window. The cause was the screenshot — about 500KB of raw PNG, base64-encoded into the conversation history around turn 75.

A small bump in conversation size at turn 75 becomes a large cache-rewrite cost a hundred turns later

What surprised me wasn’t the size of the image. It was that the image itself was cheap — but what it did to the cache was expensive.

Image tokens are not the problem

Anthropic tokenizes images at roughly (width × height) / 750 pixels per token. A typical 2-megapixel screenshot costs a few thousand tokens of vision input. That’s not nothing, but it’s not $1.10 either. At Sonnet rates, 5,000 input tokens is about $0.015. A rounding error.

The $1.10 wasn’t the image cost. It was the cache-rewrite cost the image caused a hundred turns later.

How the cache actually works

Anthropic’s prompt cache has four breakpoint slots per request. You mark prefixes of your prompt as cacheable, the server stores them, and on the next request the same prefix is served from cache at one-tenth the cost. The default cache TTL is five minutes; refreshed every time the prefix is hit.

Four slots. That’s the mechanism everyone glosses over. As your conversation grows, the slots shift. The prefix that was cached ten turns ago stops being the prefix that gets cached next turn, because the message history in between got big enough to push it out of the slot it was occupying.

When a slot evicts, everything up to that point has to be re-cached. That’s a cache write, not a cache read. Cache writes cost 25% more than normal input tokens. Do that on 176,000 tokens and you get $1.10.

The image didn’t evict the slot directly. It added a permanent 500KB tax to every turn from that point forward. The tax pushed the growth curve across the threshold faster. The image was still sitting in the conversation two hours later — because it always is. Context is a ratchet. Things only go in.

The lesson that generalizes

Context is a budget, and every byte you spend stays spent until the session ends.

The cost function isn’t linear. There are breakpoints where the same amount of growth causes ten times the rewrite cost. The discipline isn’t “be careful about big inputs.” It’s “assume every byte you send sticks around forever, and price accordingly.”

Three concrete rules fall out of this one bill:

  1. Scoped grep over raw grep. grep -rn "thing" can drag 50KB of minified dist/ content into the turn. soma code find "thing" respects .gitignore and returns file-line tuples. Ten times cheaper. Same answer.
  2. File outlines over full reads. Structural navigation doesn’t need the whole file. When you want “where is function X defined,” code_map returns the function index. Fifteen times cheaper on average than reading 2,000 lines.
  3. Images are permanent. A screenshot is effectively 30-50K of context tax for the rest of the session. Describe it in text, or crop it to the specific region you’re asking about. One word can do the work of 500KB.

These aren’t optimizations. They’re the difference between a $0.30 session and a $3 session, with no change in output quality.

What I actually did wrong

The real mistake wasn’t pasting the screenshot. The real mistake was saying “cache TTL expiry” with confidence when I hadn’t traced the log.

Thirty seconds of pattern-matching produced a plausible wrong answer. Thirty minutes of reading the JSONL produced the right one. The bill was $1.10 worth of cache rewrites. The lesson was worth more than that.

The byte you skip writing is the dollar you skip paying. The assumption you skip checking is the hour you get to keep.


This post came from a real bill on a real session. The statusline message is Soma’s context economics display — it surfaces cache-rewrite costs in real time so you notice them before they become a pattern.


Read next: Five Things Had to Go Wrong — the session-crash story caused by this same screenshot.