Skip to main content
Monday, April 20, 2026
Charlotte, NC|
Under Construction
Notes

AI Has a Meter. Here's How Not to Get the Bill.

AI isn't magic. It's metered. Every prompt consumes compute. Bigger models consume more.

Peter Cellino· Publisher
||2 min read
CLT Mercury Default Hub Illustration – Charlotte Skyline, Newspaper, and Coffee (Editorial Ink Style)
CLT Mercury Default Hub Illustration – Charlotte Skyline, Newspaper, and Coffee (Editorial Ink Style)

AI isn't magic. It's metered.

Every prompt consumes compute. Bigger models consume more. Longer inputs consume more. And when you chain steps together — draft, retrieve, verify, rewrite, check policy, try again — you're not buying "an answer." You're buying a workflow.

Here's the clean mental model: AI costs scale with (1) how much you send, (2) how many times you send it, and (3) how many people send it at once.

What's Actually Driving the Cost

Most teams focus on "model pricing" and miss the real multipliers:

  • Context length: the more text you attach, the more you pay.
  • Number of calls: retries, tool calls, and multi-step flows add up fast.
  • Concurrency: a pilot is one thing; a Monday morning rollout is another.
  • Enterprise overhead: logging, security, policy checks, audits — necessary and not free.

Why "Agents" Raise the Stakes

Agents are just workflows that chain multiple AI steps and tools together. They're useful — and they can quietly turn one interaction into many.

One Call vs. Many

A simple summary might be one call. An "agent" that summarizes, retrieves documents, drafts a response, checks policy, rewrites for tone, then verifies citations can be a pile of calls before anyone notices.

Multiply that by a team, and the meter starts sprinting.

Agents aren't bad. Unengineered agents are expensive.

The Only Strategy That Actually Works

Don't pay for the best model when the job doesn't require it. Most work naturally falls into tiers:

  • Routine tasks (summaries, extraction, formatting, first drafts): smaller, cheaper models often do fine — sometimes local models when policy requires it.
  • High-stakes tasks (complex reasoning, sensitive material, critical decisions): escalate to more capable models when needed.

Route, Govern, Measure

The future is routed, governed systems that do one simple thing well: use the cheapest model that meets the quality and risk bar — and only escalate when necessary.

And measure the right number: cost per outcome, not cost per prompt.

If you can do that, AI becomes leverage — not a line item that gets euthanized at the next budget review.


Peter Cellino — I'm pro-AI and pro-efficiency. I'm also pro-not-setting-money-on-fire. Now excuse me while I spend $6 on coffee to experience "metered consumption" in analog form.

Peter Cellino is the publisher of The Charlotte Mercury and the creator of Mercury Local — a platform rebuilding local news into something that actually serves residents and the local businesses who need to reach them, without the ad-tech middlemen.

Peter Cellino

Publisher

Publisher of The Charlotte Mercury and its family of hyperlocal news publications.

More in Notes