Discussion about this post

Ed Barrow

This really clicked for me, especially the customer-level vs platform-level utilization distinction.

What I appreciated is how clearly this explains why low utilization on reserved or hourly compute is often a rational outcome, not a failure. If you are a customer sizing for p95–p99 traffic, dealing with roadmap uncertainty, and running human-facing workloads, 10–40% utilization is kind of the default state.

It also mirrors how mature hyperscalers evolved. Reserved instances looked great on paper, but over time commitments became a way for providers to drive predictability, utilization, and revenue visibility. The efficiency gains were real, but the variance risk largely moved onto customers.

Your point that “reserved inference tends to be expansion revenue, not the economic core” stood out. It feels like reserved inference is less about where platforms want to be at the core, and more about where providers naturally push as usage stabilizes and financing efficiency starts to matter.

If that’s right, we probably replay a familiar pattern: commitments increase to improve provider economics, while customers still struggle with forecasting and workload variability.

Which makes the utilization and aggregation lens you lay out even more important as inference moves from experimentation to real scale, especially as demand smoothing and risk shifting start to matter as much as runtime efficiency.

Really strong piece - this should be required reading for anyone thinking seriously about inference economics.

Les Barclays

I have no idea how I missed this wonderfully written piece when it dropped!

I’m nowhere near an AI expert, more of a finance bro if anything, but I’m curious about AI and am teaching myself to code and build something finance-focused from scratch on Hugging Face.

This article will somewhat inspire my next AI-focused piece, as I’ve had some thoughts on the unit economics of AI inference and the wider business model for a while. I want to explore an unresolved tension between better margins and bigger losses: why do AI companies lose money despite good unit economics and margins, and what could this all mean for the AI trade in 2026?
