You're Watching the Wrong Bubble
Everyone's talking about the AI bubble. They're wrong - not because there isn't a bubble, but because they're looking at the wrong thing.
The bubble isn't in AI. It's in LLMs.
And understanding that distinction will change how you invest your time, your money, and your strategy for the next decade.
The Confusion
When people say "AI bubble," they mean one of two things. Either they think the technology is overhyped and the crash is coming. Or they think AI is transformative but the market has gotten ahead of itself.
Both camps are making the same mistake: they're treating "AI" and "LLMs" as synonyms. They're not.
LLMs - the large language models that power ChatGPT, Claude, Gemini, and their competitors - are one layer of the AI stack. An important one. But they're the layer that is commoditizing fastest, and the layer most likely to see its economics collapse.
AI, on the other hand, is the broader capability: the ability to automate judgment, augment decision-making, and operate with context at a scale humans can't. That isn't going anywhere. If anything, it's just getting started.
The question isn't whether AI is a bubble. It's whether you're investing in the part that holds value or the part that's about to become a utility.
The Commodity Curve
Here's a number that should change how every CEO thinks about their AI strategy: the cost of running an LLM at GPT-3.5-level quality has dropped by roughly 280x in two years. In 2022, it cost $20 per million tokens. By late 2024, the same capability cost $0.07.
And that's not an outlier. Andreessen Horowitz documented what they call "LLMflation" - the cost to reach a fixed capability bar is dropping approximately 10x per year. Epoch AI's research shows task-specific declines of up to 50x annually.
To put that in perspective: when GPT-4 launched in March 2023, it cost $60 per million output tokens. Today, models that outperform it on every major benchmark cost between $0.10 and $0.40 per million tokens. That's a price decline of at least 99.3% in roughly three years, while the product got better.
This isn't new. It's the same pattern we've seen in every foundational technology layer:
- Cloud compute - once a differentiator, now a utility nobody thinks about.
- Bandwidth - Edholm's Law predicted exponentially rising data rates; they arrived, and bandwidth became a background utility.
- Storage - the cost per gigabyte collapsed until it became irrelevant to strategy.
LLMs are following the same trajectory. The model is becoming infrastructure. And when something becomes infrastructure, you stop competing on it.
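The price figures quoted above are easy to sanity-check. Here is a minimal sketch of the arithmetic, using only the per-million-token prices from the text; the function and variable names are illustrative, not from any pricing API.

```python
def fold_decline(old_price: float, new_price: float) -> float:
    """Return how many times cheaper new_price is than old_price."""
    return old_price / new_price


def percent_decline(old_price: float, new_price: float) -> float:
    """Return the price drop as a percentage of old_price."""
    return (1 - new_price / old_price) * 100


# Prices in USD per million tokens, as quoted in the text above.
gpt35_fold = fold_decline(20.00, 0.07)
gpt4_decline_low = percent_decline(60.00, 0.40)   # priciest end of today's range
gpt4_decline_high = percent_decline(60.00, 0.10)  # cheapest end of today's range

print(f"GPT-3.5-level capability: about {gpt35_fold:.0f}x cheaper")
print(f"GPT-4-level capability: {gpt4_decline_low:.1f}% to {gpt4_decline_high:.1f}% cheaper")
```

With those inputs the GPT-3.5-level drop works out to about 286x, in line with the "roughly 280x" figure, and the GPT-4-level drop lands between 99.3% and 99.8% depending on which end of the current price range you pick.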
The Performance Gap Has Collapsed
The other half of commoditization is this: the gap between the best model and the rest has nearly disappeared.
In early 2024, there was a clear hierarchy. GPT-4 was in a league of its own. Closed-source models led by 15-20 points on standard benchmarks. Open-source alternatives were useful but noticeably inferior.
That's gone now.
Epoch AI's research shows that open-weight models now lag the closed-source frontier by roughly three months on average. WhatLLM's January 2026 benchmarks show the gap has compressed to just five points - from 12 in early 2025. On some specific benchmarks, open-source models like DeepSeek V3.2 and GLM-4.7 match or beat the best proprietary models.
DeepSeek V4 - which you can self-host, modify, and run without paying a per-token fee - achieved the highest multi-step score in a 38-task benchmark, surpassing Claude Opus 4.7. Its API costs roughly 50x less than the equivalent proprietary model.
When the fifth-best option performs at 95% of the quality of the best option and costs 90% less, you're not competing on the model anymore. You're competing on everything else.
Smart Routing Is Already the Norm
The companies that actually use AI at scale - not the ones talking about it at conferences - have already figured this out. They don't pick one model. They route.
Enterprise LLM API spending passed $8.4 billion in 2025, and the leading infrastructure pattern that emerged isn't model selection - it's model routing: gateways that sit between your application and model providers and dynamically choose the best model for each query based on complexity, cost, latency, and importance.
The data backs this up: empirical research from Zylos shows 50-70% of enterprise requests can be handled by the cheapest model tier without any quality loss. LLM routing systems can reduce inference costs by up to 85% while maintaining 95% of frontier-model performance.
Simple customer query? Route it to a fast, cheap open-source model. Complex strategic analysis? Send it to frontier. Compliance-sensitive task? Use the model with the best audit trail. Time-critical response? Whichever model is fastest right now.
This is the equivalent of what happened in financial markets with smart order routing. Nobody cares which exchange executes their trade. They care about best execution. The same is happening with LLMs right now.
The model is becoming the commodity. The routing logic is becoming the product.
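To make the routing rules above concrete (simple query to the cheap model, complex analysis to frontier, compliance work to the model with an audit trail, time-critical work to whatever is fastest), here is a minimal sketch. Everything in it is an assumption for illustration: the model names, prices, latencies, quality scores, the complexity score (imagined as coming from some upstream classifier), and the order in which the rules are checked. A production gateway would be considerably more involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str                       # hypothetical model name
    cost_per_million_tokens: float  # USD, illustrative
    latency_ms: int                 # illustrative typical latency
    quality: float                  # 0..1, illustrative benchmark score
    audit_trail: bool               # whether the provider offers audit logging


@dataclass(frozen=True)
class Query:
    text: str
    complexity: float               # 0..1, assumed to come from an upstream classifier
    compliance_sensitive: bool = False
    time_critical: bool = False


MODELS = [
    Model("open-small", 0.10, 300, 0.80, False),
    Model("frontier-large", 15.00, 1500, 0.95, False),
    Model("audited-enterprise", 8.00, 1200, 0.88, True),
]


def route(query: Query, models=MODELS, complexity_threshold: float = 0.6) -> Model:
    """Pick a model for one query. Rule order is an assumption, not a standard."""
    if query.compliance_sensitive:
        audited = [m for m in models if m.audit_trail]
        if not audited:
            raise ValueError("no model with an audit trail is configured")
        return min(audited, key=lambda m: m.cost_per_million_tokens)
    if query.time_critical:
        return min(models, key=lambda m: m.latency_ms)
    if query.complexity >= complexity_threshold:
        return max(models, key=lambda m: m.quality)
    # Default: the cheapest model handles routine traffic.
    return min(models, key=lambda m: m.cost_per_million_tokens)


print(route(Query("How do I reset my password?", complexity=0.1)).name)
print(route(Query("Assess our market-entry options", complexity=0.9)).name)
```

Note that nothing in `route` refers to a specific vendor. Swapping a model in or out is a one-line change to the `MODELS` list, which is the point: the logic is the asset, not the model.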
So Where's the Actual Moat?
Gartner forecasts worldwide AI spending will hit $2.59 trillion in 2026 - a 47% increase year-over-year. But here's the breakdown that matters: $1.43 trillion goes to AI infrastructure. Only $32.6 billion goes to the models themselves.
Read that again. Infrastructure outspends models by 44 to 1.
The market is already telling you where the value lives. It's not in the engine. It's in everything the engine needs to be useful.
If the model is a commodity and everyone can route to the cheapest-best option, what differentiates companies?
Data and context.
I wrote about this in my last post, "The Human Context Window." When I built an AI agent to run the intelligence layer of my company, the most powerful model in the world was useless in week one. It had intelligence but no context. No knowledge of our business model, our org structure, our history, our language. It was noise.
The agent became genuinely useful - sometimes brilliant - only after I built its context window: memory systems, decision logs, relationship data, institutional knowledge. The model could have been swapped out at any point. The context couldn't.
That principle applies to every company, in every industry.
The companies that will dominate the AI era aren't the ones with the best model. They're the ones with:
- Proprietary data that compounds. Your customer interactions, your operational history, your institutional knowledge - the data that no competitor can replicate because it's generated through your unique operations.
- Context architecture. Systems that feed the right data to the right model at the right time. Not just a data warehouse. A living, continuously updated body of knowledge that gives the AI what it needs to make decisions specific to your business.
- Feedback loops. Structured mechanisms that capture what works, what doesn't, and feed that learning back into the system. This is the flywheel. Every interaction makes your AI marginally smarter than your competitor's, even when you're running the same underlying model.
Bain's latest research calls this "proprietary intelligence" and argues that most CEOs think they're leading AI transformations when they're actually managing a portfolio of pilots. The companies pulling ahead aren't moving faster on the same path. They're building something different.
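For readers who want to see what "context architecture" can mean at its very simplest, here is a toy sketch: pick the most relevant pieces of institutional knowledge and hand them to whichever model is in use. The store contents, the source labels, and the word-overlap scoring are all assumptions for illustration; real systems typically use embeddings or search indexes rather than word matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextItem:
    source: str  # hypothetical label, e.g. "decision_log" or "crm"
    text: str


def relevance(question: str, item: ContextItem) -> int:
    """Crude relevance score: number of words the question and item share."""
    question_words = set(question.lower().split())
    item_words = set(item.text.lower().split())
    return len(question_words & item_words)


def build_prompt(question: str, store: list, max_items: int = 2) -> str:
    """Assemble a model-agnostic prompt from the most relevant context items."""
    ranked = sorted(store, key=lambda it: relevance(question, it), reverse=True)
    selected = [it for it in ranked[:max_items] if relevance(question, it) > 0]
    context_lines = [f"[{it.source}] {it.text}" for it in selected]
    return "Context:\n" + "\n".join(context_lines) + f"\n\nQuestion: {question}"


store = [
    ContextItem("wiki", "office closes on public holidays"),
    ContextItem("crm", "acme renewal is due in march"),
    ContextItem("decision_log", "we drop the acme pricing tier after churn rose"),
]

prompt = build_prompt("why did we drop the acme pricing tier", store)
print(prompt)
```

The output is a plain string, so it can be sent to any model. That mirrors the point above: the model is swappable, while the store and the logic that selects from it are what you actually own.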
The Backwards Investment
Here's the uncomfortable reality: most companies are spending in exactly the wrong ratio.
Millions on model licenses and API costs. Almost nothing on data infrastructure. They're buying the most expensive engine and pouring regular unleaded into it.
McKinsey's data shows 88% of organisations are using AI in at least one business function. But only 39% report enterprise-level impact on EBIT. That gap isn't a model problem. It's a context problem. These companies deployed intelligence without giving it anything useful to be intelligent about.
What they should be doing:
- Investing in data capture and curation. Making institutional knowledge machine-readable. Turning tribal knowledge into structured, queryable context. Every decision, outcome, and lesson that lives only in someone's head is a context gap your AI can't bridge.
- Building context delivery systems. Getting the right data to the AI at the right moment. This is the hard engineering work that nobody talks about because it's not glamorous. It's plumbing. But plumbing is what makes cities work.
- Developing routing intelligence. Matching query complexity to model capability and cost. This alone can cut AI costs by 40-70% without sacrificing output quality. Enterprise AI teams that don't have a routing strategy by now are leaving money on the table.
- Creating feedback mechanisms. So the system gets smarter with use, not just with model upgrades. When a model gives a better answer because it has better context, that's your competitive advantage compounding. When it gives a better answer because the model got upgraded, that's everyone's advantage improving equally.
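The last item, feedback mechanisms, is the easiest to leave abstract, so here is one hedged sketch of what it could look like: record whether answers built on a given context source were accepted, and use that record to decide which sources to trust first. The class, the source names, and the smoothing choice are assumptions for illustration, not a reference design.

```python
from collections import defaultdict


class FeedbackLog:
    """Tracks how often answers built on each context source were accepted."""

    def __init__(self) -> None:
        self._accepted = defaultdict(int)
        self._total = defaultdict(int)

    def record(self, source: str, accepted: bool) -> None:
        """Log one outcome for an answer that used the given source."""
        self._total[source] += 1
        if accepted:
            self._accepted[source] += 1

    def score(self, source: str) -> float:
        """Smoothed acceptance rate; a source with no history scores 0.5."""
        accepted = self._accepted.get(source, 0)
        total = self._total.get(source, 0)
        return (accepted + 1) / (total + 2)

    def ranked_sources(self, sources: list) -> list:
        """Order sources from most to least trusted, based on past outcomes."""
        return sorted(sources, key=self.score, reverse=True)


log = FeedbackLog()
for _ in range(3):
    log.record("placement_history", accepted=True)
for _ in range(2):
    log.record("generic_faq", accepted=False)

ranking = log.ranked_sources(["generic_faq", "new_source", "placement_history"])
print(ranking)
```

None of this touches the model. The log keeps improving the ranking as people use the system, and that accumulated record is specific to your business even when a competitor runs the same underlying model.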
The Recruiting Parallel
I run a 350-person recruitment company. In our industry, the "model" is arguably interchangeable - there are plenty of smart recruiters in the world. The ones who consistently outperform aren't operating on superior raw talent. They're operating on superior context.
They know the client's culture. They understand the hiring manager's actual preferences versus their stated ones. They have relationship data that compounds over years. They know which candidates from similar roles at similar companies succeeded and which didn't, and why.
Companies that treat recruiters as interchangeable commodities lose. The ones that invest in context infrastructure - CRM data, placement history, client intelligence, market knowledge systems - win. Not because their people are smarter, but because their people have better context.
That's exactly the dynamic playing out across every industry with AI. The model is the recruiter's brain. The context is everything that makes the brain useful.
What This Means for You
If you're a CEO or operator building your AI strategy, three things:
First, stop asking "which AI model should we use?" and start asking "what data do we have that nobody else does, and how do we make it usable?" Your choice of model matters less every quarter. Your choice of what to feed it matters more every quarter.
Second, invest in the unsexy stuff. Data architecture. Context systems. Knowledge management. Feedback loops. These don't make good conference talks but they make good businesses.
Third, understand the timeline. The bubble - if it pops - will hit companies that bet on model superiority. They'll wake up one morning and discover that an open-source model running on commodity hardware matches their expensive proprietary setup at 5% of the cost. The companies that bet on data and context will barely notice. Their moat wasn't the model.
Nobody brags about which cloud provider their servers run on anymore. In three years, nobody will brag about which LLM they use either.
The question is: what will you have built by then that actually matters?