The Great Intelligence Plateau
Why $700 Billion Might Not Buy a Smarter AI
When GPT-4 was released, something unusual happened. People who had said that AI was overhyped suddenly went quiet. They didn’t announce a change of heart; they just stopped talking. The experience was different. Anyone who used it for just an hour could notice how it felt. This was not just an upgrade; it was something entirely new.
That was three years ago. Now, the companies developing AI are betting that this hype can last, as long as they spend enough money. Amazon, Google, Meta, and Microsoft plan to invest more than $700 billion in AI this year, with the latest forecasts pointing to $725 billion. That exceeds the GDP of more than 87% of the world’s countries.
At first glance, this seems reasonable. Data and computing power created GPT-4, so one might expect that ten times more resources will yield something ten times better. However, many people involved in building these systems believe we have hit a kind of ceiling. It’s not a complete halt; it feels like running in sand. Each step forward demands more effort than the last.
The $700 Billion Gamble
The figures are hard to digest. Amazon has budgeted $200 billion for 2026, just for itself. Meta plans to spend between $125 and $145 billion. Alphabet is allocating between $180 and $190 billion. Microsoft is expected to invest around $190 billion. Amazon is ramping up spending so quickly that analysts believe the company might actually lose money this year, spending more than it is earning.
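As a quick check on how these individual budgets add up, here is a minimal sketch that sums the figures quoted above. It assumes the Amazon and Microsoft numbers can be treated as point estimates (the text gives "around $190 billion" for Microsoft), and the names `CAPEX_2026` and `total_range` are made up for illustration.

```python
# 2026 capex plans in billions of USD, as (low, high) ranges quoted above.
# ASSUMPTION: Amazon and Microsoft are treated as single point estimates.
CAPEX_2026 = {
    "Amazon": (200, 200),
    "Meta": (125, 145),
    "Alphabet": (180, 190),
    "Microsoft": (190, 190),
}


def total_range(plans):
    """Sum the low ends and the high ends of each company's range."""
    low = sum(lo for lo, _ in plans.values())
    high = sum(hi for _, hi in plans.values())
    return low, high


low, high = total_range(CAPEX_2026)
print(f"Combined 2026 capex: ${low}B to ${high}B")
```

On these inputs the combined range runs from $695 billion to $725 billion, so the high end lines up with the $725 billion forecast mentioned earlier.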
Many voices argue that this investment will never pay off. The revenue side tells a different story, though. OpenAI made $6 billion in 2024, $20 billion by the end of 2025, and hit a $24 billion annualized run rate by April 2026. Anthropic went from $9 billion to $30 billion in four months. AWS grew 28% in Q1 2026, Azure by 40%, and Google Cloud by 63%.
Still, Bank of America found that these companies are now putting around 94% of their cash flow toward AI infrastructure. The whole bet rests on models continuing to improve as spending goes up. If that turns out to be wrong, the consequences for these companies would be severe.
The Narrowing Delta: GPT-5.5 vs the World
Changes have emerged in 2026 that are difficult to articulate. Benchmarks are still rising, and press releases continue to claim breakthroughs. However, users of these models sense that something is different. The initial shock has faded.
GPT-5.5 was released in April. It is genuinely new, not a fine-tuning job but a full ground-up rebuild. On ARC-AGI-2, a benchmark designed to test reasoning you can’t memorize your way through, it scored 85%. That’s an 11.7-point jump over GPT-5.4’s base score of 73.3%, but measured against GPT-5.4 Pro’s 83.3% the gap shrinks to 1.7 points. How impressive the result looks depends on which baseline you pick.
The bigger picture shows that no one is truly dominating now. Claude Opus 4.7 is the best at coding tasks, achieving 64.3% on SWE-bench Pro versus GPT-5.5’s 58.6%. On PhD-level science questions, Gemini 3.1 Pro, GPT-5.4 Pro, and Claude Opus 4.7 are essentially tied around 94%, with GPT-5.5 just behind at 93.6%. GPT-5.5 leads on agentic tasks with 82.7% on Terminal-Bench 2.0, a category that didn’t exist two years ago. Everyone has strengths, but nobody wins every contest.
Regular users notice this too. The models perform better, but they don’t feel fresh anymore. It’s like a routine software update rather than an encounter with something groundbreaking.
The Math of Running in Place
The slowdown isn’t due to a lack of effort. It’s just how the math works.
These systems rely on what are known as scaling laws. To achieve a small improvement, a huge increase in resources is necessary: not twice as much, but more like a hundred times as much for each meaningful step forward. The charts suggest steady progress, but that’s because they use a log scale. In reality, the cost of each advancement keeps multiplying.
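To make the "hundred times more" intuition concrete, here is a minimal sketch of the arithmetic. It assumes loss falls as a pure power law in training compute; both that functional form and the exponent `ALPHA = 0.05` are illustrative assumptions chosen for this example, not figures taken from any lab.

```python
# ASSUMPTION: loss follows a pure power law in training compute,
#   loss(C) = a * C ** (-alpha)
# ALPHA is a hypothetical, illustrative exponent, not a measured value.
ALPHA = 0.05


def compute_multiplier(loss_reduction, alpha=ALPHA):
    """Factor by which compute must grow to cut loss by `loss_reduction`.

    From loss(k * C) / loss(C) = k ** (-alpha) = 1 - loss_reduction,
    solving for k gives k = (1 - loss_reduction) ** (-1 / alpha).
    """
    if not 0 < loss_reduction < 1:
        raise ValueError("loss_reduction must be between 0 and 1")
    return (1 - loss_reduction) ** (-1 / alpha)


for reduction in (0.05, 0.10, 0.20):
    k = compute_multiplier(reduction)
    print(f"{reduction:.0%} lower loss needs about {k:,.0f}x the compute")
```

Under this assumed exponent, a 20% drop in loss already costs close to ninety times the compute, and halving the loss would cost about a million times as much. A different exponent changes the numbers but not the shape: each equal-sized gain costs a constant multiple of everything spent so far, which is why the curve only looks like a straight line on a log scale.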
Researchers have found a workaround. Instead of simply increasing training size, giving the model more time to think when answering a question can also improve results. Letting it work through steps before arriving at a conclusion is effective. The gains are real, but this approach doesn’t apply to everything and it’s not free either. It simply shifts the cost from training to running the model. The underlying issue persists.
The Data Wall: Running Out of Human Thought
Another vital issue often overlooked is where the training data comes from.
These models require tens of trillions of words of quality human text. Wikipedia, every digitized book, and most of the usable internet have basically all been used already.
The industry’s answer has been to have AI generate text that other AIs then train on. This approach works to a point, but it has a drawback known as model collapse. If a model trains too much on its own outputs, it gradually loses diversity. The writing becomes dull and the responses more predictable, like making repeated photocopies of a document. It remains readable, but with each copy, something is lost.
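The photocopy effect can be seen in a toy simulation. The sketch below is not how any lab trains models; it assumes the simplest possible "model" (a Gaussian fitted to data) and retrains it each generation only on a small sample drawn from the previous generation's fit. The function name, sample size, and generation count are arbitrary choices for illustration.

```python
import random
import statistics


def simulate_collapse(generations=500, samples_per_gen=10, seed=0):
    """Repeatedly refit a Gaussian to samples drawn from the previous fit.

    Returns the fitted standard deviation after each generation,
    starting with the original distribution's value of 1.0.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0 stands in for the "human" data
    history = [sigma]
    for _ in range(generations):
        # Draw a small synthetic dataset from the current model...
        data = [rng.gauss(mu, sigma) for _ in range(samples_per_gen)]
        # ...then fit the next model to that synthetic data only.
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)
        history.append(sigma)
    return history


history = simulate_collapse()
print(f"std at generation 0:   {history[0]:.3g}")
print(f"std at generation {len(history) - 1}: {history[-1]:.3g}")
```

Each refit slightly underestimates the spread of the data it was trained on, and those small losses compound, so the fitted standard deviation drifts toward zero over many generations. Real language models are vastly more complex, so this only illustrates the mechanism, not its size in practice.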
No one knows how much this is already affecting the models in use today. The labs don’t discuss it much. It is a genuine structural issue that worsens with time.
The Energy Ceiling: From Software to Power Grids
Everyone talks about chips and data. Nobody mentions that you can buy all the hardware you want and still have nowhere to plug it in.
The claim that an AI query consumes ten times more energy than a Google search comes from one contested 2023 study. According to Sam Altman, a ChatGPT query uses about 0.34 watt-hours. The commonly cited Google figure of 0.3 watt-hours per search comes from a 2009 benchmark, and on those two numbers the costs look nearly identical. But Google has since said its data centers have become far more efficient, so the real modern comparison is probably closer to ten times more energy per AI query than per current Google search. Either way, the aggregate impact is what matters.
Training is a completely different story. Anthropic has predicted that training one frontier model by 2028 will require about five gigawatts of continuous power, roughly the output of five nuclear reactors running nonstop for a single training run. They also estimate the US will need 50 gigawatts of new electrical capacity by 2028.
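The per-query and training figures are easier to compare once they are in the same units. The sketch below uses only the numbers quoted above, plus one loudly labeled assumption: a 90-day training run, which is a hypothetical duration picked for illustration and not a figure from Anthropic or anyone else.

```python
HOURS_PER_DAY = 24

# Per-query figures quoted above, in watt-hours.
CHATGPT_WH_PER_QUERY = 0.34      # Sam Altman's figure
GOOGLE_2009_WH_PER_SEARCH = 0.3  # 2009 benchmark

# Training figure quoted above, in gigawatts of continuous power.
TRAINING_POWER_GW = 5
# ASSUMPTION: hypothetical run length, chosen only for illustration.
ASSUMED_RUN_DAYS = 90


def energy_ratio(ai_wh, search_wh):
    """How many times more energy one AI query uses than one search."""
    return ai_wh / search_wh


def training_energy_twh(power_gw, days):
    """Energy of a run at constant power, in terawatt-hours."""
    gwh = power_gw * HOURS_PER_DAY * days
    return gwh / 1000  # 1 TWh = 1000 GWh


ratio_2009 = energy_ratio(CHATGPT_WH_PER_QUERY, GOOGLE_2009_WH_PER_SEARCH)
run_twh = training_energy_twh(TRAINING_POWER_GW, ASSUMED_RUN_DAYS)
# Convert TWh to Wh (1 TWh = 1e12 Wh) to express the run in queries.
equivalent_queries = run_twh * 1e12 / CHATGPT_WH_PER_QUERY

print(f"AI query vs 2009 Google search: {ratio_2009:.2f}x")
print(f"Assumed {ASSUMED_RUN_DAYS}-day run at {TRAINING_POWER_GW} GW: {run_twh:.1f} TWh")
print(f"Equivalent ChatGPT queries: {equivalent_queries:.2e}")
```

Against the 2009 figure the ratio is only about 1.1, which is why the tenfold claim depends entirely on how much more efficient a modern search has become. The training number is on a different scale: under the assumed 90-day duration, a five-gigawatt run works out to about 10.8 terawatt-hours.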
Residents in data-center-heavy states are already seeing meaningful bill increases. Half the country’s data centers shifted to Northern Virginia and the grid wasn’t built for it. This isn’t something that can be resolved with a software update. Securing permits takes years, and constructing new power lines takes even longer. Those who planned this growth seem to have assumed the electricity would simply be available.
It won’t be.
The Transformer Trap: A Master of Guessing With Workarounds Emerging
Every major AI model runs on the same architecture, the Transformer, which has a significant limitation that simply adding more of it won’t resolve.
These models predict the next word or token using familiar patterns. They excel at this task, but they don’t reason through problems like humans do. The architecture processes information in a fixed number of steps, limiting the complexity of logical chains it can follow. Bigger doesn’t mean deeper.
Researchers have identified a workaround: making the model show its work. By writing out reasoning before giving an answer, the thinking gets spread over multiple steps. This has proven helpful, and GPT-5.5 was designed with this in mind. But is that real thinking or just text that looks like thinking? That’s a question researchers still genuinely disagree on.
What it definitely represents is a workaround. The core issue remains.
The Aviation Parallel: Why We Do Not Fly at Mach 3
Ask someone in 1970 where aviation would be by 2025, and they might have predicted supersonic travel for everyone. The Concorde was already flying and seemed to work, operating reliably for 27 years. However, the development cost never paid back, and Air France struggled to make it work commercially. Sonic booms made supersonic flight over US land illegal, and tickets were incredibly expensive. Nobody could make the economics work.
The Concorde was retired in 2003, and widespread supersonic travel never happened. It turned out that was fine. Planes became more efficient, far more people fly now than anyone imagined in 1970, and the industry improved in ways nobody predicted. The progress became less exciting and more impactful at the same time.
This reflects where AI seems to be heading. The question becomes: if I spend another $100 billion on AI capex, will I really get $100 billion worth of value out of marginal improvements?
The Application Layer
I understand the counterargument. OpenAI went from $6 billion to $20 billion in a single year. Anthropic went from $9 billion to $30 billion in four months. None of this looks like a ceiling.
But notice what’s actually driving that revenue. It isn’t people waiting for GPT-6. It’s people finally figuring out what to do with what already exists. The interesting numbers in AI right now aren’t on benchmark leaderboards. They’re in the application layer.
Claude Code didn’t exist 12 months ago and is already at $2.5 billion annualized. Codex went from nothing to 2 million weekly users in about a year. Cursor, Harvey, and Glean are not foundation labs. They took a model someone else trained and turned it into a specific tool that saves a specific person time on specific work. That’s where the next phase of value capture lives.
The pattern is familiar. The internet’s core protocols stopped evolving around 1995, and the trillions of dollars in value came from Amazon, Google, Stripe, the people who built on top. The car engine was solved by the 1920s, and the auto industry compounded for another century. The constraint shifts. The money shifts with it.
So when I ask whether the $700 billion pays back, the answer doesn’t run through GPT-6. It runs through whoever turns the models we already have into products people actually use. That’s the phase we’re entering, and almost nobody is framing it that way.


