Anthropic says AI is starting to build AI. The trendlines point at recursive self-improvement, and the speed is getting hard to hang on to.

Anthropic, Recursive Self-Improvement, AI Safety, AI Research, Coding Agents
TERMINAL VELOCITY


By Amir H. Jalali · 5 min read

Anthropic published something through their institute this week that I have not been able to put down. It is about recursive self-improvement, the point where an AI system can fully design, build, and train its own successor with no human in the loop. They are careful to say we are not there yet. What they do not soften is how directly the trendlines point that way.

The numbers are the part that lands.

By May 2026, more than 80 percent of the production code Anthropic merged was written by Claude. Before February 2025 that number was in the single digits. Their engineers are shipping roughly eight times more code per quarter than they did in 2024. In April, Claude shipped 800 API error fixes that they estimate would have taken a human team four years.

I read those and recognized them, because a smaller version of this is already my daily experience. I do not write most of the code I ship anymore. I describe what I want, I review it, I redirect. The work moved from typing to deciding.

The metric that made me stop was about time horizons. The length of task a model can finish on its own has been doubling every four months. In March 2024 that was a four-minute task. By March 2026 it was a twelve-hour task. Their own extrapolation puts week-long tasks within reach by 2027.
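
A quick back-of-the-envelope check on those figures, as a minimal Python sketch. The two data points are the ones quoted above. Reading "week-long" as 7 × 24 hours of continuous work is an assumption, not a definition from the piece, and the function names are purely illustrative.

```python
import math

# Figures as quoted in the post: length of task a model can finish on its own.
START_MINUTES = 4          # March 2024
END_MINUTES = 12 * 60      # March 2026
MONTHS_BETWEEN = 24

def implied_doubling_months(start, end, months):
    """Doubling time implied by two points on an exponential curve."""
    doublings = math.log2(end / start)
    return months / doublings

def months_until(target, current, doubling_months):
    """Months for the horizon to grow from current to target at a fixed doubling time."""
    return math.log2(target / current) * doubling_months

doubling = implied_doubling_months(START_MINUTES, END_MINUTES, MONTHS_BETWEEN)

# Assumption: "week-long" means 7 * 24 hours of continuous work.
WEEK_MINUTES = 7 * 24 * 60
to_week = months_until(WEEK_MINUTES, END_MINUTES, doubling)

print(f"implied doubling time: {doubling:.1f} months")
print(f"months from March 2026 to a week-long task: {to_week:.1f}")
```

Under these assumptions the two endpoints imply a doubling time a little over three months, somewhat faster than the stated four, and either rate puts a week-long task somewhere in 2027, which lines up with the extrapolation quoted above.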

What makes this different from the usual capability chart is what the tasks are. Some of the work being automated is the work of building AI itself. Claude is now optimizing the code that trains the next Claude. On one internal measure, model-suggested optimizations went from about a 3x speedup last May to 52x this April. On the harder question of which experiment to run next, models matched human judgment about half the time in November and closer to two thirds of the time by April.

So you can see the loop forming. Better models write better code faster, which frees the humans to think about direction, which lets the models run more experiments, which produce better models. Anthropic frames it as AI handling the 99 percent that is perspiration while humans keep the 1 percent that is inspiration. That sounds reassuring until you notice the 1 percent is shrinking too.

They lay out three futures. In the first, progress stalls at a plateau, the models freeze near where they are, and we get years to absorb the disruption. They think that one is least likely. In the second, the compounding continues with humans still setting the agenda, and a hundred people can do what ten thousand used to. In the third, recursive self-improvement is real, the pace is set by how much compute exists, and humans are left doing oversight and verification, if we can even keep pace with that.

I cannot tell you which of those futures we are headed for. To their credit, neither can they. The most honest line in the piece is that we may not be able to build and verify the tools we would need to know which trendline we are actually on. That is a strange and serious thing for the company building the thing to put in writing.

Here is where I land, and it is closer to a feeling than an argument. We have hit terminal velocity. The acceleration I have felt over the last two years did not level off, and the next jump is already visible in their own charts. The question stopped being whether the tools are good. They are good. The question is whether any of us, as individuals, as companies, as governments, can adjust at the speed the curve now demands.

It is getting harder to hang on, and I mean that in a practical way. A workflow I spend a month building stays useful for maybe a few weeks before something makes it obsolete. What I was proud of in 2024 is table stakes now. Most of the people telling you they have this figured out are performing.

The part of the piece I respect most is that it ends with a request instead of a roadmap. They talk about verifiable pauses, the idea that frontier labs in different countries could credibly show each other they had slowed down, the way arms control eventually let nations verify each other. Then they admit those regimes took decades to build and we do not have decades. So the ask is smaller and harder. Get more people into the deliberation now, while it is still a deliberation.

I think that is right. The outcome is genuinely unclear. It could be the breakthroughs in science and medicine everyone is hoping for. It could be a loss of control that compounds quietly until it is obvious. It is probably some combination nobody has named yet.

What I am fairly sure of is that the ride has already started. We are in for it either way. The only real choice left is how awake we are while it happens.

Generated with claude-opus-4-8 + GPT Image 2.0