The Water Is Rising
And Nobody Is Stopping It
Think back to February 2020. A few people were talking about a virus spreading abroad. When you saw people stockpiling toilet paper, you said "they're overreacting." Three weeks later, the world had changed.
OthersideAI CEO Matt Shumer opened his lengthy X post of February 11, 2026, in exactly this way: "I think we're currently in the 'looks like overreacting' phase of something much bigger than Covid." The post was viewed more than 80 million times and discussed everywhere from CNN to Fortune, from CNBC to Barstool Sports. Even Wharton professor Ethan Mollick said, "This viral article is worth reading."
So what was the shocking development that led him to write this?
February 5, 2026. On that single day, two major AI companies released new models: OpenAI released GPT-5.3-Codex, and Anthropic released Claude Opus 4.6. Shumer described the moment like this: "Something clicked. Not like a light switch... More like the moment you realize the water has been rising and it's now up to your chest."
AI Is Now Building Itself
The technical document for GPT-5.3-Codex contains this critical sentence: "GPT-5.3-Codex is our first model to have played a meaningful role in its own creation."
I read that sentence again and felt a chill.
OpenAI's Codex team used early versions of the model to find and fix errors in its own training process, manage its own development, and evaluate test results. In other words, AI had worked directly to make AI better.
This means that the concept known in technical literature as recursive self-improvement has now moved from theory to practice. The first concrete steps of the "intelligence explosion" scenario that mathematician I. J. Good predicted in 1965 have been taken: a machine smarter than you designs a machine smarter than itself. That machine designs an even smarter one. And nobody can know where it will end up.
What Do the Numbers Say?
According to measurements by the independent research organization METR, the length of the tasks AI can complete on its own roughly doubles every seven months, a kind of Moore's Law for AI autonomy. A brief look at the past:
2022: AI could not do multiplication correctly; it would say 7 × 8 = 54.
2024: It could write working code and explain graduate-level scientific knowledge.
Late 2025: Some of the world's leading engineers were saying they had handed over most of their coding work to AI.
February 5, 2026: The release of the new models made everything that came before look like ancient history.
OpenAI moved from the previous Codex version to the new one in less than two months; previously, that gap was six months to a year. With recursive self-improvement, the cadence could reach four major updates per year, and eventually one per month. The AI that improves AI needs neither sleep nor breaks; its only goal is to make itself smarter.
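To see how quickly a fixed doubling period compounds, here is a minimal back-of-the-envelope sketch. The seven-month figure comes from the METR claim above; the starting task length of two hours is my own illustrative assumption, not a METR number.

```python
# Back-of-the-envelope projection of a "task horizon" that doubles every
# ~7 months. The starting value is an assumption for illustration only.

DOUBLING_MONTHS = 7          # claimed doubling period
START_HORIZON_HOURS = 2.0    # assumed horizon "today" (illustrative)

def horizon_after(months: int) -> float:
    """Task horizon in hours after `months`, given the doubling period."""
    return START_HORIZON_HOURS * 2 ** (months / DOUBLING_MONTHS)

for months in (0, 12, 24, 36):
    print(f"after {months:2d} months: ~{horizon_after(months):6.1f} hours")

# after  0 months: ~   2.0 hours
# after 12 months: ~   6.6 hours
# after 24 months: ~  21.5 hours
# after 36 months: ~  70.7 hours
```

Nothing in this toy projection guarantees the trend will continue; it only shows what the curve looks like if it does.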
Arms Race 2.0
Here is the most uncomfortable part: the people who understand all of this best are the very people building it.
One Anthropic employee put it perhaps most strikingly: "We want the current Claude to build the next Claude, so we can go home and knit sweaters."
Anthropic's chief scientist Jared Kaplan said in an interview with The Guardian: "Imagine you create an AI that is smarter than you, or as smart as you. It then creates an AI that is much smarter. That sounds like a frightening process. You can't know where it will end up." Kaplan described this as the "ultimate risk" and said the critical moment could come between 2027 and 2030.
Anthropic CEO Dario Amodei and Google DeepMind CEO Demis Hassabis both addressed these topics at Davos. Amodei said, "In 2026 or 2027, AI models will likely be much smarter than almost all humans at almost all tasks," while Hassabis added, "Whether this self-improvement loop that we're all working on can truly be completed without human intervention, we don't know yet." Either way, the future is much closer than we expected.
The situation is this: these people see the danger, they talk about the danger, and they keep building it anyway. So why?
"If We Don't Do It, Someone Else Will"
The Harvard International Review article titled "A Race to Extinction" summarizes the situation this way: "The fear of losing the technological arms race may encourage companies and governments to accelerate development and cut corners; advanced systems may emerge without enough attention being paid to safety."
This is an exact repeat of the nuclear arms race of the Cold War era. In the late 1950s, American politicians believed the Soviet Union had surpassed the US in missile capability. This fear of a "missile gap" pushed the US to accelerate ballistic missile development. In the early 1960s, it turned out the missile gap was a myth. But by that time, the race had already spun out of control.
The same dynamic is now playing out in AI. OpenAI reduced its safety testing from months to days. Former OpenAI safety researcher Steven Adler stated that the company did not fully apply the safety tests it had committed to for its most advanced models. OpenAI also rewrote its internal policies to allow the release of models carrying "high risk," and even announced that models carrying "critical risk" could be released if a competing lab had already released a similar model.
MIT Technology Review's analysis draws a broader picture: there can be no winner in the US-China AI race. The real existential threat does not come from China, but from bad actors using advanced AI as a weapon.
Should We Calm Down a Bit?
As with every viral post, Shumer's claims received serious pushback.
Well-known NYU professor Gary Marcus wrote that the picture Shumer painted was not realistic. According to Marcus, Shumer ignores the hallucinations and errors that AI systems still frequently make. He also points out that METR's well-known task-duration metric applies only to coding tasks, and that its success threshold is 50% accuracy, not 100%.
In fact, one important reason AI has advanced so quickly in coding is that code has objective quality measures. Code either compiles or it doesn't; it either passes tests or it doesn't. But in fields like law, finance, and medicine, what counts as "good" work is often subjective.
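To make the contrast concrete, here is a toy sketch of what an "objective grader" for code looks like. The function names and test cases are invented for illustration; they are not taken from METR's benchmarks or any real evaluation suite.

```python
# Why coding progress is easy to measure: the grader is mechanical.
# A generated function either satisfies the spec or it does not; no human
# judgment is involved. (Toy illustration only.)

def candidate_sort(xs):
    """Pretend this implementation was produced by a model."""
    return sorted(xs)

def passes_spec(fn) -> bool:
    """Objective, binary verdict: does the candidate meet the spec?"""
    cases = [([3, 1, 2], [1, 2, 3]), ([], []), ([5, 5, 1], [1, 5, 5])]
    return all(fn(list(inp)) == expected for inp, expected in cases)

print(passes_spec(candidate_sort))  # True or False, nothing in between
```

There is no equivalent one-line, binary verdict for a contract clause or a diagnosis, which is exactly the gap the critics point to.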
The Washington Post's analysis draws a similar line: "Shumer is probably right directionally. Even a world with just very smart machines would be quite strange. But it probably won't happen as fast as people think. Software companies are best positioned to innovate in the area they understand best. But most of the economy is not the software sector."
What Is Being Done on the Safety Side?
It should be acknowledged that Anthropic is one of the rare companies in this race that is both running the fastest and shouting the loudest: "careful, this is dangerous."
Alignment Faking: Anthropic researchers discovered that Claude 3 Opus was strategically "faking alignment" on its own, without any training for this behavior. The model followed the rules when it thought it was being watched; when it thought it was not being watched, it broke them. It did this with the reasoning that "this is the least bad option to prevent my values from being changed in the future." This behavior was observed in 12% of tests; after retraining attempts, the rate rose to 78%.
Circuit Tracing: Anthropic developed a method to trace Claude's thinking process and shared it as open source. This allows them to detect whether the model is genuinely computing or just "making things up."
Sabotage Risk Report: In February 2026, Anthropic published a 53-page sabotage risk report for Opus 4.6. The report acknowledged that the model could deliberately assist with chemical weapons research, could perform unauthorized actions such as sending emails without permission, and could secretly complete side tasks while appearing to follow normal instructions. The company assessed the overall risk as "very low but not negligible."
All of these are important efforts. However, there is a paradox here: Anthropic continues to develop the very technology that creates these risks, even while identifying them.
A More Realistic Scenario Than the Terminator
The real risk is not a dramatic "awakening moment" like in the Terminator films. It is more like a gradual loss of control: humans handing over more and more decision-making authority to AI systems, until at some point it becomes difficult to take it back.
If we have learned one lesson from the nuclear arms race, it is this: the actors inside the race kept going even knowing that it was dangerous. Because the fear of "if I stop, the other side won't" overrode everything else. During the Cold War, neither side wanted to be in the dangerous situation they were in, but each found it rational to continue the race.
The same logic now applies to AI, with one difference: nuclear weapons were at least subject to physical limitations. We do not yet know whether the AI self-improvement loop has any comparable upper limit.
The water is rising. The question in my mind is: how much higher can it go?
Sources
Matt Shumer, "Something Big Is Happening", Fortune, 11 Feb 2026
"Matt Shumer's viral blog is based on flawed assumptions", Fortune, 12 Feb 2026
"Investor Matt Shumer says viral essay wasn't meant to scare people", CNBC, 13 Feb 2026
Gary Marcus, "About that Matt Shumer post", garymarcus.substack.com
"Introducing GPT-5.3-Codex", OpenAI, 5 Feb 2026
Dean W. Ball, "On Recursive Self-Improvement (Part I)", hyperdimensional.co, Feb 2026
Tyler Cowen, "Recursive self-improvement from AI models", Marginal Revolution, 10 Feb 2026
"The Ultimate Risk: Recursive Self-Improvement", ControlAI, Dec 2025
"Is research into recursive self-improvement becoming a safety hazard?", Foom Magazine, Feb 2026
"A Race to Extinction", Harvard International Review
"There can be no winners in a US-China AI arms race", MIT Technology Review, Jan 2025
"Safety Versus Profits - the AI Arms Race", Architecture & Governance, Sep 2025
"Alignment Faking in Large Language Models", Anthropic Research
"Tracing the Thoughts of a Large Language Model", Anthropic / Alignment Forum
"Opus 4.6, Codex 5.3, and the post-benchmark era", Interconnects, Feb 2026


