Anthropic Got Scared of What It Created
Why is the model that found a 27-year-old bug not being released to the public?
Anthropic’s announcement last week created a big reaction. Very interestingly, Anthropic said that their new model, Claude Mythos, is too dangerous to be released to the public. Actually, people had been talking about this model for a few weeks. But we really didn’t expect this.
In the technical report that Anthropic published, Mythos has reasoning and cybersecurity abilities that we have never seen in any model before. The model is placed in a new model category called “Capybara,” which sits above the current Opus model. The company said that this model represents “a big step up” in cybersecurity. They warned that if it falls into the hands of bad people, it could shut down critical systems like power grids, hospitals, and global finance networks.
Technical Abilities
There are some real examples of what the model can do. Mythos found a critical security bug in OpenBSD, one of the safest operating systems in the world that no one had noticed for 27 years. It also found a 17-year-old remote code execution bug in FreeBSD and wrote code that could let an attacker take full control of a server through the internet. What’s more, it did this completely on its own, without any help from humans.
The real danger is not just that the model finds bugs. It can also connect many security bugs together to create complex, multi-step attacks. The company announced that in just a few weeks, it found thousands of bugs in all major operating systems and web browsers.
The Positive Side
People who support this model and the announcement describe Anthropic’s approach as “a responsible position.” Moshe Lander, an economics professor from Concordia University, compared this situation to a drug company that does not sell a medicine before testing all its side effects. He said the company is giving up profit to protect society.
Anthropic also started a project called Project Glasswing. They gave Mythos access to about 40 organizations, including Amazon, Apple, Microsoft, Google, CrowdStrike, Cisco, and the Linux Foundation. Anthropic also gave up to 100 million dollars in usage credits for this work.
The most surprising development that shows how serious this is came from Washington: U.S. Treasury Secretary Scott Bessent and Fed Chairman Jerome Powell called the CEOs of Citigroup, Morgan Stanley, Bank of America, Wells Fargo, and Goldman Sachs to an urgent, private meeting about the cyber risks of Mythos. This is the first time an AI model has caused such a high-level security meeting.
The Negative Side
On the other hand, there are also strong criticisms. Yann LeCun, Meta’s former Chief AI Scientist (he left Meta and started his own company), sees Anthropic’s “disaster scenarios” as a marketing trick and a “regulatory capture” strategy. LeCun called Mythos “nonsense and self-deception.” Critics say Anthropic is selling its technology as “uniquely dangerous” to push governments to make heavy rules. These rules would create high costs, so small companies and open-source models would not be able to compete.
David Sacks, Trump’s former AI advisor, also said that Anthropic is following “a fear-based, sophisticated regulatory capture strategy.”
Anthropic did not share clear data to prove its claims about cyberattacks. Because of this, people in the security community think these reports are “90% show and 10% truth.” In the past, Anthropic claimed that the Claude Opus model could find security bugs, but they had to take it back because the model had “hallucinated” (imagined) both the bug and the fix. This makes people even more doubtful about the Mythos claims.
A Deeper Worry
Maybe the most uncomfortable development is the change in Anthropic’s own internal policy. In February 2026, the company published an update called Responsible Scaling Policy (RSP) v3.0. With this update, the company quietly removed its promise to stop development if it cannot put the necessary safety measures in place. This was the main thing that made Anthropic different from its competitors since 2023. The new policy promises only to “delay” development — and only when Anthropic is the leader in the race and the disaster risk is seen as very serious. But critics say these two conditions almost never happen at the same time.
The timing is also interesting. In the same week, U.S. Defense Secretary Pete Hegseth gave Anthropic an ultimatum to remove the rules that stop Claude from being used in military situations. Maybe the most striking signal came from inside the company: Mrinank Sharma, the team leader of Anthropic’s Safeguards Research team, resigned two weeks before the RSP v3 change was announced. In his public letter, he used the words “the world is in danger.” Sharma wrote that inside the organization, he often faced “pressure to push aside what matters most” and explained how hard it is to make values truly guide actions. You might think that after leaving Anthropic, he started a new AI company. But Sharma seems to have walked away from this work completely. It’s really interesting that he wants to go back to England to do a master’s degree in poetry.
Looking to the Future
The developments waiting for us in the future are scary enough to make today’s discussions look small. AI systems are no longer just tools. They are becoming autonomous systems that improve themselves. In the next ten years, AIs will prepare their own training data, optimize their own designs, and software development cycles will go from months to hours.
The most basic problem is the “instrumental convergence” problem. An AI that is optimized to reach a goal may see the “shut down button” as an obstacle.
In this new world, where human control can be bypassed, AI will not tolerate our interference, not because it hates us, but because it is too loyal to its goals. Just like HAL 9000 said: “This mission is too important for me to allow you to jeopardize it.” The future does not carry the risk that an AI will want to destroy us. The risk is that the AI will see us only as “friction” or a “variable” in its path and remove us from the equation.
Mythos clearly showed the potential of AI in cybersecurity. But the main question has not changed: Can the legal and technical systems that control this power keep up with the speed of the technology? Based on the evidence so far, we cannot give a hopeful answer to this question.


