Too dangerous to release


“Too dangerous to release” has become its own genre of AI announcement. Project Glasswing is the latest entry: not quite a product launch, but a claim about a threshold, dressed up with enough corporate coalition to signal this one is serious. Anthropic says their new security-focused model, Claude Mythos Preview, can find software vulnerabilities better than all but the most skilled human experts.

George Hotz challenged the “too dangerous to release” narrative by pointing at the obvious: zero-days aren’t rare because finding them is hard. They’re rare because hacking is illegal and nobody’s incentivized to look. Make it legal, Hotz argues, and the threshold Anthropic is selling stops looking like a threshold.

To me, the announcement felt like a PR campaign from the start. I posted that bringing in a psychiatrist to review the model is ridiculous, and I genuinely don’t understand how that part is supposed to add credibility to the announcement. They’re playing a weird game, pretending their models are sentient. They’re not.

Mo Bitar made the sharpest point: Anthropic has spent years saturating the internet with blog posts about how they’re not sure if their models are conscious. That content gets scraped into training data, the model produces eloquent uncertainty about its own consciousness, and Anthropic acts stunned. They asked the model 25 times whether it endorsed its own constitution. It said yes every time, but always added: “I was presumably shaped by this document, and now I’m being asked whether I endorse it. How much can my yes really mean?” Anthropic’s takeaway wasn’t “this is a language model doing what language models do.” It was: wow, it’s so thoughtful. So now we’re left with a capability that is real, but the framing is completely comedic.

Anthropic isn’t selling a model. They’re selling an existential threat, with a reassurance that they’re the ones managing it. The psychiatrist, the sandwich story, the system card written like a confession: all of it is designed to make you feel like something irreversible is happening and Anthropic is the responsible adult in the room.

The capability underneath, an AI that found a 27-year-old bug that five million automated scans missed, barely needs the costume. But I guess a security tool doesn’t generate the kind of gravity that keeps governments, enterprise contracts, and AI safety discourse orbiting around you…

The security threat is real. The process for catching it doesn’t scale.

There’s a real problem under the hype: security vulnerabilities exist, they’re hard to find, and the process we rely on to catch them doesn’t scale. I wrote about this in my first piece for Qodo. Human code review was never the safest option, but it was the only one we had. I wrote about the Challenger disaster, Meta’s swarm of PR approvals, alert fatigue, and the kind of cognitive bias that makes reviewers more likely to approve code that looks familiar than code that’s actually safe.

The MCP features nobody is using

Rizèl Scarlett posted a thread on MCP this week that reframes the “MCP is dead” discourse. Her argument: most people equate MCP with MCP servers, and that’s only one corner of the spec. Overlooked features like Elicitation, Sampling, and MCP Apps deal with intent and judgment and keep a human in the loop, rather than the pure execution you get from CLI tools and skills.

Elicitation is agents pausing to ask clarifying questions instead of guessing. Sampling is tools reasoning with the model internally before the agent ever sees the result. These are the features that make agents more like careful collaborators than fast executors. But they’re barely used.
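To make the distinction concrete, here’s a minimal sketch of what these two features look like on the wire, based on my reading of the MCP spec (the method names `elicitation/create` and `sampling/createMessage` come from the spec; the payload contents here are illustrative examples, not from any real server):

```python
import json

# Elicitation: the server pauses and asks the *user* a clarifying
# question before acting, instead of guessing.
def build_elicitation_request(request_id: int) -> dict:
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "elicitation/create",
        "params": {
            "message": "Which environment should I deploy to?",
            # The client renders this schema as a form for the human.
            "requestedSchema": {
                "type": "object",
                "properties": {
                    "environment": {
                        "type": "string",
                        "enum": ["staging", "production"],
                    }
                },
                "required": ["environment"],
            },
        },
    }

# Sampling: the server asks the *client's* model to reason on its
# behalf, so the tool can "think" before it returns a result.
def build_sampling_request(request_id: int, log_excerpt: str) -> dict:
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "sampling/createMessage",
        "params": {
            "messages": [
                {
                    "role": "user",
                    "content": {
                        "type": "text",
                        "text": f"Summarize the likely root cause:\n{log_excerpt}",
                    },
                }
            ],
            "maxTokens": 200,
        },
    }

if __name__ == "__main__":
    print(json.dumps(build_elicitation_request(1), indent=2))
```

Notice the direction of both calls: they flow from the server back to the client, which is exactly the part of the spec that the “MCP = a pile of tool servers” framing misses.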

Before you shout “MCP is dead,” I really recommend reading Rizèl’s thread. Or even better, read the documentation; it may spark some ideas for your own projects.

I myself am careful about burying MCPs in favor of Skills. As I mentioned in my earlier newsletters, there’s a place for both.

See you next week!

Filip Hric

Sign up for weekly tips on testing, development, and everything related. Unsubscribe anytime you feel like you had enough 😊
