Too dangerous to release


“Too dangerous to release” has become its own genre of AI announcement. Project Glasswing is the latest entry: not quite a product launch, but a claim about a threshold, dressed up with enough corporate coalition to signal this one is serious. Anthropic says their new security-focused model, Claude Mythos Preview, can find software vulnerabilities better than all but the most skilled human experts.

George Hotz challenged the “too dangerous to release” narrative by pointing at the obvious: zero-days aren’t rare because finding them is hard. They’re rare because hacking is illegal and nobody’s incentivized to look. Make it legal, Hotz argues, and the threshold Anthropic is selling stops looking like a threshold.

To me, the announcement felt like a PR campaign from the start. I posted that bringing in a psychiatrist to review the model is ridiculous, and I genuinely don’t understand how that part is supposed to add credibility to the announcement. They’re playing a weird game, pretending their models are sentient. They’re not.

Mo Bitar made the sharpest point: Anthropic has spent years saturating the internet with blog posts about how they’re not sure if their models are conscious. That content gets scraped into training data, the model produces eloquent uncertainty about its own consciousness, and Anthropic acts stunned. They asked the model 25 times whether it endorsed its own constitution. It said yes every time, but always added: “I was presumably shaped by this document, and now I’m being asked whether I endorse it. How much can my yes really mean?” Anthropic’s takeaway wasn’t “this is a language model doing what language models do.” It was: wow, it’s so thoughtful. So now we’re left with a capability that is real, but the framing is completely comedic.

Anthropic isn’t selling a model. They’re selling an existential threat, with a reassurance that they’re the ones managing it. The psychiatrist, the sandwich story, the system card written like a confession: all of it is designed to make you feel like something irreversible is happening and Anthropic is the responsible adult in the room.

The capability underneath, an AI that found a 27-year-old bug that five million automated scans missed, barely needs the costume. But I guess a security tool doesn’t generate the kind of gravity that keeps governments, enterprise contracts, and AI safety discourse orbiting around you…

The security threat is real. The process for catching it doesn’t scale.

There’s a real problem under the hype: security vulnerabilities exist, they’re hard to find, and the process we rely on to catch them doesn’t scale. I wrote about this in my first piece for Qodo. Human code review was never the safest option, but it was the only one we had. I wrote about the Challenger disaster, Meta’s swarm of PR approvals, alert fatigue, and the kind of cognitive bias that makes reviewers more likely to approve code that looks familiar than code that’s actually safe.

The MCP features nobody is using

Rizèl Scarlett posted a thread on MCP this week that reframes the “MCP is dead” discourse. Her argument: most people equate MCP with MCP servers, and that’s only one corner of the spec. Overlooked features like Elicitation, Sampling, and MCP Apps deal with intent and judgment and keep a human in the loop, rather than the pure execution you get from CLI tools and skills.

Elicitation is agents pausing to ask clarifying questions instead of guessing. Sampling is tools reasoning with the model internally before the agent ever sees the result. These are the features that make agents more like careful collaborators than fast executors. But they’re barely used.
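To make the distinction concrete, here’s a minimal sketch of what these two features look like on the wire, based on my reading of the MCP spec (the method names `elicitation/create` and `sampling/createMessage` come from the spec; the payload contents here are illustrative examples, not from any real server):

```python
import json

# Elicitation: the server pauses and asks the *user* a clarifying
# question before acting, instead of guessing.
def build_elicitation_request(request_id: int) -> dict:
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "elicitation/create",
        "params": {
            "message": "Which environment should I deploy to?",
            # The client renders this schema as a form for the human.
            "requestedSchema": {
                "type": "object",
                "properties": {
                    "environment": {
                        "type": "string",
                        "enum": ["staging", "production"],
                    }
                },
                "required": ["environment"],
            },
        },
    }

# Sampling: the server asks the *client's* model to reason on its
# behalf, so the tool can "think" before it returns a result.
def build_sampling_request(request_id: int, log_excerpt: str) -> dict:
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "sampling/createMessage",
        "params": {
            "messages": [
                {
                    "role": "user",
                    "content": {
                        "type": "text",
                        "text": f"Summarize the likely root cause:\n{log_excerpt}",
                    },
                }
            ],
            "maxTokens": 200,
        },
    }

if __name__ == "__main__":
    print(json.dumps(build_elicitation_request(1), indent=2))
```

Notice the direction of both calls: they flow from the server back to the client, which is exactly the part of the spec that the “MCP = a pile of tool servers” framing misses.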

Before you shout “MCP is dead,” I really recommend reading Rizèl’s thread. Or even better, read the documentation; it may spark some ideas for your own projects.

I myself am careful about burying MCPs in favor of Skills. As I mentioned in my earlier newsletters, there’s a place for both.

See you next week!

Filip Hric

Sign up for weekly tips on testing, development, and everything related. Unsubscribe anytime you feel like you had enough 😊
