Cursor taking over

Hey Reader,

A few years ago Cursor was a nicer place to type code. This week it announced Origin, its own Git competitor built for agent workloads, and got acquired by SpaceX for sixty billion dollars. Somewhere in between, it stopped being an editor and started becoming the whole stack — the place you write, review, merge, and increasingly test your software. I come from QA, so my first instinct isn’t excitement, it’s a question: when one tool owns every step of the loop, who’s the independent check on quality?

Everyone builds with Cursor. Almost nobody tests with it.

This is the gap that’s been bugging me, and it’s why I put together a workshop on AI-powered quality engineering with Cursor. Everyone reaches for it to generate features. Far fewer people point it at the thing that tells you whether those features actually work.

The catch is that an agent will happily write you 10,000 lines of test slop — tests that pass, look thorough, and verify almost nothing. So the goal isn’t “Cursor, generate my tests.” It’s the opposite: you steer it with smart decisions about what behavior actually matters, and the test becomes the way you encode that.
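To make "test slop" concrete, here is a minimal sketch in Python. The function `apply_discount` and its rule (the code SAVE10 takes 10% off) are invented for illustration and don't come from any real codebase. Both tests pass, but only the second one would notice if the behavior broke.

```python
# Hypothetical example: apply_discount and the SAVE10 rule are assumptions
# made up for this illustration.

def apply_discount(total: float, code: str) -> float:
    """Return the total after applying a discount code."""
    if code == "SAVE10":
        return round(total * 0.9, 2)
    return total


def test_slop():
    # Passes and looks like coverage, but verifies almost nothing:
    # any function that returns a float satisfies it.
    result = apply_discount(100.0, "SAVE10")
    assert result is not None
    assert isinstance(result, float)


def test_behavior():
    # Encodes the decisions that matter: a valid code takes 10% off,
    # and an unknown code leaves the total unchanged.
    assert apply_discount(100.0, "SAVE10") == 90.0
    assert apply_discount(100.0, "BOGUS") == 100.0
```

A version of `apply_discount` that ignored the code entirely would still pass `test_slop`. It would fail `test_behavior`, which is the point of steering the agent toward the second kind.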

Rizèl Scarlett pointed to a conversation with Angie Jones that says this better than I can: tests matter more in the age of agents, not less, because a test is how you teach an agent how your software is supposed to behave. Angie’s been one of the clearest voices in testing for years, and her full episode is worth the listen. The oldest idea in QA turns out to be the leash, and learning to hold it well is what I want to spend the workshop’s 2.5 hours on.

When the same tool writes and grades the homework

Origin is the part I keep turning over, because resolving merge conflicts with an agent means the same product now writes, hosts, and proposes to merge your code. Addy Osmani put numbers on why that should worry you in a sharp thread on agentic code review. Agents produce roughly 4x more output but only about 10% more real value, and the gap is review work nobody’s keeping up with. His phrase for it: “We made writing cheap, and understanding stayed exactly as expensive.” Code churn up 861%, defects climbing from 9% to 54%, zero-review merges up 31%.

I don’t agree with all of it, though. Addy suggests running multiple AI reviewers on the same PR, and I pushed back on that. More tools flag more issues, sure, but the cost is alert fatigue, and a wall of low-signal flags is just another way of not really reviewing. Reconstructing intent is worth doing — there are more focused ways to do it than piling on reviewers.

My colleague Nnenna makes the sharper structural point: code quality is a governance question before it’s a tooling one. Risk-profile your repos so a payment flow and a throwaway internal script don’t get the same scrutiny, push deterministic checks earlier in the workflow, and treat AI review as independent verification tied to explicit rules. That’s the “blast radius” idea made operational, and it’s the opposite of letting one tool decide everything by default.
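One way to picture that governance idea is a small policy table that maps a repo's risk tier to the checks it must pass before merging. This is only a sketch under assumptions: the repo names, tier names, and check names are all hypothetical, and a real setup would live in your CI or branch-protection configuration, not in a script like this.

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"
    HIGH = "high"


# Hypothetical policy: deterministic checks come first, and AI review and
# human review are added only where the blast radius justifies them.
REQUIRED_CHECKS = {
    Risk.LOW: ["lint", "unit_tests"],
    Risk.HIGH: ["lint", "unit_tests", "integration_tests", "ai_review", "human_review"],
}

# Hypothetical repo names, used only for illustration.
REPO_RISK = {
    "payments-service": Risk.HIGH,
    "internal-scripts": Risk.LOW,
}


def required_checks(repo: str) -> list[str]:
    # Unknown repos fall back to the strictest tier, not the loosest.
    risk = REPO_RISK.get(repo, Risk.HIGH)
    return REQUIRED_CHECKS[risk]


def can_merge(repo: str, passed: set[str]) -> bool:
    # A merge is allowed only when every required check has passed.
    return all(check in passed for check in required_checks(repo))
```

With this table, `can_merge("internal-scripts", {"lint", "unit_tests"})` is true, while the same two checks are not enough for `payments-service`. The rules are written down explicitly instead of being left to whatever one tool does by default.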

This is the half of the job I think gets too little attention, which is why we put together a free Code Review Academy at Qodo, where I work — for people who’d rather get good at reviewing than outsource it entirely.

None of the tools here are the villain. Cursor is good enough that owning the write-review-merge-test loop feels like convenience rather than a trap, and that’s exactly what makes it worth watching. When one product can do every step, quality stops being something the workflow hands you and becomes something you have to insist on — the test you actually steer, the risk map for what deserves real scrutiny, the review that isn’t graded by the thing that wrote it. None of that shows up by default. You put it there on purpose, or it isn’t there.

So draw that line before the tooling draws it for you. If your team already lives inside one tool, what’s the one quality check you’d refuse to give up — and would it survive the next convenient update? That’s the part I’m still working out, and I’d genuinely like to hear what you’re holding onto.

Filip Hric
