|
Hey Reader,

An interesting thought is popping up in conversations around AI agents: the environment around the thing matters more than the thing itself.

Last week I read about harness engineering and it felt very familiar, as if it was tapping into instincts I already had. If you've ever debugged a flaky test only to find the problem was in the setup, not the assertion, that instinct will feel natural. At the same time, it also feels like unfamiliar territory. It borrows the same principle, but the challenges are different: managing context windows, orchestrating tool access, building guardrails for non-deterministic outputs.

This week I'm exploring this new discipline, and why the mindset quality engineers already have might be a real head start.

The model is almost irrelevant

Rohit's article on harness engineering puts forward a compelling argument: the harness, meaning the complete designed environment around a language model (its tools, context management, guardrails, scaffolding), is what separates teams shipping real software from teams fighting their agent pipelines.

What Rohit wrote makes a lot of sense to me. I've spent years building test environments where the goal wasn't to make the tests smarter, but to make the environment around them so well-structured that even simple tests could produce reliable results. A good test harness handles setup, teardown, data isolation, and retry logic so that the actual test code can stay clean and focused.

What Rohit describes is the same principle applied to AI agents. The model is the test. The harness is everything else. And "everything else" is where the real engineering happens.

Your CLAUDE.md is probably being ignored (and how to fix it)

Speaking of environments that shape agent behavior, Dex Horthy wrote a great article about how Claude Code handles your CLAUDE.md file.
Claude wraps your CLAUDE.md contents in a … The fix is surprisingly elegant: wrap domain-specific sections in …

Speaking of things I didn't know, Lydia Hallie from Anthropic shared a cool tip: dynamic skills, where you embed executable commands directly in SKILL.md files that Claude runs at invocation time. It's the same thing as using "!" in Claude Code.

I also enjoyed these 50 Claude Code tips from Vishwas. One tip that's not in the list, but that I randomly found, is setting up the attribution for your commits, in case you ever feel like "Co-Authored-By Claude Sonnet 4.6" is making you look less cool.

The IDE might not be where the work happens anymore

It seems like one of the hot debates of last week was about the use of IDEs. It started with Andrej Karpathy's tweet and rippled through the internet.

Brace Sproul argues that your IDE is actively slowing you down. His case for agent-first workflows is that we should be opening our editors less, not more. He points to LangChain's Open SWE as an example of what happens when you let agents handle the mechanical parts of development while humans focus on direction and review.

A similar sentiment was shared by Theo, who went through the whole history of IDEs and where they are now. I recommend watching it, as he shares some great points on how coding has changed from single-workspace coding environments to multi-workspace, agent-first environments.

What's been keeping me busy

On a personal note, I was part of a panel discussion in Slovak. We discussed how many of the rules we've been teaching for years might no longer apply in testing.

I also did a webinar with Tricentis on effective AI workflows for quality engineers, which touches on many of the themes from this newsletter, specifically how to build the right environment for AI-assisted testing rather than just hoping the model figures it out.

As I said in the intro, designing the environment around a model matters more than the model itself.
Most of what I read this week seems to confirm that. It should also make sense to quality engineers out there: it is the same instinct that makes them obsess over test infrastructure.

"Harness engineering" isn't just test setup with a new name. It's a new discipline with its own challenges. The good news is that the mindset transfers. The instinct to treat the environment as a first-class engineering problem puts you ahead of most. The unfamiliar part is everything else, and that's where the interesting work is. I'd love to hear how you're approaching it. |