My experience in QA taught me that without fast feedback, developing complex software carries a high risk of error. When I worked as a JavaScript engineer, TDD was my basic tool: it allowed ongoing verification of the architecture and protected against regressions.
Today, my workflow is based on managing AI agents that generate code based on my instructions. In this model of work, TDD has become even more important than before.
For the sake of clarity: this isn't the "vibe coding" seen on social media. I still perform rigorous reviews, but my focus has shifted. Instead of analyzing code character by character, I concentrate on setting the right boundaries—how the system should work and what value it will bring to the business. I no longer waste time wondering if a function could be written slightly differently. Of course, when I know a specific fragment is critical, I check it very thoroughly. However, in other cases, I trust my agents—it's their job to take care of technical details, while I keep my finger on the pulse of the bigger picture.
Verifying AI-Generated Code
Collaboration with LLMs changes the nature of code review. Instead of analyzing every conditional statement, I focus on the architecture and data flow; implementation details are verified automatically.
In the world of machine-generated code, high test coverage guarantees that:
- The happy path fulfills business assumptions.
- Edge cases, which AI might skip in the token generation process, are handled.
- Subsequent iterations do not cause regression in existing modules.
Without tests, AI-generated code can look structurally correct while failing to implement the logic correctly in anything beyond the basic scenarios.
Tests as Constraint Definitions for the Agent
AI models require precise frameworks for action. By defining tests at the beginning of the process, you create a set of rules (guardrails) that the implementation must follow. Thanks to this, the agent does not have to interpret unclear requirements — it receives specific, binary criteria for correctness.
Such an approach also forces specific architectural decisions. For example, moving logic to the backend may be dictated solely by the ease of testing it and full control over the operation result. Tests thus become a tool for precise system design. Even if the agent generates too complex or incorrect logic, the tests will immediately show a discrepancy with the expected result and force corrections — and you won't even have to be involved in the process!
Implementation Plan Instead of Improvisation
As the models matured, my workflow evolved with them. Today I very rarely start by writing code. Instead, I first create an implementation plan whose core principle is TDD.
The plan is not created in one iteration. First, I prepare an initial version, and then I iterate on it with several agents, each of which has a different focus, such as:
- system architecture,
- implementation quality,
- problem domain specifics.
Only after several such iterations does the plan become truly polished.
Phase-Based Implementation
When the plan is ready, the actual implementation begins. Each phase is run in a separate chat to keep the context clean. And here TDD plays a key role: I use it so often that I have built a dedicated Claude Skill for it.
In phase 0, tests are defined or implemented. From that moment on, their modification is practically forbidden — unless something truly exceptional happens. The further process looks like this:
- The agent implements a given phase.
- It verifies the implementation with tests.
- Then the code goes to three reviewing agents — usually a Software Architect, a Senior Engineer, and a domain expert in the field I am working in.
Each of them is critical and evaluates the solution from a different perspective. Their comments are then verified, because false positives do occur. If a comment is justified, the agent introduces corrections and reruns the tests. This cycle usually repeats three times. At the end, the agent updates the implementation plan and marks the phase as completed, and I start the next phase in a new chat (all of this limits the amount of context I load onto the agent).
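The cycle above can be sketched as a small orchestration loop. Everything here is illustrative: the agent objects and their methods (`implement`, `runTests`, `review`, and so on) are placeholders I invented to show the control flow, not a real API.

```javascript
// Illustrative sketch of the phase review cycle; agent methods are placeholders.
function runPhase(phase, agents, maxCycles = 3) {
  agents.implementer.implement(phase);

  for (let cycle = 0; cycle < maxCycles; cycle++) {
    // The implementer first verifies its own work against the fixed tests.
    while (!agents.implementer.runTests(phase)) {
      agents.implementer.fix(phase);
    }

    // Three critical reviewers, each with a different perspective.
    const comments = [agents.architect, agents.senior, agents.domainExpert]
      .flatMap((reviewer) => reviewer.review(phase));

    // Comments are verified before acting on them: false positives occur.
    const justified = comments.filter((c) => c.justified);
    if (justified.length === 0) break;

    agents.implementer.applyCorrections(phase, justified);
  }

  agents.implementer.markPhaseDone(phase);
}
```

The key property is that the loop's exit condition is objective: passing tests plus an empty list of justified review comments, not anyone's gut feeling.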
Remember my brain dump? I keep all progress there. I also ask the agents to keep "progress notes" in the brain dump so I can check the high-level flow of each phase, which lets me revert some of the changes or recover context later.
30 Minutes of AI Work Instead of Hours of Debugging
From the outside, such a process may seem long: AI can work on a single phase for 20–30 minutes before returning the final result. In practice, however, the outcome is completely different from rapid code generation. And with good planning (e.g., using git worktrees, working on several projects simultaneously, or working in a different part of the codebase), I can focus on something else during that time.
Instead of a non-working implementation, hidden edge cases, or accidental design decisions, I get a solution that:
- passes tests,
- has been analyzed from several perspectives,
- is consistent with the previously established plan.
Tests play the role of an objective verification mechanism here. Not only for me — but also for the AI itself.
Why TDD is Even More Important Today
In a world where code is created faster and faster, the most valuable thing is not the implementation itself but the certainty that it works correctly. For me, TDD has always been a tool that accelerates software development. Working with AI, it has become something more: a mechanism of control over a system that can generate code faster than a human can read it.
I have always appreciated TDD, but today I see what a huge role it can play in modern software development.