I wrote about agentic coding a few days ago and said the key skill shift is learning to delegate well. That article was about the “what.” This one is about the “how,” because the community has split into two very different camps on how you should actually work with AI coding agents.
Camp one says: just prompt iteratively. Describe what you want, let the agent build it, review the output, correct as needed. Keep it conversational. Some people call this vibe coding, though I think that term has gotten stretched beyond usefulness at this point.
Camp two says: write detailed specifications first. Break the work into structured documents. Define requirements, constraints, architecture, and acceptance criteria before the agent writes a single line of code. Then hand the spec to the agent and let it execute.
That second approach has a name now. It is called spec-driven development, and it is everywhere. GitHub launched an open source toolkit called Spec Kit that has over 72,000 stars. AWS built an entire IDE called Kiro around the concept. Martin Fowler wrote about it on his blog. Thoughtworks published a deep analysis. It was trending on Hacker News three separate times in the past month, with one post hitting 332 points just yesterday.
The question nobody seems to agree on: is this actually better, or is it Waterfall methodology wearing a trendy new hat?
I spent the last week trying it on a real project. Here is what I found.
What Spec-Driven Development Actually Is
Let me be precise about the definition, because “spec-driven development” is already getting used loosely.
In traditional software development, you think about what to build, then you build it. The code is the source of truth. Documentation, if it exists at all, is a secondary artifact that usually falls out of date within weeks.
Spec-driven development inverts this. The specification is the source of truth. Code is a generated artifact that implements the spec. If the code and the spec disagree, you fix the code, not the spec.
In practice, the workflow looks like this:
1. Specify: You write a clear description of what you are building and why. User stories, acceptance criteria, edge cases, the full picture. The AI agent can help you generate and refine this.
2. Plan: You define technical constraints. What stack to use, what patterns to follow, what architectural decisions matter. The agent produces an implementation plan based on the spec and constraints.
3. Tasks: The plan gets broken into concrete, testable work items. Each task has clear inputs, expected outputs, and validation criteria.
4. Implement: The coding agent works through the tasks systematically, using the spec and plan as context for every decision it makes.
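To make the Specify phase concrete, here is a sketch of what a spec file might contain. The structure is illustrative, not an exact template from any particular tool:

```markdown
# Spec: <feature name>

## Why
One paragraph on the problem and who it is for.

## User stories
- As a <role>, I want <capability> so that <outcome>.

## Acceptance criteria
- Given <state>, when <action>, then <observable result>.

## Edge cases
- What happens on empty input, concurrent updates, or a failed dependency?

## Constraints (feeds the Plan phase)
- Stack, existing patterns to follow, architectural decisions that must hold.
```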
The core insight is straightforward: language models are excellent at pattern completion, but they are terrible at mind reading. The more context and structure you give them, the better the output. A detailed spec is just very good context.
The Waterfall Comparison
The first thing most experienced developers think when they hear this is: “Wait. Writing exhaustive requirements documents before coding? Did we not try this already? Did it not fail spectacularly?”
That reaction is completely fair. The Waterfall model dominated software development for decades and was abandoned because it makes a fundamentally flawed assumption: that you can fully specify a system before building it. In practice, requirements change. Users do not know what they want until they see something working. The feedback loop between specification and implementation is where real understanding happens.
Spec-driven development gets compared to Waterfall constantly. The Hacker News thread literally had “The Waterfall Strikes Back” in the title. And the comparison is not entirely wrong.
But here is where it diverges in a meaningful way.
In Waterfall, the feedback loop between spec and implementation was months long. You wrote a 200-page requirements document, handed it to a team of developers, and waited three to six months for a deliverable. By the time you saw working software, the requirements were already wrong.
In spec-driven development with AI agents, the feedback loop is minutes. You write a spec. The agent generates the implementation in five to fifteen minutes. You review it. If the spec was wrong or incomplete, you update the spec and regenerate. The entire cycle that took months in Waterfall takes an afternoon.
That is not a small difference. That is a category difference. The reason Waterfall failed was not that specifications are bad. It was that the cost of discovering your specification was wrong was catastrophically high. When the cost of regenerating code from an updated spec drops to nearly zero, the economics change completely.
What I Actually Tried
I have a side project that needed a notification system. Users should get in-app notifications for certain events, with read/unread state, a notification preferences page, and email fallback for critical alerts. Not trivial, but not wildly complex either. A solid real-world test.
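To fix ideas, the core of a system like this can be sketched in a few lines. Every name and type below is hypothetical, invented for illustration rather than taken from the actual project:

```python
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    NORMAL = "normal"
    CRITICAL = "critical"


@dataclass
class Notification:
    user_id: int
    event: str          # hypothetical event keys, e.g. "comment_reply"
    severity: Severity
    read: bool = False  # read/unread state shown in the in-app feed


@dataclass
class Preferences:
    # Per-event email opt-ins chosen on the preferences page.
    email_opt_in: dict = field(default_factory=dict)


def delivery_channels(n: Notification, prefs: Preferences) -> list[str]:
    """Every notification goes in-app; email is the fallback for
    critical alerts or events the user explicitly opted into."""
    channels = ["in_app"]
    if n.severity is Severity.CRITICAL or prefs.email_opt_in.get(n.event, False):
        channels.append("email")
    return channels
```

Even a toy model like this surfaces the decisions a spec has to pin down: what counts as critical, and whether opt-ins are per-event or global.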
I tried building it two ways.
Approach 1: Iterative Prompting
This is how I normally work with Claude Code. I described the notification system in natural language, gave it context about the existing codebase, and let it work. When it made wrong assumptions, I corrected them. When it went in a direction I did not like, I told it to try something different.
The result was functional after about 45 minutes of back-and-forth. The code was decent. A few things needed manual cleanup. The agent made some assumptions about my database schema that were wrong, and I had to course-correct twice on the notification preferences UI.
Overall: fine. Shipped something real. The process felt natural.
Approach 2: Spec-Driven
I used a simplified version of the Spec Kit workflow. I wrote a specification document first: what notifications exist, when they trigger, what the data model looks like, how preferences work, the email fallback logic. I included acceptance criteria for each feature. I specified the technical constraints: use the existing database ORM, follow the existing API pattern, match the current UI component library.
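For flavor, a section of a spec for a feature like this might look something like the following. The details are illustrative, not the actual document:

```markdown
## Feature: Mark notification as read

### Behavior
- Clicking a notification marks it read and updates the unread badge count.
- A "mark all as read" action updates every unread notification in one request.

### Acceptance criteria
- Given 3 unread notifications, when the user clicks one, the badge shows 2.
- Given 0 unread notifications, the badge is hidden.

### Constraints
- Use the existing ORM models; no raw SQL.
- Endpoints follow the existing API routing pattern.
```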
Then I handed the entire spec to Claude Code and told it to implement it.
The initial implementation took about 12 minutes. And it was noticeably better on the first pass. The database schema matched what I wanted. The API endpoints followed the existing patterns without me having to correct course. The notification preferences UI used the right components.
There was still cleanup needed. The email template rendering had a bug. One edge case around batch notification marking was not handled. But the amount of back-and-forth was dramatically less.
The catch? Writing the spec took me about 30 minutes. So the total time was roughly the same.
Where Spec-Driven Development Wins
After trying both approaches and reading a lot of community discussion, here is where I think SDD genuinely earns its keep:
Team projects with multiple contributors. When you are working solo, iterative prompting works fine because the context lives in your head. When three developers are using AI agents on the same codebase, having a shared spec that defines what “done” looks like is valuable. It is the same reason engineering teams write design documents, just formatted for AI consumption.
Complex features with many moving parts. The notification system was moderately complex. For something with ten interconnected components, a spec that defines how they all relate to each other prevents the agent from making inconsistent decisions across different parts of the system. The spec becomes a consistency anchor.
Legacy codebases and brownfield projects. If you are adding features to an existing system with established patterns and conventions, a spec that explicitly states “follow the pattern in X file” and “use the existing Y service” dramatically reduces the agent’s tendency to invent its own approach. This was the biggest win I noticed. The spec-driven approach produced code that looked like it belonged in the existing codebase. The iterative approach produced code that worked but felt slightly foreign.
Handoffs and async work. If you are writing specs during the day and running agents overnight (something I mentioned in the agentic coding article), having a well-defined spec makes the overnight run much more reliable. The agent needs everything up front, because it cannot stop mid-run to ask clarifying questions.
Where It Falls Apart
I also want to be honest about the downsides, because the hype around SDD is getting a bit intense.
Spec maintenance is real overhead. Once you have a spec and an implementation, you now have two things that need to stay in sync. When requirements change (and they always change), updating the spec before updating the code adds friction. Some tools try to keep specs and code synchronized automatically, but that is still an imperfect process.
It can feel like busywork for small tasks. Writing a detailed specification for a bug fix or a small feature is overkill. Not every task needs a formal spec. The overhead-to-value ratio only makes sense for features above a certain complexity threshold.
The agent does not always follow the spec anyway. This was the most frustrating part of my experiment. Even with a detailed spec, the agent occasionally ignored constraints or made its own decisions about things I had explicitly specified. The spec improved the hit rate, but it did not guarantee compliance. You still need to review everything. The review step does not go away.
You can over-specify. There is a real risk of spending so much time perfecting the specification that you lose time overall. I caught myself doing this. I was writing acceptance criteria for edge cases that, realistically, I would have just handled on the fly with iterative prompting. The spec became a procrastination tool disguised as preparation.
It does not replace domain knowledge. Some advocates talk about SDD as if it enables non-developers to build software by writing specs. That is misleading. You still need to know what a good database schema looks like. You still need to understand your API patterns. You still need to recognize when the agent’s output has subtle bugs. The spec is a communication tool, not a substitute for expertise.
The Tools Landscape
The tooling around spec-driven development is moving fast. Here is what is out there right now:
GitHub Spec Kit is the most popular open source option. It provides templates and workflows for the specify, plan, tasks, and implement cycle. It supports over 22 AI agent platforms including Claude Code, GitHub Copilot, Amazon Q, and Gemini CLI. At 72,000+ stars, it has serious community momentum.
AWS Kiro is an IDE built around spec-driven workflows. It encodes the full spec lifecycle with steps like Constitution, Specify, Plan, Tasks, Implement, and PR. If you are already in the AWS ecosystem, this is worth looking at.
Tessl takes it further with a framework and registry approach. Think of it as npm for specifications. You can publish and share spec templates across teams.
CLAUDE.md and AGENTS.md files are the low-tech version of SDD that a lot of developers are already using without calling it spec-driven development. Writing detailed project context, conventions, and constraints in a markdown file that your agent reads on every session is, structurally, the same idea. Just less formalized.
If you are already using CLAUDE.md files effectively, you are already doing a lightweight form of spec-driven development. The formal tools add structure and workflow on top of what is fundamentally the same principle: give the agent better context, get better code.
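A lightweight context file along these lines is enough to start. The stack and conventions shown are invented examples, not a required format:

```markdown
# Project context for the agent

## Stack
- Next.js + TypeScript, Prisma ORM, Postgres.

## Conventions
- API routes live in app/api/<resource>/route.ts and return typed JSON.
- UI uses the shared components in components/ui; do not hand-roll buttons or modals.

## Hard constraints
- Never edit migration files by hand; generate migrations through the ORM.
- Every new feature needs at least one integration test.
```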
Context Engineering: The Bigger Picture
Spec-driven development is actually a subset of a broader discipline that is gaining traction: context engineering.
Context engineering is the practice of curating the entire information environment an AI agent operates within. That includes the files it reads, the rules it follows, the history it carries, the tools it can reach, and the structure of the project it navigates. Martin Fowler wrote about it in the context of coding agents, and by mid-2025 it was being called the discipline that determines whether AI coding agents ship reliable code or generate expensive technical debt.
The reason this matters is that specs are just one kind of context. Other forms of context that affect agent output quality include:
- Project conventions (how you name files, how you structure imports, what patterns you use)
- Architectural decisions (why you chose this database, why the API is structured this way)
- Team knowledge (what you tried before that did not work, what constraints come from business requirements)
- Codebase memory (what the agent learned from previous sessions)
Spec-driven development formalizes the “requirements” portion of context. But the best results come from getting all forms of context right, not just the spec.
This is why I think the SDD conversation is useful but incomplete. Obsessing over perfect specifications while ignoring project conventions, architectural context, and codebase memory is like writing a detailed recipe but forgetting to stock the kitchen. The spec matters, but it is not the only thing that matters.
My Honest Take: It Depends (But Here Is When)
After trying it and reading extensively about it, here is my practical recommendation:
Use spec-driven development when:
- The feature is complex enough that you would write a design document anyway
- Multiple people or agents will be working on the same feature
- You are working in a large existing codebase with established patterns
- You are delegating work to run asynchronously (overnight agent runs, for example)
- You want a reviewable artifact that captures the “why” behind implementation decisions
Skip the spec and iterate when:
- The task is small, well-understood, and you know what you want
- You are prototyping or exploring and do not yet know the right approach
- You are working solo on a project you understand deeply
- The cost of a wrong first attempt is low (you can regenerate cheaply)
Never do:
- Write specs for the sake of following a methodology. If the spec is not genuinely useful context for the agent, it is wasted effort.
- Treat the spec as a guarantee. Review the output as rigorously as you would without a spec.
- Use SDD as a substitute for understanding your own codebase. If you cannot evaluate whether the agent’s output is correct, no spec will save you.
The Real Question Nobody Is Asking
Here is what I think the SDD debate is actually about, underneath the methodology arguments.
We are in a transition period where developers are figuring out how to work with AI coding agents. The tools are powerful but they require a new kind of interaction pattern. Some developers lean toward more structure (specs, plans, formal workflows). Others lean toward less structure (iterative prompting, conversational development, vibe coding).
Neither is universally right. The optimal approach depends on the task, the team, and the codebase. The developers who will be most effective are those who can fluidly move between structured and unstructured approaches based on what the situation calls for.
That is not a satisfying answer for people who want a methodology to follow. But it is the honest one.
Spec-driven development is a useful tool in the toolbox. It is not the only tool. And calling it Waterfall 2.0 misses the point as badly as calling it the future of all software development.
The future is probably messier and more pragmatic than either camp wants to admit. And I am fine with that.
Getting Started If You Want to Try It
If you want to experiment with spec-driven development, here is a low-friction way to start without adopting a full framework:
1. Pick a feature you are about to build. Something with at least three or four moving parts.
2. Before prompting your AI agent, write a markdown document that covers: what you are building, why, what the user experience should be, what technical constraints exist, and what “done” looks like.
3. Hand that document to your agent along with the implementation request. Compare the output quality to what you normally get with iterative prompting.
4. If the output is meaningfully better, gradually formalize the process. If it is not, your iterative approach is probably working well enough for your use case.
You do not need Spec Kit or Kiro or any specific tool to try this. A markdown file and your existing agent are enough. The value is in the thinking you do while writing the spec, not in the tooling around it.
Start small. See if it helps. Adjust from there. That is how every useful methodology actually gets adopted, regardless of what the thought leaders say.