The Agent Model Split: How Four Spec Tools Think About AI

Four tools, four takes on spec-driven development, four completely different ideas about how AI should relate to your specifications. OpenSpec, GitHub Spec Kit, AWS Kiro, and Augment Code all promise structured, verifiable development. But they disagree on something fundamental: the agent model.

That disagreement defines everything.

The Agent Model Split

Every SDD tool has to answer a core question: what role does the AI agent play? The answers diverge sharply.

OpenSpec: One agent, twelve specialized skills. A single AI agent switches between focused skill contexts. It explores like a thinking partner, specs like a product manager, implements like a developer, and verifies like a QA lead. Same agent, different hats, loaded via markdown skill files. The agent stays unified while the skills provide structure.

Spec Kit: One agent, sequential commands. A single AI agent follows slash commands through a fixed pipeline: /speckit.specify, then /speckit.plan, then /speckit.tasks, then /speckit.implement. The agent is a generalist that follows instructions. The commands provide the structure.
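The fixed ordering can be pictured as a tiny state machine. The sketch below is only an illustration of "one agent, sequential commands": the four command names come from the text above, while the Pipeline class and its ordering check are hypothetical and are not part of Spec Kit itself.

```python
# The four command names are taken from the article; everything else is illustrative.
STAGES = ("/speckit.specify", "/speckit.plan", "/speckit.tasks", "/speckit.implement")


class Pipeline:
    """Hypothetical tracker that only accepts commands in the fixed order."""

    def __init__(self) -> None:
        self.completed: list[str] = []

    def run(self, command: str) -> None:
        # The next allowed command is determined by how many stages are done.
        done = len(self.completed)
        expected = STAGES[done] if done < len(STAGES) else None
        if command != expected:
            raise RuntimeError(f"expected {expected}, got {command}")
        self.completed.append(command)


pipeline = Pipeline()
for stage in STAGES:
    pipeline.run(stage)
```

Calling run("/speckit.plan") on a fresh Pipeline raises RuntimeError, which is the "no decision fatigue" idea in miniature: there is exactly one valid next step.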

Kiro: One built-in IDE agent with hooks. The AI agent is embedded directly in the IDE (a VS Code fork). It generates requirements, design docs, and tasks from natural language. But the real differentiator is hooks: automated processes that fire on every file save. Change a component? The hook auto-updates documentation, validates against specs, checks style. The agent is always watching.

Augment Code: One agent with a Context Engine. The AI agent has a superpower: a semantic index of your entire codebase. 400,000 files across multiple repositories, all understood. Dependencies mapped. Patterns recognized. History tracked. The agent doesn’t need you to explain architecture. It already knows.

"The question isn't which tool has the best AI. It's which tool gives AI the best context for your specific problem."

Where Each Tool Shines

OpenSpec shines when you need lifecycle management. Its workflow runs from exploring an idea through specification, implementation, verification, and archival, which makes it the most complete lifecycle of the four tools. Living system specs that accumulate across changes are unique to OpenSpec. Delta sync (ADDED/MODIFIED/REMOVED sections merging into system specs) means your specification becomes a persistent, evolving asset. The three-dimension verification gate (completeness, correctness, coherence) provides real quality assurance before you close a change.
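To make the delta sync idea concrete, here is a minimal sketch of merging a change's ADDED/MODIFIED/REMOVED sections into a system spec. It assumes a spec can be modeled as a dictionary of requirement names to text. The apply_delta function, the requirement names, and the error rules are hypothetical and do not reflect OpenSpec's actual file format or implementation.

```python
def apply_delta(
    system_spec: dict[str, str],
    delta: dict[str, dict[str, str]],
) -> dict[str, str]:
    """Return a new system spec with the change's delta sections merged in."""
    merged = dict(system_spec)  # leave the original spec untouched

    for name, text in delta.get("ADDED", {}).items():
        if name in merged:
            raise ValueError(f"cannot add existing requirement: {name}")
        merged[name] = text

    for name, text in delta.get("MODIFIED", {}).items():
        if name not in merged:
            raise KeyError(f"cannot modify unknown requirement: {name}")
        merged[name] = text

    # REMOVED maps a requirement name to the reason it was dropped.
    for name in delta.get("REMOVED", {}):
        if name not in merged:
            raise KeyError(f"cannot remove unknown requirement: {name}")
        del merged[name]

    return merged


system_spec = {
    "hero-heading": "The hero section SHALL display the heading.",
    "legacy-banner": "The page SHALL display the promo banner.",
}
delta = {
    "ADDED": {"hero-cta": "The hero section SHALL display a call-to-action button."},
    "MODIFIED": {"hero-heading": "The hero section SHALL display the heading and tagline."},
    "REMOVED": {"legacy-banner": "Replaced by the hero section."},
}
merged = apply_delta(system_spec, delta)
```

The point of the sketch is that each change only describes what differs, and the system spec is what survives after the change is archived.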

Spec Kit shines when you need to start fast. Zero configuration: run the command specify init . and you’re up and running. The fixed pipeline means no decision fatigue. GitHub’s brand and 88 contributors mean broad compatibility and community support. The constitution.md concept, a project-wide “supreme law” that every AI agent reads first, is simple but powerful. If you’ve never done SDD, start here.

Kiro shines when you want continuous enforcement. Hooks change the game. Instead of validating specs at discrete checkpoints (before implementation, after implementation), Kiro validates on every file save. Change a React component and a hook instantly checks that it matches the spec, updates the docs, validates the style. Agent Steering rules can be auto-generated from your existing codebase, so the AI learns your conventions instead of imposing its own. The full IDE experience (it’s a VS Code fork) means no context switching.
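The save-triggered model can be sketched as a registry that maps file patterns to checks. This is an assumption-laden illustration, not Kiro's hook format or API: HookRegistry, the patterns, and the lambda checks are all hypothetical stand-ins for real validation, documentation, and style steps.

```python
from fnmatch import fnmatch
from typing import Callable

# A hook takes the saved file's path and returns a short result message.
Hook = Callable[[str], str]


class HookRegistry:
    """Hypothetical registry: file patterns mapped to checks that run on save."""

    def __init__(self) -> None:
        self._hooks: list[tuple[str, Hook]] = []

    def register(self, pattern: str, hook: Hook) -> None:
        self._hooks.append((pattern, hook))

    def on_save(self, path: str) -> list[str]:
        """Run every hook whose pattern matches the saved file, in registration order."""
        return [hook(path) for pattern, hook in self._hooks if fnmatch(path, pattern)]


registry = HookRegistry()
registry.register("*.tsx", lambda p: f"validated {p} against spec")
registry.register("*.tsx", lambda p: f"updated docs for {p}")
registry.register("*.py", lambda p: f"checked style of {p}")

results = registry.on_save("src/Hero.tsx")
```

Here only the two *.tsx hooks fire for the saved component. The contrast with checkpoint-based tools is that nothing has to remember to run the check.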

Augment Code shines when scale is the problem. If you have 100 engineers, 50 repositories, and millions of lines of code, the other tools hit a wall. Augment’s Context Engine semantically indexes everything. The agent understands cross-repository dependencies, historical patterns, and architectural decisions without you having to explain them. Specs become executable contracts: builds fail if the code diverges from the spec. It’s enterprise infrastructure, not a developer tool.

Spec Format and Precision

The tools also disagree on how detailed specs should be.

OpenSpec uses Gherkin-style behavioral scenarios. Each requirement has WHEN/THEN conditions that map directly to automated tests. “WHEN a visitor navigates to the homepage THEN the hero section SHALL display the heading.” This is precise, testable, and leaves no room for interpretation.
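A short sketch shows why the WHEN/THEN shape maps so directly onto a test. The Scenario type, parse_scenario, and render_homepage below are hypothetical helpers written for this illustration. They are not OpenSpec APIs, and the rendered page is a stand-in for a real application.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    when: str
    then: str


# Matches "WHEN <condition> THEN <expectation>" with an optional trailing period.
SCENARIO_RE = re.compile(r"^WHEN\s+(?P<when>.+?)\s+THEN\s+(?P<then>.+?)\.?$")


def parse_scenario(text: str) -> Scenario:
    """Split a single-line scenario into its condition and its expectation."""
    match = SCENARIO_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a WHEN/THEN scenario: {text!r}")
    return Scenario(when=match["when"], then=match["then"])


def render_homepage() -> dict[str, str]:
    # Hypothetical stand-in for the application under test.
    return {"hero_heading": "Build with specs"}


def test_hero_displays_heading() -> None:
    page = render_homepage()      # WHEN a visitor navigates to the homepage
    assert page["hero_heading"]   # THEN the hero section SHALL display the heading


scenario = parse_scenario(
    "WHEN a visitor navigates to the homepage "
    "THEN the hero section SHALL display the heading."
)
```

The WHEN clause becomes the test's setup and the THEN clause becomes its assertion, so there is little left to interpret.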

Spec Kit uses freeform requirements with acceptance criteria. “Hero section with tagline and description” paired with “Hero renders on page load with correct heading.” Human-readable, but requires interpretation to become a test.

Kiro auto-generates structured requirements from natural language prompts. You describe what you want, and the AI produces user stories and acceptance criteria. Fast, but sometimes vague where precision matters.

Augment embeds spec enforcement in the codebase itself. Specs aren’t separate documents. They live in the code, and the build system enforces them. This eliminates spec drift entirely but means specs are harder to read as standalone documents.
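As a rough picture of what "the build system enforces them" could mean, the sketch below checks function signatures against a declared contract and produces a non-zero exit code on any mismatch. This is an assumption for illustration only: CONTRACT, find_violations, and render_hero are hypothetical, and nothing here describes how Augment Code actually implements enforcement.

```python
import inspect
from typing import Callable


def render_hero(heading: str, tagline: str) -> str:
    """Example implementation that the contract is checked against."""
    return f"<h1>{heading}</h1><p>{tagline}</p>"


# Hypothetical contract: function name -> parameter names the spec requires.
CONTRACT = {
    "render_hero": ["heading", "tagline"],
    "render_footer": ["links"],
}


def find_violations(
    contract: dict[str, list[str]],
    implementation: dict[str, Callable],
) -> list[str]:
    """List every place where the implementation diverges from the contract."""
    violations = []
    for name, required in contract.items():
        func = implementation.get(name)
        if func is None:
            violations.append(f"{name}: required by spec but not implemented")
            continue
        actual = list(inspect.signature(func).parameters)
        if actual != required:
            violations.append(f"{name}: expected parameters {required}, found {actual}")
    return violations


violations = find_violations(CONTRACT, {"render_hero": render_hero})
exit_code = 1 if violations else 0  # a CI step would pass this to sys.exit()
```

In this run render_footer is missing, so the check reports one violation and the hypothetical build step would fail.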

Verification: The Weak Spot

Verification is where the tools diverge most in practice.

OpenSpec has the most structured verification. The /opsx:verify command produces a structured report across three dimensions (completeness, correctness, coherence) with issues scored as CRITICAL, WARNING, or SUGGESTION. It checks task completion, requirement coverage, scenario coverage, and design adherence.
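The shape of such a report can be modeled in a few lines. The dimension and severity names below come from the description above. The Issue type, the summarize helper, and especially the gate policy (any CRITICAL issue blocks closing the change) are assumptions made for this sketch, not documented OpenSpec behavior.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Dimension(Enum):
    COMPLETENESS = "completeness"
    CORRECTNESS = "correctness"
    COHERENCE = "coherence"


class Severity(Enum):
    CRITICAL = "CRITICAL"
    WARNING = "WARNING"
    SUGGESTION = "SUGGESTION"


@dataclass(frozen=True)
class Issue:
    dimension: Dimension
    severity: Severity
    message: str


def summarize(issues: list[Issue]) -> dict[tuple[str, str], int]:
    """Count issues per (dimension, severity) pair."""
    return dict(Counter((i.dimension.value, i.severity.value) for i in issues))


def gate_passes(issues: list[Issue]) -> bool:
    """Assumed policy: any CRITICAL issue blocks closing the change."""
    return not any(i.severity is Severity.CRITICAL for i in issues)


# Example findings, invented for illustration.
issues = [
    Issue(Dimension.COMPLETENESS, Severity.CRITICAL, "task 3 is unchecked"),
    Issue(Dimension.CORRECTNESS, Severity.WARNING, "scenario has no matching test"),
    Issue(Dimension.COHERENCE, Severity.SUGGESTION, "naming differs from design doc"),
]
```

With these example findings the gate fails because of the single CRITICAL issue, while the warning and suggestion alone would let the change through.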

Kiro’s hooks provide continuous but shallow verification. They catch style violations and spec mismatches in real time but don’t produce a comprehensive quality report.

Augment’s approach is the most enforceable. Build failures are hard to ignore. But it requires enterprise setup and CI/CD integration.

Spec Kit has the lightest verification. /speckit.analyze checks consistency across artifacts, but there’s no structured quality gate.

"The best spec in the world is worthless if nothing checks that the code actually matches it."

They’re Not Competing. They’re Layers.

Here’s the insight most comparisons miss: these tools operate at different layers of the development stack.

Spec Kit is the starter kit. Templates and commands to begin SDD.
OpenSpec is the workflow engine. The full spec lifecycle with verification and archiving.
Kiro is the IDE. A polished development environment with built-in spec support.
Augment is the infrastructure. Enterprise-scale context and enforcement.

The winning stack might combine them: Spec Kit’s simplicity to onboard, OpenSpec’s lifecycle for rigor, Kiro’s hooks for continuous enforcement, and Augment’s context engine for scale. The spec-driven development space is early enough that these layers will likely consolidate. For now, pick the layer that matches your most pressing problem: starting (Spec Kit), managing (OpenSpec), enforcing (Kiro), or scaling (Augment).