The Five Levels of AI-Assisted Development

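Dan Shapiro (CEO, Glowforge) articulated this framework to describe the spectrum of AI integration in software development, from autocomplete suggestions to a “dark factory” in which no human writes or reviews code. It is widely used as a vocabulary for discussing how far along that spectrum a team is.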

The Five Levels

Level 0 — Spicy Autocomplete

AI suggests code lines while humans write the majority of the software.

Tools: GitHub Copilot used for inline completion (its primary use case) in any IDE.

The experience: Tab-complete suggestions. AI types the syntax you would otherwise type yourself; you’re still making every structural decision.

Level 1 — Discrete Task Delegation

AI handles discrete, well-scoped tasks while humans manage architecture, judgment, and integration.

Tools: Copilot, Cursor chat, Claude for small well-defined pieces.

The experience: “Write me a function that does X.” You review the output, integrate it, and move on. You’re still writing most of the code.

Level 2 — Multi-File Agent

AI handles multi-file changes across modules while humans review all code output.

Tools: Cursor Composer, Claude Code for contained features, Devin for isolated tasks.

The experience: “Add user authentication to this app.” The agent touches 8 files. You review every diff. Most teams claiming “AI native” are here.

Shapiro’s observation: “90% of developers who say they are AI native are operating at level two.”

Level 3 — Human Directs, AI Builds

The relationship shifts: humans direct AI and review features rather than reviewing code.

Tools: Claude Code, Devin, Cursor for longer agentic runs.

The experience: You write a feature description. The agent implements. You test whether the feature works as intended — you don’t read every line of code. This is where the psychological difficulty begins: trusting output you haven’t read.

Level 4 — Humans as Product Managers

Developers become product managers: they write specifications and evaluate test outcomes, and nothing else. They write no code and review very little of it.

Tools: Claude Code at depth, Devin, dark factory setups.

The experience: You write a spec. The agent implements, runs tests, iterates. You check whether the outcomes match the spec. If they do, it ships. You may never open the file the agent modified.

The bottleneck has completely shifted from implementation speed to specification quality.
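The spec, implement, test, iterate loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular tool works: `Report`, `build_until_green`, `stub_agent`, and `stub_tests` are hypothetical names, and the agent and test runner are stand-in callables rather than real APIs.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Report:
    """Outcome of one automated test run (hypothetical structure)."""
    passed: bool
    failures: List[str]


def build_until_green(
    spec: str,
    implement: Callable[[str, List[str]], str],
    run_tests: Callable[[str], Report],
    max_rounds: int = 5,
) -> bool:
    """Return True if some round passes the tests, otherwise False."""
    feedback: List[str] = []
    for _ in range(max_rounds):
        artifact = implement(spec, feedback)  # agent writes or revises the code
        report = run_tests(artifact)          # automated evaluation, no diff review
        if report.passed:
            return True                       # outcomes match the spec: ship
        feedback = report.failures            # failures drive the next round
    return False                              # out of rounds: hand back to a human


# Stand-ins for a real agent and test suite, for illustration only.
def stub_agent(spec: str, feedback: List[str]) -> str:
    return "v2" if feedback else "v1"


def stub_tests(artifact: str) -> Report:
    ok = artifact == "v2"
    return Report(passed=ok, failures=[] if ok else ["login fails"])


shipped = build_until_green("Users can log in", stub_agent, stub_tests)
```

The human appears only at the edges of this loop: writing `spec` and reading the final result. When the loop returns False, the likely fix is a better specification or better tests, not a hand-written patch.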

Level 5 — The Dark Factory

No human writes code and no human reviews it. The pipeline is fully autonomous.

The defining rule: “Code must not be written by humans. Code must not be even reviewed by humans.”

Tools: Attractor (StrongDM), Claude Code + external scenario testing methodology.

The experience: Engineers write markdown specification files and evaluate whether scenarios pass. They don’t see the code. The agent runs, the scenarios run, it either passes or it doesn’t.
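One way to picture the pass-or-fail gate is a release decision that depends only on scenario results and never on the code itself. The sketch below is an assumption-laden illustration, not StrongDM’s or Attractor’s actual mechanism: `Scenario` and `release_decision` are hypothetical names, and each check is a plain callable standing in for an external test against the built system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Scenario:
    """A named, externally observable check (hypothetical structure)."""
    name: str
    check: Callable[[], bool]  # exercises the built system from the outside


def release_decision(scenarios: List[Scenario]) -> Dict[str, object]:
    """Ship only if at least one scenario exists and every scenario passes."""
    results = {s.name: bool(s.check()) for s in scenarios}
    failed = [name for name, ok in results.items() if not ok]
    # An empty scenario list proves nothing, so it must not ship.
    return {"ship": bool(scenarios) and not failed, "failed": failed}


# Illustrative scenarios; real ones would drive the running system.
login_ok = Scenario("login succeeds", lambda: True)
token_bad = Scenario("expired token rejected", lambda: False)

decision = release_decision([login_ok, token_bad])
```

Here `decision["ship"]` is False and `decision["failed"]` names the failing scenario. Nothing in the gate inspects source files, which is the point: the scenarios are the only evidence anyone consults.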

Currently documented in production: StrongDM (3-person team, Rust/Go codebase, Attractor agent).

Why Most Teams Stop at Level 3

Shapiro identifies the psychological barrier: “Almost everybody tops out at level three because they struggle with the psychological difficulty of letting go.”

Level 3 feels safe: you’re reviewing features, so you’re still in the loop. Level 4 requires trusting test outcomes in place of code review. Level 5 requires trusting automated scenarios in place of your own evaluation.

Each step up requires:

Better specifications
Better test coverage
More trust in the pipeline
Organizational structures designed for machines, not humans

The Organizational Implication

At Level 5, every meeting, process, and organizational structure designed to coordinate humans writing code is pure overhead. StrongDM eliminated:

Daily standups
Sprint planning
Code reviews
Jira boards

They didn’t eliminate outcome tracking or the work of keeping specifications high quality. They eliminated the human coordination layer around implementation, because there is no longer any human implementation to coordinate.