Maturity path
Start where you are. Set the level that matches your codebase's needs and your team's comfort. Move up when your quality gates prove you're ready. Not every team needs Level 4, and that's fine.
Each level corresponds to a disposition gate profile: a configuration of which phase failures are fatal, optional, or human-reviewed. See Quality Gates for the gate taxonomy that makes this concrete.
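A gate profile of this kind can be pictured as a small lookup table. The sketch below is illustrative only: the `Disposition` enum, the `EXAMPLE_PROFILE` mapping, and the `on_phase_result` helper are hypothetical names, and the dispositions assigned to each phase are assumptions rather than values taken from Quality Gates.

```python
from enum import Enum


class Disposition(Enum):
    FATAL = "fatal"            # a failure stops the workflow
    OPTIONAL = "optional"      # a failure is recorded, the workflow continues
    HUMAN_REVIEWED = "human"   # a failure is handed to a person to decide


# Hypothetical profile: the phase names follow this page, but the
# dispositions assigned to them are illustrative assumptions only.
EXAMPLE_PROFILE = {
    "plan": Disposition.HUMAN_REVIEWED,
    "build": Disposition.FATAL,
    "test": Disposition.FATAL,
    "review": Disposition.HUMAN_REVIEWED,
    "document": Disposition.OPTIONAL,
}


def on_phase_result(profile: dict, phase: str, passed: bool) -> str:
    """Return the next action for a finished phase: continue, stop, or escalate."""
    if passed:
        return "continue"
    disposition = profile[phase]
    if disposition is Disposition.FATAL:
        return "stop"
    if disposition is Disposition.OPTIONAL:
        return "continue"
    return "escalate"
```

Moving up a level would then mean swapping in a profile with fewer human-reviewed entries, not changing the workflow itself.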
L1: Assisted
Duration: 1-2 weeks
How it works
You run each phase manually, review every output, and make all the decisions. Everyone starts here, on purpose. You're learning what your agentic layer can do and where it needs tuning.
What you're learning
- Whether your prompt templates produce good, actionable output
- Which constraints are missing from your templates (you'll discover these by what the agent gets wrong)
- How your quality gates perform in practice. Are they catching real failures?
- The rhythm of Plan → Build → Test and when to use self-healing loops
At Level 1, you are the Intent phase. You read the issue, decide what it really means, fill in the gaps, and paste a clear description into the Plan template. As you mature, the Intent phase can automate this refinement step.
A day at Level 1
You pick an issue from your backlog. You open the Plan template, fill in the variables, and paste it into your agent. The agent researches and produces a plan. You read it. Looks good, references real files. You paste the Build template with the plan artifact. The agent writes tests and code. You run the tests. Two fail. You copy the test output, paste it back to the agent with the Build template. It fixes the implementation. Tests pass. You review the diff, commit, and push. That's one workflow run. Tomorrow, you'll do another.
Graduation checklist
Check these off as you complete them. When all are checked, you're ready for Level 2.
L2: Supervised
Duration: 2-4 weeks
How it works
The agent runs phases on its own, but you approve at checkpoints: after Plan, after Test, and after Review. Self-healing feedback loops run without your intervention. If tests fail, the agent retries on its own. You only step in at designated approval points.
Checkpoint flow
The ⟲ symbol indicates a self-healing loop. The agent retries automatically; you only see the final result at the approval checkpoint.
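The self-healing loop can be sketched as a bounded retry in which each failure's output becomes the next attempt's input. This is a minimal sketch under stated assumptions: `run_with_self_healing`, the `attempt` callable, and the default budget of three loops are hypothetical, and `make_flaky_attempt` is a stand-in for a real agent call.

```python
from typing import Callable


def run_with_self_healing(
    attempt: Callable[[str], tuple[bool, str]],
    max_loops: int = 3,  # assumed retry budget, not a value from this page
) -> tuple[bool, int, str]:
    """Retry a phase until it passes or the loop budget is spent.

    Returns (passed, loops_used, last_output).
    """
    feedback = ""
    output = ""
    for loop in range(1, max_loops + 1):
        passed, output = attempt(feedback)
        if passed:
            return True, loop, output
        feedback = output  # the failure output feeds the next attempt
    return False, max_loops, output


def make_flaky_attempt(failures_before_pass: int):
    """Stand-in for an agent call: fails a fixed number of times, then passes."""
    calls = {"n": 0}

    def attempt(feedback: str) -> tuple[bool, str]:
        calls["n"] += 1
        if calls["n"] <= failures_before_pass:
            return False, f"test failure #{calls['n']}"
        return True, "all tests passed"

    return attempt


# Only this final result would reach the human at the approval checkpoint.
passed, loops, output = run_with_self_healing(make_flaky_attempt(2))
print(passed, loops, output)  # True 3 all tests passed
```

When the budget runs out without a pass, the function returns `False` with the last failure output, which is the natural point to escalate to a person.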
What you're learning
- Whether the feedback loops can self-correct without your help
- Whether quality gates are calibrated correctly (too many false positives? too lenient?)
- Whether your approvals are confirming what the gates already caught, or surfacing new issues. If the gates consistently catch what you would catch, you may be ready for L3.
Graduation checklist
When these are consistently true, you can drop the checkpoints and move to L3.
L3: Autonomous
Duration: 1-3 months
How it works
The agent runs the full workflow end-to-end. No checkpoints. You review only the final output: the pull request. Quality gates are your safety net. The agent plans, builds, tests, reviews, documents, and opens the pull request without waiting for approval at any step in between.
Most teams operate here. That's good engineering.
Level 3 is the right place for most engineering work. You get autonomous velocity while keeping human oversight on the final output. Don't rush to L4. Stay at L3 for anything that touches critical paths, new patterns, or areas where you don't trust your templates yet.
At Level 3, adding the optional Intent phase enables fully autonomous issue processing from intake to PR creation. Monitor-generated issues pass through Intent for refinement before reaching Plan.
What you're learning
- Whether the full workflow consistently produces PR-ready output
- Which types of work succeed at L3 and which still need supervision
- How to configure the Monitor phase to catch regressions early
- Your comfort with risk. What would need to be true for you to trust auto-merge?
Graduation checklist
These criteria are deliberately strict. L4 means zero human intervention, so you need high confidence.
L4: Autonomous Software Engineering
The destination, for the right work
How it works
Agent runs the workflow. Quality gates pass. Auto-merge. No human in the loop. You configure the agentic layer, and the workflow handles everything from issue to merged PR. This is Autonomous Software Engineering (ASE): humans out of the active development loop.
When L4 is appropriate
- Low-risk changes: dependency updates, simple bug fixes, well-understood patterns
- Repositories with excellent test coverage and monitoring
- Issue types with proven track records (e.g., patches that have succeeded at L3 for months)
- Services with fast rollback capabilities
When L4 is NOT appropriate
- Security-sensitive code (authentication, authorization, encryption)
- Breaking API changes or public interface modifications
- New architectural patterns the workflow hasn't seen before
- Regulatory or compliance-sensitive areas
- Any work where a bad merge could cause significant business impact
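The two lists above can be turned into a simple pre-merge check. The sketch below is one conservative reading, not a prescribed rule: `ChangeProfile` and `auto_merge_blockers` are hypothetical names, the flags are labels a team would have to assign itself, and treating every "appropriate" condition as required is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ChangeProfile:
    # Hypothetical flags a team would set per change or per issue type.
    low_risk: bool = False
    strong_tests_and_monitoring: bool = False
    proven_at_l3: bool = False
    fast_rollback: bool = False
    security_sensitive: bool = False
    breaking_api_change: bool = False
    new_architectural_pattern: bool = False
    compliance_sensitive: bool = False
    high_business_impact: bool = False


def auto_merge_blockers(change: ChangeProfile) -> list[str]:
    """Return the reasons a change should stay at L3; an empty list means no blockers."""
    blockers = []
    # Any "NOT appropriate" condition blocks auto-merge outright.
    if change.security_sensitive:
        blockers.append("security-sensitive code")
    if change.breaking_api_change:
        blockers.append("breaking API or public interface change")
    if change.new_architectural_pattern:
        blockers.append("new architectural pattern")
    if change.compliance_sensitive:
        blockers.append("regulatory or compliance-sensitive area")
    if change.high_business_impact:
        blockers.append("bad merge could cause significant business impact")
    # Assumption: all "appropriate" conditions must hold together.
    if not change.low_risk:
        blockers.append("change is not classified as low-risk")
    if not change.strong_tests_and_monitoring:
        blockers.append("test coverage or monitoring not strong enough")
    if not change.proven_at_l3:
        blockers.append("issue type has no proven L3 track record")
    if not change.fast_rollback:
        blockers.append("no fast rollback available")
    return blockers
```

Returning the list of reasons, not a bare boolean, keeps the decision auditable: a change routed back to L3 carries an explanation with it.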
The realistic picture: most teams operate at L2-L3 for critical paths and L4 for routine maintenance. That's not a failure. That's good engineering. L4 everywhere is a long way off. L4 for the right work, right now, is realistic and worth pursuing.
Measuring progress
Track these metrics to know where you are and when you're ready to graduate. Don't optimize for L4. Optimize for reliability at your current level.
| Metric | L1 | L2 | L3 | L4 |
|---|---|---|---|---|
| Workflow success rate | Track manually | >70% | >90% | >95% |
| Avg loop count | N/A | <3 | <2 | <1.5 |
| Escalation rate | 100% | <30% | <10% | <5% |
| Human intervention | Every phase | Checkpoints | PR review | None |
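The numeric rows of the table can be checked mechanically. The sketch below encodes the thresholds as written and reports the highest level whose thresholds a set of metrics satisfies. The names `THRESHOLDS` and `highest_level_met` are hypothetical, and requiring each lower level's thresholds to hold before a higher one counts is an assumption.

```python
# Thresholds copied from the table; L1 has no numeric thresholds.
THRESHOLDS = {
    "L2": {"min_success": 0.70, "max_loops": 3.0, "max_escalation": 0.30},
    "L3": {"min_success": 0.90, "max_loops": 2.0, "max_escalation": 0.10},
    "L4": {"min_success": 0.95, "max_loops": 1.5, "max_escalation": 0.05},
}


def highest_level_met(success_rate: float, avg_loops: float, escalation_rate: float) -> str:
    """Return the highest level whose thresholds are all met, checking levels in order."""
    level = "L1"
    for name in ("L2", "L3", "L4"):
        t = THRESHOLDS[name]
        meets = (
            success_rate > t["min_success"]
            and avg_loops < t["max_loops"]
            and escalation_rate < t["max_escalation"]
        )
        if not meets:
            break
        level = name
    return level


# 87% success, 1.8 loops per run, 12% escalation
print(highest_level_met(0.87, 1.8, 0.12))  # L2
```

The result is only one input: the graduation checklists still apply, and passing the numbers for a level does not by itself mean the team should move to it.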
What to track
| Metric | Example value | Basis |
|---|---|---|
| Success rate | 87% | last 30 runs |
| Avg loops | 1.8 | retries per run |
| Escalation | 12% | human needed |
| Time to PR | 14m | avg duration |
Example dashboard values for a team operating between L2 and L3. Your numbers will vary based on codebase complexity and template maturity.
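Figures like these can be derived from a plain log of workflow runs. The sketch below is a minimal example assuming a hypothetical `Run` record with one entry per workflow run. The field names and the `summarize` helper are illustrative, and the 30-run window mirrors the "last 30 runs" basis above.

```python
from dataclasses import dataclass


@dataclass
class Run:
    # Hypothetical per-run record; field names are assumptions.
    succeeded: bool
    loops: int            # self-healing retries used in this run
    escalated: bool       # True if a human had to step in
    minutes_to_pr: float


def summarize(runs: list[Run], window: int = 30) -> dict[str, float]:
    """Compute the four dashboard metrics over the most recent `window` runs."""
    recent = runs[-window:]
    if not recent:
        raise ValueError("no runs to summarize")
    n = len(recent)
    return {
        "success_rate": sum(r.succeeded for r in recent) / n,
        "avg_loops": sum(r.loops for r in recent) / n,
        "escalation_rate": sum(r.escalated for r in recent) / n,
        "avg_minutes_to_pr": sum(r.minutes_to_pr for r in recent) / n,
    }
```

Keeping the raw run log, not just the rolled-up numbers, makes it possible to slice the same metrics by issue type later, which helps when deciding which kinds of work are ready for a higher level.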