The Agentic Layer
Quality Gates
Pass/fail criteria that must be satisfied before a phase completes. Better gates mean higher autonomy. Gates are the reason you can trust the workflow.
What are quality gates
A quality gate is a set of pass/fail criteria evaluated after a workflow phase completes. If the gate passes, the workflow moves to the next phase. If it fails, the gate triggers an action: retry the phase, patch and re-run, or escalate to a human. Gates are why autonomous workflows can be trusted.
Quality gates are the single biggest factor in your autonomy level. Weak gates keep you at Levels 1-2 (a human reviews everything). Strong, thorough gates let you reach Levels 3-4 (autonomous / ASE). Invest in your gates.
The three concerns inside every gate
A quality gate looks like one thing, but it handles three distinct concerns. Understanding the layers matters because they change at different rates and for different reasons.
Classification gates
Classification gates operate on the structured output of a single phase. They answer: within this phase's results, what matters?
A review phase produces a list of issues. The classification gate says: blockers matter, tech-debt gets logged but does not block, skippable issues are noted without action. A test phase produces pass/fail results. The classification gate says: syntax errors are blockers, lint warnings are informational.
Classification gates are the innermost layer. They parse structured output and categorize findings by severity. They rarely change between autonomy levels; a blocker is a blocker whether a human is watching or not.
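As a minimal sketch (the issue shape and severity names here are illustrative assumptions, not a fixed schema), a classification gate reduces to bucketing a phase's structured output by severity:

```python
# Hypothetical sketch: bucket a review phase's findings by severity.
# The issue dicts and severity names are illustrative, not a fixed API.
from collections import defaultdict

def classify(issues):
    """Group structured findings; only blockers feed the healing loop."""
    buckets = defaultdict(list)
    for issue in issues:
        buckets[issue["severity"]].append(issue)
    return buckets

issues = [
    {"id": "R1", "severity": "blocker",   "msg": "null deref in handler"},
    {"id": "R2", "severity": "tech-debt", "msg": "duplicated helper"},
    {"id": "R3", "severity": "skippable", "msg": "naming nit"},
]
buckets = classify(issues)
# Blockers block; tech-debt is logged; skippable issues are noted, no action.
```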
Healing gates
Healing gates wrap around phase execution. They answer: when something fails, should I try to fix it, and have I made progress?
A test healing gate says: resolve the failures, re-run, check if the failure count decreased. If no progress was made, stop retrying, because the same fix won't work twice. A review healing gate says: create patches for blocker-severity issues, re-implement, re-review.
Healing gates have configurable retry limits and termination conditions. They change slowly across autonomy levels. You might increase retries at higher levels where the cost of escalation is greater.
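The retry-with-progress-check logic above can be sketched as follows; `run_phase`, returning a failure count, is a hypothetical stand-in for "fix and re-run the phase":

```python
# Hypothetical healing loop: retry only while progress is being made.
def heal(run_phase, max_retries=3, stop_on_no_progress=True):
    """Re-run a phase until it passes, retries run out, or progress stalls."""
    failures = run_phase()
    for _ in range(max_retries):
        if failures == 0:
            return True                 # gate passes
        new_failures = run_phase()      # e.g. patch, then re-run tests
        if stop_on_no_progress and new_failures >= failures:
            break                       # no progress: the same fix won't work twice
        failures = new_failures
    return failures == 0

# Simulated phase whose failure count shrinks 3 -> 1 -> 0 across runs.
counts = iter([3, 1, 0])
print(heal(lambda: next(counts)))  # True
```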
Disposition gates
Disposition gates are the outermost layer. They answer: after classification and healing are exhausted, does this phase's failure stop the workflow?
This is where autonomy levels live. At Level 3 (Autonomous), a test failure might be non-fatal (the workflow continues to review, and a human sees the failures in the PR). At Level 4 (ASE), the same test failure is fatal. The workflow aborts because there's no human to catch it downstream.
Disposition gates change dramatically between autonomy levels. They are the primary mechanism that differentiates one autonomy configuration from another.
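One way to picture this, as an illustrative assumption rather than a fixed scheme: the same gate failure maps to a different disposition at each autonomy level.

```python
# Illustrative only: a test-gate failure's disposition varies by autonomy level.
DISPOSITION = {
    3: {"test": "non-fatal"},  # Level 3: continue; a human sees failures in the PR
    4: {"test": "fatal"},      # Level 4 (ASE): abort; no human downstream
}

def disposes_fatally(level, phase):
    return DISPOSITION[level][phase] == "fatal"

print(disposes_fatally(3, "test"))  # False
print(disposes_fatally(4, "test"))  # True
```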
How the layers compose
Classification feeds healing: only blocker-severity issues trigger the healing loop. Healing feeds disposition: if healing exhausts its retries and the phase is marked required, the workflow aborts. If the phase is marked optional, the workflow logs a warning and continues.
The same classification and healing configuration can produce different workflow behavior by changing only the disposition layer. This is why autonomy levels are gate configurations, not workflow configurations. You're changing what authority gate decisions have, not what they evaluate.
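Putting the layers together, a sketch under the assumption that each layer is a plain function (the names `run_gated_phase`, `heal_fn`, and the `required` flag standing in for disposition are all hypothetical):

```python
# Hypothetical composition of classification, healing, and disposition.
def run_gated_phase(output, heal_fn, required, max_retries=3):
    # Classification: only blocker-severity issues trigger healing.
    blockers = [i for i in output["issues"] if i["severity"] == "blocker"]
    if not blockers:
        return "continue"
    # Healing: patch and re-check until fixed or retries exhaust.
    for _ in range(max_retries):
        blockers = heal_fn(blockers)
        if not blockers:
            return "continue"
    # Disposition: same evaluation, different authority.
    return "abort" if required else "warn-and-continue"

output = {"issues": [{"severity": "blocker"}]}
stubborn = lambda blockers: blockers            # healing never fixes anything
print(run_gated_phase(output, stubborn, required=True))   # abort
print(run_gated_phase(output, stubborn, required=False))  # warn-and-continue
```

Note that only the last line differs between the two calls: changing `required` changes workflow behavior without touching classification or healing.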
Anatomy of a quality gate
Trigger
Which phase this gate runs after, e.g., after-test or after-review.
Criteria
What conditions must be met: a list of key-value assertions against phase output.
Action on fail
What happens on failure: retry (loop), patch (fix and re-run), or escalate (human).
Severity
Whether failure blocks: blocker (must pass to proceed) or warning (log and continue).
gate: test-coverage
trigger: after-test
severity: blocker            # DISPOSITION: does failure stop the workflow?

# CLASSIFICATION: what counts as a failure?
criteria:
  all_tests_pass: true
  coverage_minimum: 80
  no_skipped_tests: true
  no_snapshot_regressions: true

# HEALING: what to do on failure?
on_fail: retry               # retry | patch | escalate | log
max_retries: 3
retry_strategy: rebuild
stop_on_no_progress: true    # terminate early if failure count unchanged

# DISPOSITION: what happens when healing exhausts?
escalation:
  target: human
  context:
    - test_results
    - build_report
    - fix_attempts
Gate examples by phase
# gates/test-coverage.yaml
gate: test-coverage
version: 1
trigger: after-test
severity: blocker
criteria:
  all_tests_pass: true
  coverage_minimum: 80            # percentage
  no_skipped_tests: true
  max_test_duration_ms: 300000    # 5 minutes
  no_snapshot_regressions: true
on_fail: retry
max_retries: 3
retry_strategy: rebuild           # triggers build agent to fix failures
escalation:
  after_retries_exhausted: human
  message: "Tests failed after {retries} attempts. Manual review needed."
  context:
    - test_results
    - build_report
    - failure_history
Writing custom gates
The criteria format
Criteria are key-value assertions. Keys reference fields in the phase's output artifact. Values are the expected state.
| Type | Example | Meaning |
|---|---|---|
| Boolean | all_tests_pass: true | Field must be exactly true |
| Numeric min | coverage_minimum: 80 | Field must be >= 80 |
| Numeric max | tech_debt_max: 5 | Field must be <= 5 |
| Exact | security_issues: 0 | Field must be exactly 0 |
| String match | verdict: "PASS" | Field must match string |
| Exists | has_changelog_entry: true | Field must exist and be truthy |
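A sketch of how a gate runner might evaluate these assertion types against a phase's output dict. The `_minimum`/`_max` suffix convention mapping onto an underlying field (e.g. `coverage_minimum` checking a `coverage` field) is an assumption inferred from the table, not a documented rule:

```python
# Hypothetical evaluator for the assertion types in the table above.
def check(key, expected, output):
    if key.endswith("_minimum"):                # numeric min: field >= expected
        return output.get(key.removesuffix("_minimum"), 0) >= expected
    if key.endswith("_max"):                    # numeric max: field <= expected
        return output.get(key.removesuffix("_max"), 0) <= expected
    return output.get(key) == expected          # boolean / exact / string match

output = {"all_tests_pass": True, "coverage": 85, "tech_debt": 2}
print(check("all_tests_pass", True, output))    # True
print(check("coverage_minimum", 80, output))    # True
print(check("tech_debt_max", 5, output))        # True
```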
Combining criteria (AND/OR)
# Default: all criteria are AND (all must pass)
criteria:
  all_tests_pass: true
  coverage_minimum: 80

# Explicit OR: use any_of
criteria:
  any_of:
    - coverage_minimum: 80
    - coverage_minimum: 60
      has_integration_tests: true   # 60% OK if integration tests exist

# Nested: AND groups within OR
criteria:
  any_of:
    - all_of:
        - coverage_minimum: 80
        - no_skipped_tests: true
    - all_of:
        - coverage_minimum: 90      # higher coverage forgives skipped tests
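The AND/OR semantics above can be sketched as a small recursive evaluator. This is a hedged illustration: the leaf check defaults to plain equality (a real runner would apply the per-type rules from the criteria table), and the function names are hypothetical:

```python
# Hypothetical recursive evaluator for all_of / any_of criteria trees.
def evaluate(criteria, output, leaf_check=lambda k, v, o: o.get(k) == v):
    if isinstance(criteria, list):              # a bare list is an implicit all_of
        return all(evaluate(c, output, leaf_check) for c in criteria)
    results = []
    for key, value in criteria.items():
        if key == "any_of":
            results.append(any(evaluate(c, output, leaf_check) for c in value))
        elif key == "all_of":
            results.append(all(evaluate(c, output, leaf_check) for c in value))
        else:
            results.append(leaf_check(key, value, output))
    return all(results)                         # top level defaults to AND

output = {"coverage_minimum": 60, "has_integration_tests": True}
tree = {"any_of": [{"coverage_minimum": 80},
                   {"coverage_minimum": 60, "has_integration_tests": True}]}
print(evaluate(tree, output))  # True
```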
Gate criteria reference the structured output produced by the phase. This is why the Output Format section in prompt templates matters: gates parse it.
Gate quality determines autonomy level
Classification and healing gates mature over time. You learn what severity thresholds work, what retry counts are effective, and what healing strategies succeed. This maturity is what earns the right to higher autonomy.
Disposition gates are what you configure when you change autonomy levels. Moving from Level 2 to Level 3 means changing test disposition from "require human approval on failure" to "abort or continue based on gate result." Moving from Level 3 to Level 4 means tightening dispositions: test failure becomes fatal (no human to catch it), documentation failure becomes non-fatal (shouldn't block auto-merge).
You don't advance to higher autonomy by trusting AI more. You advance by observing that your classification and healing gates are accurate, then granting disposition gates more authority over the workflow's control flow.
See Autonomous Software Engineering for the full maturity path.
Gate anti-patterns
The Rubber Stamp
A gate with criteria so loose it never fails. coverage_minimum: 0, tech_debt_max: 999. This gate adds workflow overhead without adding safety.
Fix: set criteria based on your team's actual quality bar. Start conservative, relax only with data.
The Perfectionist
A gate that rarely passes. coverage_minimum: 100, lint_warnings_max: 0. The workflow exhausts retries and escalates every run. Your team stops trusting the system.
Fix: use severity: warning for aspirational criteria. Reserve severity: blocker for true blockers.
The Glass Cannon
A gate with max_retries: 0 or no retry strategy. A single flaky test stops the entire workflow. No self-healing, no recovery.
Fix: always include retry logic. Even 1 retry catches most transient failures. Use max_retries: 3 as a default.