Framework Overview
What is the Agentic Engineering Framework?
Building software with AI isn't just about writing code; it's about everything around it: testing, reviewing, documenting, coordinating. AEF is a framework that lets developers build an agentic development workflow through an agentic layer (prompt templates, quality gates, and workflow config) that sits on top of their codebase, so AI agents can handle the full development lifecycle from plan to pull request. The workflow self-heals: failed tests and reviews automatically trigger fix-and-retry loops before escalating to a human. Developers build the tools AI needs to build the code.
Every team builds their own workflow. The architecture below is the blueprint. Browse the samples repository for prompt templates, alternative implementations of each layer, and patterns across different AI coding tools. Your implementation will look different: same patterns, your tooling, your conventions.
The Problem
Why we need to adapt traditional SDLC for agentic development
Agents are powerful. Adding structure helps them produce more consistent, repeatable results. Three common challenges tend to emerge as teams scale agent adoption.
Institutional knowledge doesn't transfer
Different developers naturally prompt agents differently. Without shared templates, output consistency varies and new team members cannot benefit from patterns the team has already discovered.
Consistency at scale
Without shared templates, the same type of task may get built differently each time. Code style can drift, test coverage may vary, and documentation quality becomes harder to maintain consistently. The agent is only as consistent as its instructions.
Duplicated effort across teams
Without a structured methodology, teams often build similar workflow patterns independently. A shared agentic layer lets teams invest that effort once and reuse it across projects.
The Shift
Developers as workflow architects
Traditional engineering practices — planning, TDD, code review, documentation, least privilege — work because experienced developers enforce them. AEF encodes those same practices into prompt templates, quality gates, and tool permissions so AI agents follow them too. The collaboration point shifts: instead of writing code directly, developers write the layer that governs how agents write code.
How the role shifts
| | Developer-driven | Agent-driven (AEF) |
|---|---|---|
| Who writes code | Developers | Agents, following the agentic layer |
| Standards enforcement | Team culture, reviews, mentoring | Prompt templates, quality gates |
| Institutional knowledge | Docs, wikis, undocumented expertise | Same knowledge, also encoded in the agentic layer |
| Review model | Human code review | Autonomous review + self-healing loops |
| Reuse across projects | Libraries, shared patterns | Reusable agentic layer |
The three-layer architecture
Agentic layer
HUMANS WORK HERE. Prompt templates, tool configurations, quality gates, workflow definitions. This is what your team builds and maintains.
Prompt templates
One per phase. Define role, context, constraints, output format. These are the instructions agents follow.
Quality gates
Pass/fail criteria per phase. Coverage thresholds, review severity rules, completeness checks.
Tool configurations
Which tools each phase can access. File system, code execution, external APIs. Follows the least-privilege principle.
Workflow config
Loop limits, escalation rules, phase ordering, parallel execution settings.
Workflow engine
FRAMEWORK RUNS HERE. Runs the 7-phase lifecycle. Manages state, artifact chaining, feedback loops, and escalation.
Phase orchestration
Manages the 7-phase lifecycle from Plan through Monitor, running each phase in sequence.
State management
Chains artifacts between phases via template variables like $plan_artifact and $test_results.
Self-healing loops
Runs feedback loops when Test or Review fails, retrying before escalating.
Metrics tracking
Success rates, iteration counts, cost per workflow run, drift detection across runs.
Your codebase
AGENTS WORK HERE. Source code, tests, documentation, infrastructure.
The codebase is the output. In this model, developers shape it through the agentic layer rather than editing it directly: agents read, write, test, and ship code using the patterns defined in the layer above. The quality of that output depends on the quality of your layer.
The Layer
What you build: the agentic layer
This is what you actually create. The minimum agentic layer is two prompt templates and one quality gate. Start there. Expand when you need to.
# Your agentic layer directory
agentic-layer/
├── prompts/ # One template per phase
│ ├── intent.md # Optional: requirement refinement
│ ├── plan.md
│ ├── build.md
│ ├── test.md
│ ├── review.md
│ ├── document.md
│ ├── deploy.md
│ └── monitor.md
├── gates/ # Quality criteria per phase
│ ├── test-coverage.yaml
│ └── review-severity.yaml
├── tools/ # Integration configurations
│ ├── file-system.yaml
│ └── ci-cd.yaml
├── commands/ # Reusable skills & commands
│ ├── commit.md
│ └── security-audit.md
└── workflow.yaml # Orchestration rules
Prompt templates
One markdown file per workflow phase. Each template defines a role, injects context via template variables, sets constraints, and specifies an output format. Versioned alongside your code. Diff-friendly. Reviewable in PRs.
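As a concrete sketch, a plan-phase template might look like the excerpt below. The section layout and the variable names $feature_request and $project_conventions are illustrative assumptions, not a schema the framework prescribes; only the role/context/constraints/output-format structure comes from the description above.

```markdown
<!-- prompts/plan.md (illustrative sketch; variable names are assumptions) -->
# Role
You are a software architect with read-only access to the codebase.

# Context
Feature request: $feature_request
Team conventions: $project_conventions

# Constraints
- Do not write implementation code.
- List every file to modify or create, with line targets where known.
- State explicitly what is out of scope.

# Output format
A markdown plan with sections: Files to modify, Files to create,
Acceptance criteria, NOT doing.
```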
Quality gates
YAML files defining pass/fail criteria per phase. Minimum test coverage, review severity thresholds, required documentation sections. Gates are the workflow's guardrails; if a gate fails, the feedback loop activates.
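A gate file can be a handful of thresholds. Here is one possible shape for gates/test-coverage.yaml; the key names are assumptions for illustration, since the framework does not mandate a schema.

```yaml
# gates/test-coverage.yaml (illustrative; key names are assumptions)
phase: test
criteria:
  min_line_coverage: 85     # fail the gate below 85% line coverage
  max_failed_tests: 0       # any failing test fails the gate
on_fail:
  action: retry             # activate the self-healing feedback loop
  max_retries: 3            # escalate to a human after 3 attempts
```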
Tool configurations
Define which tools each phase can access. Categories: file system, code execution, external APIs, context providers. Follows the least-privilege principle: planners get read-only, builders get full access.
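The least-privilege split might be expressed per phase like this sketch of tools/file-system.yaml (key names are assumptions):

```yaml
# tools/file-system.yaml (illustrative; key names are assumptions)
phases:
  plan:
    file_system: read-only    # planners research, they don't edit
  build:
    file_system: read-write   # builders get full access
    code_execution: enabled   # run tests during the build loop
  review:
    file_system: read-only
    code_execution: enabled   # re-run checks, but make no edits
```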
Commands and skills
Reusable prompt+tool bundles for common operations. Commands are human-invoked, skills are agent-invoked. Examples: commit, security-audit, dependency-update.
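A command bundles a prompt with a narrow tool grant. A hypothetical commands/commit.md might look like this (the structure and tool names are illustrative assumptions):

```markdown
<!-- commands/commit.md (illustrative sketch) -->
# Command: commit
Invocation: human-triggered

## Tools
- file-system: read-only
- git: stage and commit only (no push)

## Instructions
Summarize the staged diff in one imperative sentence under 72 characters,
add a body explaining why the change was made, and follow the team's
commit-message conventions.
```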
Workflow config
Orchestration rules: phase ordering, loop limits, escalation rules, parallel execution settings. Scales from a 3-phase minimum to a full 7-phase workflow with all loops.
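Putting those rules together, a workflow.yaml might resemble the following. The key names are assumptions; the loop limits echo the retry budget shown in the walkthrough later in this page.

```yaml
# workflow.yaml (illustrative; key names are assumptions)
phases: [plan, build, test, review, document, deploy]
loops:
  test_failure:
    retry_phase: build        # re-enter Build with the failure context
    max_retries: 3
  review_blocker:
    retry_phase: build
    max_retries: 2
escalation:
  on_retry_exhausted: notify_human   # hand off when loops are spent
```

A 3-phase minimum would trim the phases list to [plan, build, test] and drop the review loop.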
The Engine
What the framework runs: the workflow engine
The workflow engine runs up to 8 phases (7 core plus optional Intent), manages state between them via artifact chaining, self-heals when phases fail, and tracks metrics. For the full technical breakdown, see Workflow Architecture.
Intent (optional)
Clarify requirements, produce intent specification
Plan
Research codebase, produce implementation plan
Build
Write tests first, then implementation code
Test
Run test suite, trigger retry loop on failure
Review
Analyze code against quality gates
Document
Generate docs from actual diffs
Deploy
Create structured pull request
Monitor
Track production health and feed issues back
Execution
How agents execute against your codebase (illustrative example)
The following walkthrough shows how a fully configured workflow processes a feature request. The reference example implements this flow; your implementation will follow the same pattern with your tooling.
Intent refines the request (optional)
The raw request "Add GET /users endpoint with pagination and auth" enters the workflow. The Requirements Analyst agent researches the codebase and finds the existing route structure in src/routes.ts and the cursor-based pagination pattern in the /orders endpoint. It produces an intent specification: EARS-format user stories ("When a client sends GET /users with page and limit params, the system shall return the paginated user list"), testable acceptance criteria, and explicit scope boundaries (GET only, no CRUD operations).
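An intent specification for this request might look like the excerpt below. The layout is illustrative; only the first EARS story is quoted from the walkthrough, and the 401 story is an assumed example of the same pattern.

```markdown
# Intent specification: GET /users (illustrative excerpt)

## User stories (EARS)
- When a client sends GET /users with page and limit params,
  the system shall return the paginated user list.
- When a client sends GET /users without valid auth,
  the system shall reject the request. (assumed example)

## Acceptance criteria
- Pagination follows the cursor-based pattern used by /orders.
- Routes follow the existing structure in src/routes.ts.

## Out of scope
- User creation, update, or deletion (GET only).
```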
Issue arrives
The intent specification (or raw feature request, if Intent was skipped) enters the workflow. The workflow initializes a manifest and transitions to PLANNING.
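The manifest is the workflow's running state record. A sketch of what it might contain at this point; every field name here is an assumption for illustration.

```yaml
# Workflow manifest (illustrative; field names are assumptions)
run_id: run-0042
state: PLANNING
input: intent-spec.md      # or the raw request, if Intent was skipped
artifacts: {}              # filled in as phases complete
retries:
  test: 0
  review: 0
```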
Plan phase reads the codebase
The architect agent reads the codebase (read-only). It finds the existing route structure, controller patterns, and authentication middleware. It produces an implementation plan with file targets and acceptance criteria.
# Plan artifact (excerpt)
Files to modify: src/routes.ts:45, src/controllers/user.ts
Files to create: tests/user.test.ts
NOT doing: User creation, deletion, or update
Build phase writes code
The engineer agent follows the plan. Tests first: GET /users, pagination logic, auth checks. Then implementation until tests pass. No scope creep; the plan is the contract.
Test phase validates
Tests run. 14 pass, 1 fails on a pagination edge case. The workflow does not stop. It spawns a builder agent with the failure context, the fix is applied, and tests re-run. All 15 pass on the second attempt.
# Test results (retry 1)
Passed: 15 Failed: 0 Coverage: 87%
Retries used: 1 of 3
Review phase evaluates
The reviewer agent checks against quality gates. No blockers found. One tech-debt item logged for future pagination refactoring. Gate passes.
Document phase writes docs
Docs are generated from the actual diffs, not from scratch. API docs for the new endpoint, a changelog entry, and inline code comments.
Deploy phase ships a PR
A pull request is created with: descriptive title, summary of changes, test evidence (15/15 pass, 87% coverage), review findings, and doc updates. The team designed and built the layer. The workflow executed the full lifecycle from plan to PR.
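A PR body assembled from those artifacts might read like this sketch (wording is illustrative; the figures are the ones from the walkthrough above):

```markdown
## Add GET /users endpoint with pagination and auth

### Summary
Adds a paginated, authenticated GET /users route following the
existing cursor-based pattern from /orders.

### Test evidence
15/15 tests passing, 87% line coverage (1 self-healing retry used).

### Review findings
No blockers. One tech-debt item logged: pagination refactoring.

### Docs
API docs for the new endpoint, changelog entry, inline comments.
```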