IRAF — Iterative Refinement Agentic Framework

A Git-native architecture for autonomous, self-improving AI systems.

Recursion You Can Trust

IRAF is a governance-first framework where AI agents plan, build, evaluate, and refine — all inside a single Git repository. Every decision is auditable. Every change is reversible.

Why Trust Matters in Agentic AI

Autonomous AI systems are powerful — but power without transparency is dangerous. IRAF solves this by making every iteration visible in Git history. There are no hidden states, no opaque reasoning chains. The entire evolution of a system lives in commits, diffs, and scores.

The framework enforces a closed refinement loop: an Architect decomposes goals, a Frontend Engineer implements, a Copywriter shapes narrative, and an independent Evaluator judges quality against a constitution (clauses.md). If the score falls below the threshold, the loop iterates — automatically.

This isn't just automation. It's human-AI collaboration with a built-in audit trail — designed so humans can verify, override, or extend any decision the agents make.

Governance Principles

Strict Containment — Agents only modify files within the repository. No external API calls, no side effects.

Scored Evaluation — Every iteration is judged against a constitutional document with weighted criteria.

Anti-Stagnation — If scores plateau, agents must refine their own blueprints before continuing.

Git-Native Logging — Every change is committed with a standardized, machine-readable message format.
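The page does not spell out the commit message format itself; as one hypothetical sketch (the field names agent, iter, and score are assumptions, not the framework's spec), a machine-readable message could be built like this:

```python
# Hypothetical sketch of a standardized, machine-readable IRAF commit
# message. The exact format is not defined on this page; the fields
# (agent, iteration, score) are illustrative assumptions.

def format_commit_message(agent: str, iteration: int, score: int, summary: str) -> str:
    """Build a parseable commit message for one refinement step."""
    return f"iraf({agent}): {summary} [iter={iteration} score={score}]"

print(format_commit_message("evaluator", 3, 92, "score index.html against clauses.md"))
# prints: iraf(evaluator): score index.html against clauses.md [iter=3 score=92]
```

A format like this keeps each refinement step greppable in `git log`, so both humans and tooling can reconstruct the loop's history.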

The IRAF Recursion Loop

A closed-loop system where agents autonomously plan, execute, evaluate, and refine — converging toward a score of 90 or above.

graph LR
  A["🎯 User Goal"] --> B["🏗️ Architect"]
  B --> C["⚙️ Frontend Engineer"]
  B --> D["✍️ Copywriter"]
  C --> E["📄 index.html"]
  D --> E
  E --> F["🧪 Evaluator"]
  F -->|"Score ≥ 90"| G["✅ Commit & Deploy"]
  F -->|"Score < 90"| B
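The loop in the diagram can be sketched in Python. This is a minimal illustration under stated assumptions, not the framework's actual API: the agent functions are placeholders, and the threshold and iteration cap come from the constitution described below.

```python
# Minimal sketch of the IRAF recursion loop shown in the diagram above.
# architect / engineer / copywriter / evaluate are stand-in callables,
# not the real agents.

PASSING_SCORE = 90    # minimum passing score per clauses.md
MAX_ITERATIONS = 5    # maximum iterations per bootstrap

def refine(goal, architect, engineer, copywriter, evaluate):
    blueprint = architect(goal)
    for iteration in range(1, MAX_ITERATIONS + 1):
        # Engineer and Copywriter both contribute to the single page.
        page = engineer(blueprint) + copywriter(blueprint)
        score = evaluate(page)
        if score >= PASSING_SCORE:
            return page, score, iteration   # commit & deploy
        blueprint = architect(goal)         # score too low: back to the Architect
    return page, score, MAX_ITERATIONS      # stop at the iteration cap
```

With stub agents whose evaluation improves each round, the loop terminates on the first iteration that clears the threshold.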

How the Evaluator Scores

Every iteration is judged by an independent Evaluator against the constitution in clauses.md. The scoring weights are public and immutable:

Technical Correctness (40%) — Does the code run? Does the Mermaid diagram render? Are CDN scripts loaded correctly?

Trust & Clarity (40%) — Is the human-AI collaboration message clear? Are transparency and auditability communicated?

Visual Polish (20%) — Gradients, layout, responsiveness, hero image usage, and overall design quality.

Minimum passing score: 90/100 · Maximum iterations per bootstrap: 5
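As an illustration of how the public weights combine into a single total (the Evaluator's actual arithmetic lives in clauses.md; the per-category scores below are invented):

```python
# Hypothetical sketch of the weighted scoring described above.
# The weights are the public ones (40/40/20); the category scores
# in the example are made up for illustration.

WEIGHTS = {
    "technical_correctness": 40,
    "trust_and_clarity": 40,
    "visual_polish": 20,
}

def weighted_score(category_scores: dict) -> float:
    """Combine per-category scores (0-100) into a single 0-100 total."""
    return sum(WEIGHTS[name] * score for name, score in category_scores.items()) / 100

total = weighted_score({
    "technical_correctness": 95,
    "trust_and_clarity": 90,
    "visual_polish": 85,
})
print(total)  # prints 91.0, which clears the 90-point threshold
```

Integer weights summing to 100 keep the arithmetic exact; a total at or above 90 ends the loop, anything lower triggers another iteration.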

Core Capabilities

Git-Native Architecture

Every iteration lives in version control. No databases, no hidden state — just commits, diffs, and branches.

Multi-Agent Delegation

Architect, Engineer, Copywriter, and Evaluator collaborate through structured roles and clear handoffs.

Constitutional Governance

A clauses.md constitution enforces safety rules, scoring criteria, and anti-stagnation policies.

Self-Improving Loop

Agents evaluate their own output and refine blueprints if scores stagnate — true closed-loop recursion.

Full Transparency

Every decision, score, and refinement is visible in Git history. Humans can audit, override, or extend at any point.

Zero-Cost Deployment

Single-file HTML with CDN dependencies. Push to master, and Netlify deploys automatically — no build step required.

Ready to See Recursion in Action?

Clone the repository, paste the bootstrap prompt into your AI IDE, and watch the agents build, evaluate, and refine — live in your Git history.

View on GitHub