SDLC Matrix — Hallucination-Proof Software Development

A complete 14-phase × 12-role framework for AI-powered software development, with forensic verification, multi-agent consensus, and evidence chain tracking.

The Prime Directive

Hallucinations must be impossible.

A PRD based on hallucinated data is worse than no PRD. An architecture document referencing non-existent APIs is worse than no architecture document. Every claim made by every agent must be verifiable against ground truth.

The 14 SDLC Phases

| # | Phase | Lead Role | Work Products | Validation Gate |
|---|-------|-----------|---------------|-----------------|
| 1 | Discovery & Research | Product Strategist | Discovery report, competitive analysis | Forensic (external claims) |
| 2 | Requirements & PRD | Product Manager | PRD, user stories, acceptance criteria | Forensic + Human approval |
| 3 | Architecture & Design | Software Architect | ADRs, diagrams, API contracts | Forensic + Human approval |
| 4 | Sprint Planning | Tech Lead | Sprint backlog, estimates, dependencies | Auto (confidence) |
| 5 | Implementation | Senior Developer | Code, unit tests, commits | Build verification |
| 6 | Code Review | Reviewer ×3 | Review findings, approval/rejection | Consensus (2-of-3) |
| 7 | Testing | QA Engineer | Test plan, results, coverage | Build + Forensic |
| 8 | CI/CD | DevOps/SRE | Pipeline config, quality gates | Auto (pass/fail) |
| 9 | Staging & QA | QA Engineer + Product Manager | QA sign-off, UAT results | Human approval |
| 10 | Deployment | DevOps/SRE | Runbook, rollback plan, release notes | Build + Human approval |
| 11 | Monitoring | DevOps/SRE | Alerts, dashboards, SLOs | Auto (baseline) |
| 12 | Incident Response | DevOps/SRE | Timeline, RCA, post-mortem | Forensic + Human approval |
| 13 | Documentation | Technical Writer | API docs, guides, changelogs | Forensic (code refs) |
| 14 | Maintenance | Tech Lead | Debt inventory, dep audit, patches | Forensic (versions) |
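
One way to make the gate column machine-checkable is to encode it as data. The following is a minimal sketch, not the framework's actual schema: the `Gate` enum, the `PHASE_GATES` mapping, and `phases_requiring` are hypothetical names, and the mapping is simply transcribed from the Validation Gate column above with the parenthetical qualifiers dropped.

```python
from enum import Flag, auto


class Gate(Flag):
    """Gate kinds named in the Validation Gate column (hypothetical encoding)."""
    FORENSIC = auto()
    HUMAN = auto()
    AUTO = auto()
    BUILD = auto()
    CONSENSUS = auto()


# Phase number -> gates that must pass before the phase is considered done.
PHASE_GATES = {
    1: Gate.FORENSIC,
    2: Gate.FORENSIC | Gate.HUMAN,
    3: Gate.FORENSIC | Gate.HUMAN,
    4: Gate.AUTO,
    5: Gate.BUILD,
    6: Gate.CONSENSUS,
    7: Gate.BUILD | Gate.FORENSIC,
    8: Gate.AUTO,
    9: Gate.HUMAN,
    10: Gate.BUILD | Gate.HUMAN,
    11: Gate.AUTO,
    12: Gate.FORENSIC | Gate.HUMAN,
    13: Gate.FORENSIC,
    14: Gate.FORENSIC,
}


def phases_requiring(gate):
    """Return the phase numbers whose gate set includes the given gate."""
    return [n for n, gates in sorted(PHASE_GATES.items()) if gate in gates]
```

With this encoding, `phases_requiring(Gate.HUMAN)` lists the phases where a person has to sign off, which is a quick way to see where the pipeline can stall waiting for approval.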

The 12 Virtual Roles

| Role | Model Strategy | Tool Access | Confidence |
|------|---------------|-------------|------------|
| Product Strategist | Quality-first (Opus) | web_search, web_fetch, file_read | 0.8 |
| Product Manager | Quality-first (Opus) | file_read, grep, glob, tasks | 0.8 |
| Software Architect | Quality-first (Opus) | file_read, grep, glob, git (read-only) | 0.85 |
| Tech Lead | Combined | file_read, grep, glob, tasks | 0.7 |
| Senior Developer | Quality-first (Sonnet) | ALL tools | 0.7 |
| Developer | Cost-first (Haiku) | file ops, shell (no git_commit) | 0.6 |
| Security Engineer | Quality-first (Opus) | file_read, grep, shell, web_search | 0.9 |
| QA Engineer | Quality-first (Sonnet) | file_read, grep, glob, shell | 0.8 |
| DevOps/SRE | Combined | ALL tools | 0.8 |
| Technical Writer | Quality-first | file_read, grep, glob | 0.85 |
| Data Analyst | Combined | file_read, grep, web_search, shell | 0.85 |
| Forensic Verifier | Quality-first (Opus) | file_read, grep, glob (READ-ONLY) | 0.95 |
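
The tool and confidence columns can be read as a small access-control table. The sketch below shows three of the twelve roles and rests on two assumptions that the table does not spell out: that the Confidence value is a minimum score an output must reach to pass automatically, and that "ALL tools" can be modeled as a wildcard. The `Role` class and its method names are hypothetical.

```python
from dataclasses import dataclass

ALL_TOOLS = "*"  # assumed wildcard standing in for "ALL tools"


@dataclass(frozen=True)
class Role:
    name: str
    tools: frozenset
    min_confidence: float  # assumption: minimum confidence for auto-acceptance

    def may_use(self, tool):
        """True if the role's allow-list covers the tool."""
        return ALL_TOOLS in self.tools or tool in self.tools

    def accepts(self, confidence):
        """True if a reported confidence meets the role's threshold."""
        return confidence >= self.min_confidence


# Three rows transcribed from the table above.
ROLES = {
    "product_strategist": Role(
        "Product Strategist",
        frozenset({"web_search", "web_fetch", "file_read"}),
        0.8,
    ),
    "senior_developer": Role("Senior Developer", frozenset({ALL_TOOLS}), 0.7),
    "forensic_verifier": Role(
        "Forensic Verifier", frozenset({"file_read", "grep", "glob"}), 0.95
    ),
}
```

Keeping the Forensic Verifier's allow-list free of any write-capable tool is what makes its read-only status enforceable rather than a convention.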

Anti-Hallucination Framework (5 Layers)

Layer 1: Ground Truth Anchoring (Zero Cost)

  • Read before write enforced
  • Git diff verification after every edit
  • Path validation on all file references
  • Import validation on all module references
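
Two of the Layer 1 rules, read-before-write and path validation, can be sketched as a thin wrapper around file access. This is an illustration under stated assumptions rather than the framework's implementation: `GroundTruthGuard`, `ReadBeforeWriteError`, and `missing_references` are hypothetical names, and git diff verification and import validation are left out.

```python
from pathlib import Path


class ReadBeforeWriteError(RuntimeError):
    """Raised when an existing file is written without having been read first."""


class GroundTruthGuard:
    def __init__(self, root):
        self.root = Path(root).resolve()
        self._read = set()  # resolved paths this session has already read

    def _resolve(self, rel_path):
        path = (self.root / rel_path).resolve()
        # Path validation: refuse references that escape the workspace root.
        if not path.is_relative_to(self.root):  # requires Python 3.9+
            raise ValueError(f"{rel_path} is outside {self.root}")
        return path

    def read(self, rel_path):
        path = self._resolve(rel_path)
        text = path.read_text(encoding="utf-8")
        self._read.add(path)
        return text

    def write(self, rel_path, text):
        path = self._resolve(rel_path)
        # Existing files must have been read before they can be overwritten.
        if path.exists() and path not in self._read:
            raise ReadBeforeWriteError(f"{rel_path} was not read before writing")
        path.write_text(text, encoding="utf-8")
        self._read.add(path)

    def missing_references(self, rel_paths):
        """Return the referenced paths that do not exist on disk."""
        return [p for p in rel_paths if not self._resolve(p).exists()]
```

Because the check is a set lookup and a filesystem stat, it adds no model calls, which is consistent with the "zero cost" label on this layer.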

Layer 2: Evidence Chain (Minimal Cost)

  • Passive tracking of every tool call
  • Every file_read recorded with path + line range
  • Provenance attached to every artifact automatically
  • Enables post-hoc audit
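
A passive evidence chain could look roughly like the following. The record fields (path, line range, content hash) follow the bullets above, but the `ReadRecord` and `EvidenceChain` names, the `provenance` key, and the choice of SHA-256 are assumptions made for this sketch.

```python
import hashlib
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass(frozen=True)
class ReadRecord:
    tool: str
    path: str
    start_line: int  # 1-indexed, inclusive
    end_line: int    # 1-indexed, inclusive
    sha256: str      # hash of the whole file at read time


class EvidenceChain:
    def __init__(self):
        self.records = []

    def file_read(self, path, start_line, end_line):
        """Read a line range and record what was read as a side effect."""
        data = Path(path).read_bytes()
        lines = data.decode("utf-8").splitlines()
        self.records.append(
            ReadRecord(
                "file_read",
                str(path),
                start_line,
                end_line,
                hashlib.sha256(data).hexdigest(),
            )
        )
        return "\n".join(lines[start_line - 1:end_line])

    def attach(self, artifact):
        """Return a copy of the artifact with provenance attached."""
        return {**artifact, "provenance": [asdict(r) for r in self.records]}
```

Storing the file hash alongside the line range lets a later audit tell whether the file has changed since the agent read it.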

Layer 3: Forensic Verification (~$0.05/artifact)

  • Independent agent reads codebase and verifies claims
  • Per-claim VERIFIED/DISPUTED status with evidence
  • Disputes block phase progression
  • Triggered after specs, architecture, docs, security assessments
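
The per-claim status and the blocking rule can be illustrated with a deterministic stand-in. In the framework the verifier is an independent agent; the sketch below replaces that agent with a plain regex search over in-memory sources so the gate logic is visible. `Claim`, `verify`, and `gate` are hypothetical names.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    text: str     # the statement made in the artifact
    path: str     # file the claim points at
    pattern: str  # regex that must match some line if the claim is true


def verify(claim, sources):
    """Return (status, evidence) for one claim against {path: file text}."""
    source = sources.get(claim.path)
    if source is None:
        return ("DISPUTED", f"{claim.path} not found")
    for lineno, line in enumerate(source.splitlines(), start=1):
        if re.search(claim.pattern, line):
            return ("VERIFIED", f"{claim.path}:{lineno}")
    return ("DISPUTED", f"no match for {claim.pattern!r} in {claim.path}")


def gate(claims, sources):
    """A single DISPUTED claim blocks phase progression."""
    results = [(c, *verify(c, sources)) for c in claims]
    passed = all(status == "VERIFIED" for _, status, _ in results)
    return passed, results
```

Each result carries either a file and line number or the reason for the dispute, so a blocked phase comes with something concrete to fix.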

Layer 4: Multi-Agent Consensus (~$0.15/vote)

  • 3 independent agents using DIFFERENT models
  • 2-of-3 must approve for passage
  • Prevents groupthink from same-model bias
  • Triggered for code review and security assessment
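
The voting rule itself is small. A minimal sketch, assuming each vote records which model produced it; the `Vote` type and the `consensus` function are hypothetical, and any model identifiers passed in are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    model: str     # identifier of the model that cast the vote
    approve: bool


def consensus(votes, required=2, panel_size=3):
    """Return True when at least `required` of `panel_size` distinct models approve."""
    if len(votes) != panel_size:
        raise ValueError(f"expected {panel_size} votes, got {len(votes)}")
    # Enforce model diversity: same-model votes would defeat the purpose.
    if len({v.model for v in votes}) != panel_size:
        raise ValueError("each vote must come from a different model")
    return sum(v.approve for v in votes) >= required
```

Rejecting a panel with duplicate models, rather than silently counting the votes, keeps the same-model-bias guarantee from eroding through misconfiguration.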

Layer 5: Human-in-the-Loop (Zero Cost)

  • PRD approval before architecture
  • Architecture approval before implementation
  • Deploy approval before production
  • Budget/confidence threshold gates
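
The ordering constraints above amount to a lookup from a stage to the approval it depends on. The sketch below is one possible reading: the stage and approval names are hypothetical labels, and the escalation rule (escalate when confidence falls below a threshold or spend exceeds a budget) is an assumption about what "budget/confidence threshold gates" means.

```python
# Stage -> human approval that must already be recorded (hypothetical labels).
REQUIRED_APPROVAL = {
    "architecture": "prd",
    "implementation": "architecture",
    "production": "deploy",
}


def may_start(stage, approvals):
    """True if the stage has no prerequisite approval or it has been granted."""
    needed = REQUIRED_APPROVAL.get(stage)
    return needed is None or needed in approvals


def needs_human(confidence, min_confidence, spent_usd, budget_usd):
    """Assumed escalation rule: low confidence or overspend pauses for a person."""
    return confidence < min_confidence or spent_usd > budget_usd
```

For example, `may_start("architecture", {"prd"})` is true only once the PRD approval has been recorded.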

Failure Mode Catalog

| Failure Mode | Detection | Prevention |
|---|---|---|
| Phantom file reference | Forensic: glob for file | Ground truth anchoring |
| Phantom function | Forensic: grep for definition | Read-before-reference |
| Invented API endpoint | Forensic: grep for route def | Evidence chain tracking |
| Wrong version number | Forensic: read package.json | Version pinning |
| Fabricated test results | Build verification: run tests | Mandatory test execution |
| Invented metric | Evidence chain: check sources | Source citation requirement |
| Stale information | Evidence chain: check file hash | Re-read before asserting |
| Confident wrong answer | Multi-agent consensus | Diverse model voting |
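
Two of the detections in the catalog are simple enough to sketch directly: checking a claimed dependency version against `package.json`, and checking a recorded file hash against the current file contents. The function names are hypothetical, the use of SHA-256 is an assumption, and only the standard `dependencies` and `devDependencies` sections are consulted.

```python
import hashlib
import json


def declared_version(package_json_text, dependency):
    """Return the version range declared for a dependency, or None if absent."""
    manifest = json.loads(package_json_text)
    for section in ("dependencies", "devDependencies"):
        version = manifest.get(section, {}).get(dependency)
        if version is not None:
            return version
    return None


def version_claim_holds(package_json_text, dependency, claimed):
    """Wrong version number: compare the claim with what package.json declares."""
    return declared_version(package_json_text, dependency) == claimed


def is_stale(recorded_sha256, current_bytes):
    """Stale information: the file changed since its hash was recorded."""
    return hashlib.sha256(current_bytes).hexdigest() != recorded_sha256
```

A stale result would be the trigger for the "re-read before asserting" prevention listed in the table.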

Full Documentation

The complete SDLC Matrix with work product schemas, validation gate configurations, evidence chain protocol, forensic verification protocol, and industry framework alignment is available in the [full research document on GitHub](https://github.com/justinjilg/brainstorm/blob/main/docs/internal/sdlc-matrix.md).