The AI Dark Factory: How to Build a Codebase That Ships Itself

An AI dark factory is a codebase that manages its own GitHub issues, reviews its own pull requests, and works under rules that evolve over time, without a human in the day-to-day loop. Cole Medin built one using three markdown governance files and Archon workflows, with research showing the system performs at 70% PR acceptance versus 6.7% for a single unguided LLM.

What Is a Dark Factory Codebase?

The term comes from manufacturing. A dark factory runs lights-out — no humans on the floor. Applied to software, it means a codebase where AI handles the full development cycle: reading issues, writing code, reviewing changes, and merging PRs. The difference from basic AI coding is governance. Without rules, an autonomous AI codebase drifts. With the right governance layer, it stays on track.

Three files govern everything: mission.md, factory-rules.md, and CLAUDE.md.

How Does the Three-File Governance Layer Work?

Every Archon workflow in the system loads three files at the start of execution:

mission.md — defines what the project is for, what problems it solves, and what done looks like
factory-rules.md — code quality standards, naming conventions, testing requirements, and what the AI is not allowed to do
CLAUDE.md — the technical conventions for this specific codebase: stack choices, folder structure, patterns to follow

These files are checked into source control. When rules evolve — because a bug revealed a gap, or a new pattern proved better — the governance files get updated. The next workflow run picks up the new rules automatically.
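As a rough illustration of what "loading three files at the start of execution" can look like, here is a minimal Python sketch. It is not Archon's own loader; the function names `load_governance` and `build_prompt` are hypothetical, and the only things taken from the description above are the three file names and the fact that they are re-read on each run.

```python
from pathlib import Path

# The three governance files named in the article.
GOVERNANCE_FILES = ("mission.md", "factory-rules.md", "CLAUDE.md")


def load_governance(project_root: str) -> str:
    """Concatenate the three governance files into one prompt preamble."""
    root = Path(project_root)
    sections = []
    for name in GOVERNANCE_FILES:
        path = root / name
        if not path.is_file():
            # Fail loudly: a workflow that runs without its rules will drift.
            raise FileNotFoundError(f"missing governance file: {path}")
        text = path.read_text(encoding="utf-8").strip()
        sections.append(f"## {name}\n{text}")
    return "\n\n".join(sections)


def build_prompt(project_root: str, task: str) -> str:
    """Prefix a task with the current governance text (hypothetical helper)."""
    # Files are read from disk on every call, so a rule committed between
    # runs is picked up by the next run with no other change.
    return f"{load_governance(project_root)}\n\n## Task\n{task}"
```

Reading the files at call time rather than caching them is the design choice that makes "the next workflow run picks up the new rules automatically" true.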

What Does the Triage Workflow Actually Do?

The triage workflow is the entry point for all incoming GitHub issues. The sequence runs in four steps:

1. A bash step fetches open issues from the GitHub API
2. An LLM step reads each issue against the governance files and decides: accept, reject, or request clarification
3. A bash step applies GitHub labels based on the LLM decision
4. Accepted issues move to the implementation queue automatically

No human needs to look at an issue until it is rejected (with a reason label), flagged for clarification, or implemented and ready for review. For a side project with irregular maintenance windows, this means the backlog stays managed even when you are not looking at it.
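The four triage steps can be sketched in plain Python to make the data flow concrete. This is an illustration under assumptions, not Cole's workflow: the real system uses bash and LLM nodes in Archon, the label names in `LABELS` are hypothetical, and `llm` stands in for whatever model call you use. The two GitHub REST endpoints (list issues, add labels) are the standard ones.

```python
import json
import urllib.request
from typing import Callable

DECISIONS = {"accept", "reject", "clarify"}
# Hypothetical label names; substitute the labels your repository defines.
LABELS = {
    "accept": "factory:accepted",
    "reject": "factory:rejected",
    "clarify": "factory:needs-info",
}


def fetch_open_issues(repo: str, token: str) -> list[dict]:
    """Step 1: list open issues for `owner/name` via the GitHub REST API."""
    req = urllib.request.Request(
        f"https://api.github.com/repos/{repo}/issues?state=open",
        headers={"Authorization": f"Bearer {token}",
                 "Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(req) as resp:
        issues = json.load(resp)
    # This endpoint also returns pull requests; keep real issues only.
    return [i for i in issues if "pull_request" not in i]


def classify(issue: dict, governance: str, llm: Callable[[str], str]) -> str:
    """Step 2: ask the model for one of the three decisions."""
    prompt = (
        f"{governance}\n\n"
        "Decide whether this issue should be accepted, rejected, or needs "
        "clarification. Reply with exactly one word: accept, reject, or clarify.\n\n"
        f"Title: {issue['title']}\n"
        f"Body: {issue.get('body') or ''}"
    )
    answer = llm(prompt).strip().lower()
    # Unexpected output falls back to asking for clarification, never to accept.
    return answer if answer in DECISIONS else "clarify"


def apply_label(repo: str, number: int, decision: str, token: str) -> None:
    """Step 3: attach the label that matches the decision."""
    req = urllib.request.Request(
        f"https://api.github.com/repos/{repo}/issues/{number}/labels",
        data=json.dumps({"labels": [LABELS[decision]]}).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}",
                 "Accept": "application/vnd.github+json"},
        method="POST",
    )
    urllib.request.urlopen(req).close()


def triage(issues: list[dict], governance: str,
           llm: Callable[[str], str]) -> dict[int, str]:
    """Run step 2 over every issue and return {issue number: decision}."""
    return {i["number"]: classify(i, governance, llm) for i in issues}


def implementation_queue(decisions: dict[int, str]) -> list[int]:
    """Step 4: accepted issue numbers, in the order they were triaged."""
    return [n for n, d in decisions.items() if d == "accept"]
```

Keeping the model call behind a plain `llm` callable means the classification logic can be exercised with a stub before any real model or token is involved.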

What Is Holdout Validation and Why Does It Matter?

Holdout validation is Cole's solution to the core problem with AI code review: the reviewer and the implementer share context, so the reviewer cannot catch what the implementer missed.

In the dark factory, the review workflow only sees the git diff — never the implementation context, the issue description, or the reasoning the implementation agent used. The reviewer evaluates the change on its own merits: does this diff do what the governance rules require? Does it break anything?

A reviewer that cannot see the implementer's reasoning catches what the implementer rationalised away.

This is the same principle as a code review between two different developers. The holdout structure enforces it mechanically, without relying on the same model to critique its own reasoning.
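One way to picture the holdout boundary is a prompt builder that receives everything the implementation step produced but passes only the diff through. The sketch below assumes a hypothetical `ImplementationRecord` type and helper names; it shows the isolation idea, not the actual Archon workflow.

```python
import subprocess
from dataclasses import dataclass


@dataclass(frozen=True)
class ImplementationRecord:
    """Everything the implementation step knows (hypothetical structure)."""
    issue_text: str
    agent_reasoning: str
    diff: str


def git_diff(base: str, head: str) -> str:
    """Return the diff of `head` against its merge base with `base`."""
    result = subprocess.run(
        ["git", "diff", f"{base}...{head}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def build_review_prompt(record: ImplementationRecord, governance: str) -> str:
    """Build the reviewer prompt from the governance rules and the diff only."""
    # Only record.diff crosses the boundary. The issue text and the
    # implementer's reasoning are deliberately left out, so the reviewer
    # cannot inherit the implementer's assumptions.
    return (
        f"{governance}\n\n"
        "Review the diff below against the rules above. "
        "Does it do what the rules require? Does it break anything?\n\n"
        "--- DIFF START ---\n"
        f"{record.diff}\n"
        "--- DIFF END ---"
    )
```

Because the omission is structural rather than a request to the model to "ignore" context, the reviewer has no way to see the implementer's rationale even by accident.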

How Do You Build a Minimal Version of This Today?

Start with the governance layer. Create mission.md, factory-rules.md, and CLAUDE.md in your project root. Keep them short and specific — three clear rules are more useful than thirty vague ones.

Then build one triage workflow in Archon. The workflow needs three nodes: bash to fetch issues, LLM to classify them against your governance files, bash to apply labels. Run it manually first. Once it classifies correctly on ten consecutive issues, put it on a schedule.
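The "ten consecutive issues" gate is easy to check mechanically if you record your own verdict next to the workflow's verdict during the manual runs. A minimal sketch, with hypothetical function names and the threshold of ten taken from the paragraph above:

```python
def trailing_streak(predicted: list[str], expected: list[str]) -> int:
    """Count how many of the most recent decisions match your own verdicts."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    streak = 0
    # Walk backwards from the newest issue and stop at the first mismatch.
    for p, e in zip(reversed(predicted), reversed(expected)):
        if p != e:
            break
        streak += 1
    return streak


def ready_to_schedule(predicted: list[str], expected: list[str],
                      required: int = 10) -> bool:
    """True once the latest `required` classifications were all correct."""
    return trailing_streak(predicted, expected) >= required
```

A single misclassification resets the count, which matches the intent of "consecutive": an old miss followed by ten correct runs passes, while a miss on the most recent issue does not.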

Add the holdout review workflow only after triage is stable. Sequence matters — a broken review workflow on top of a broken triage workflow is undebuggable.

Frequently Asked Questions

What is Archon and how does it relate to Claude Code?

Archon is an open-source orchestration layer that sits above Claude Code. It runs YAML workflows where each node can be either a bash step or an LLM prompt. Claude Code handles individual coding tasks; Archon handles the sequence of tasks across a full development cycle.

Does this require a paid Anthropic plan?

Not necessarily for every step. Archon supports multiple LLM providers via OpenRouter, so you can route cheaper models to the simpler steps and use Claude only where reasoning quality matters. The governance files load into every prompt, so their context cost is a roughly fixed overhead on each LLM step, independent of how much output that step generates.

How do the governance files get updated?

Manually, by the human owner — but infrequently. The files define the rules the AI operates under. When a bug reveals a missing rule, you add it. When a pattern proves better, you update the convention. The files evolve slowly and intentionally, not automatically.

What kinds of projects are best suited to this approach?

Side projects and internal tools where you want continued progress without daily maintenance. The dark factory pattern works best on codebases with a clear scope, a stable tech stack, and a backlog of small, well-defined issues. It is not suited to greenfield projects where the direction is still unclear.

Watch Cole build the full system: The AI Dark Factory on YouTube

https://www.youtube.com/watch?v=_dark_factory