graph8 lifecycle
Engineer-grade · the 10 principles
Engineering spec · graph8 setup · infra ↔ g8

Setting up Lifecycle across graph8-com/infra and graph8-com/g8 — file by file, query by query, day by day.

Companion to the change, the 7-day setup plan, and graph8 after. This page is for the engineer who's actually going to ship it — at graph8, that's one engineer dispatching agents. Setup only: the system wiring between graph8-com/infra (axon agents, MBM service, crons, Postgres, k8s manifests, dashboards) and graph8-com/g8 (workflows, .claude config, MCP wrapper distribution). Architecture, components, data model, every file path with a status badge, every SQL query — for the 7-day setup sprint. Improvements (the 2–3× lift) ship after, against the foundation this builds.

1 Architecture

The whole system on one screen.

Five planes: Capture (every skill fire, everywhere) · Store (one Postgres table) · Visualize (two Grafana dashboards) · Govern (org policies that bind regardless of where the agent runs) · Close loops (agent A's output is agent B's input).

```mermaid
flowchart TB
  classDef capture fill:#fef3c7,stroke:#d97706,color:#78350f,stroke-width:2px
  classDef store fill:#dbeafe,stroke:#3b82f6,color:#1e3a8a,stroke-width:2px
  classDef vis fill:#d1fae5,stroke:#10b981,color:#064e3b,stroke-width:2px
  classDef gov fill:#eef2ff,stroke:#6366f1,color:#1e1b4b,stroke-width:2px
  classDef loop fill:#fce7f3,stroke:#ec4899,color:#831843,stroke-width:2px

  subgraph CAPTURE["1 · CAPTURE (everywhere a skill fires)"]
    L1["Local Claude Code<br>+ MCP wrapper"]:::capture
    L2["Local Codex<br>+ MCP wrapper"]:::capture
    L3["K8s axon agents<br>(via axon runner)"]:::capture
    L4["Modal hibernation<br>(via runtime)"]:::capture
    L5["Webhook handlers<br>(MBM service)"]:::capture
  end
  subgraph STORE["2 · STORE"]
    DB[("skill_invocations<br>Postgres table<br>+ context_fetches<br>+ agent_health_stats")]:::store
    API["MBM service<br>POST /v1/trace<br>POST /v1/grade/engineer"]:::store
  end
  subgraph VIS["3 · VISUALIZE"]
    G1["Grafana ·<br>Engineer utilization"]:::vis
    G2["Grafana ·<br>K8s agent economics"]:::vis
  end
  subgraph GOV["4 · GOVERN"]
    P1["Branch protection<br>(GitHub)"]:::gov
    P2["CODEOWNERS<br>(per-repo)"]:::gov
    P3["MBM rubric<br>(versioned YAML)"]:::gov
    P4["pr_fixer escalation<br>(attempt-4 rule)"]:::gov
  end
  subgraph LOOPS["5 · CLOSE LOOPS · agent A → agent B"]
    A1["test-writer<br>on PR open"]:::loop
    A2["agent_health<br>daily cron"]:::loop
    A3["mbm_critic<br>daily cron"]:::loop
    A4["skill_mortality<br>weekly cron"]:::loop
    A5["knowledge_compactor<br>daily cron"]:::loop
    A6["contract_test_runner<br>on PR open"]:::loop
    A7["bug_predictor<br>on PR open"]:::loop
    A8["onboarding<br>daily cron · first 28 days"]:::loop
  end
  subgraph WEBHOOK["WEBHOOK ROUTER (replaces 2h polling)"]
    W1["Cloudflare Worker<br>/tenants/&lt;id&gt;/webhook"]:::store
    W2["TaskSpawner CRs<br>(triggers.yaml)"]:::store
  end

  L1 --> API
  L2 --> API
  L3 --> API
  L4 --> API
  L5 --> API
  API --> DB
  DB --> G1
  DB --> G2
  G1 -.audit.-> P1 & P2 & P3
  DB --> A2 & A3 & A4 & A5
  A1 --> API
  A2 --> DB
  A3 --> DB
  A6 --> API
  A7 --> API
  L5 --> W1
  W1 --> W2
  W2 --> L3
  P4 -.governs.-> L3
```

Capture · client wrappers + agent runners | Store · Postgres + MBM API | Visualize · Grafana | Govern · policy enforcement | Close loops · agent-to-agent

2 Components

23 things to build, grouped by phase.

Every component has a status badge (NEW · EDIT · DELETE · SETTINGS), a target day, the file path, and a one-line "what" and "depends on."

Foundation

Days 1–7 · capture + visibility + governance

9 components

Branch protection · infra

SETTINGS
GitHub repo setting. Require PR + status checks + CODEOWNERS approval + linear history on main.
Where github.com/graph8-com/infra/settings/branches
Day 1 · 10 min

Branch protection · g8

SETTINGS
Same rule on main and qa. MBM bot review counts as the required approval for non-migration paths.
Where github.com/graph8-com/g8/settings/branches
Day 1 · 10 min

Lowercase @mbm review normalizer

NEW
GitHub Action that re-posts the trigger as lowercase if anyone types it wrong-case. Closes a silent-failure landmine.
File g8/.github/workflows/normalize-mbm-trigger.yml
Day 1 · 30 min

Sync token guard

EDIT
Add a fail-loud step at the top of sync-main-to-qa.yml. If SYNC_BOT_TOKEN is missing, fail with a Slack ping. No more silent skipping.
File g8/.github/workflows/sync-main-to-qa.yml
Day 1 · 15 min

pr_fixer attempt-4 rule

EDIT
Document escalation: on attempt 4, label mbm/needs-input, post a summary comment tagging the author, stop. Removes "agent stuck forever" mode.
File tenants/graph8-eng/agents/pr_fixer/prompt.md
Day 1 · 15 min

Retire pr_cleanup + rage_click_detector

DELETE
Both have prompts but no TaskSpawner dispatches them. Decide: wire (add trigger to triggers.yaml) or delete the folder. No "in-between."
Where tenants/graph8-eng/agents/{pr_cleanup,rage_click_detector}/
Day 1 · 30 min

Trace ledger table

NEW
Single Postgres table that captures every skill invocation across local Claude Code, local Codex, K8s agents, and Modal. Foundation of everything that follows.
File services/mbm/migrations/20260517_skill_invocations.sql
Day 2 · half-day · depends on mbm-pg

Trace MCP wrapper + Claude hook

NEW
TypeScript MCP server that wraps tool calls and POSTs trace rows. Settings.json hook covers slash-command fires. Distributed to every engineer.
File services/lifecycle-trace-mcp/ · .claude/mcp.json (both repos)
Day 3–4 · 1 day · depends on ledger

Engineer-utilization dashboard

NEW
Grafana dashboard reading from the ledger. Shows fires / unique-skills / top-skill / utilization-grade per engineer.
File k8s/monitoring/dashboards/lifecycle-engineer.json
Day 5 · 2 hr · depends on MCP wrapper

Agent-economics dashboard

NEW
Grafana dashboard showing per-K8s-agent token spend, PRs opened/merged, $/merged-PR. Where real OAuth-pool $ optimization lives.
File k8s/monitoring/dashboards/lifecycle-agent-economics.json
Day 6 · 2 hr

First loops

Days 6–14 · close the agent-to-agent loops the system can close immediately

7 components

Wire test-writer agent

NEW
The agent already exists at g8/.claude/agents/test-writer.md but isn't dispatched. New workflow triggers it on PR open, opens a follow-up PR with the missing tests.
File g8/.github/workflows/test-writer-on-pr.yml
Day 6 · 3 hr

agent_health cron

NEW
Daily cron measures each axon agent's PR survival rate. Flags regressions, surfaces unwired agents. The first agent that watches the agents.
File tenants/graph8-eng/agents/agent_health/prompt.md · crons.yaml
Day 7 · 4 hr

Engineer scoring service

NEW
Go module + 8 test cases. Replaces the prompt-derived algorithm in /assign-prds. Skill calls POST /v1/grade/engineer instead of re-deriving.
File services/mbm/internal/grader/engineer_scoring{,_test}.go
Day 8–9 · 2 days

Auto-promote qa → main

NEW
Hourly workflow opens the qa→main PR automatically once QA has been green ≥48h. Closes the biggest cycle-time dark zone.
File g8/.github/workflows/auto-promote-qa-to-main.yml
Day 10 · 3 hr

Bandit blocking flip

EDIT
Remove continue-on-error: true from the bandit job. First in the CI ratchet — smallest blast radius, clearest fixes.
File g8/.github/workflows/code-quality.yml
Day 11–12 · 1 day

mbm_critic agent

NEW
Daily cron classifies PRs where MBM's CHANGES_REQUESTED was dismissed by the merger. Proposes rubric updates to reviewer/prompt.md. The first agent that reads MBM's output.
File tenants/graph8-eng/agents/mbm_critic/prompt.md · crons.yaml
Day 13–14 · 2 days

Skill-mortality job

NEW
Weekly SQL + auto-issue. Skills with <5 fires/month in the ledger → auto-open deprecation issue with the consolidation suggestion from the merge table.
File services/mbm/internal/jobs/skill_mortality.go
Day 14 · 2 hr
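skill_mortality.go isn't spelled out elsewhere in this spec; a minimal Go sketch of its core under stated assumptions: the aggregate query text and the threshold rule are lifted from the description above, function names are illustrative, and the GitHub issue-opening half is omitted.

```go
package main

import (
	"fmt"
	"sort"
)

// Weekly aggregate the job would run against the ledger (30-day window).
const mortalityQuery = `
SELECT skill_name, count(*) AS fires
FROM skill_invocations
WHERE ts > now() - interval '30 days'
GROUP BY skill_name`

// deprecationCandidates applies the <5 fires/month rule: any skill under
// the threshold gets a deprecation issue auto-opened.
func deprecationCandidates(fires map[string]int) []string {
	var out []string
	for skill, n := range fires {
		if n < 5 {
			out = append(out, skill)
		}
	}
	sort.Strings(out) // deterministic issue ordering
	return out
}

func main() {
	fires := map[string]int{"/start": 412, "/ship": 188, "/changelog": 2}
	fmt.Println(deprecationCandidates(fires)) // [/changelog]
}
```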

Scale

Days 15–30 · webhooks, cross-repo, consolidation, prediction

7 components

Tenant webhook receiver

NEW
Cloudflare Worker that accepts GitHub webhooks, validates HMAC, translates events into Task CR creation. Replaces 2h polling with sub-60s dispatch.
File workers/tenant-webhook/
Day 15–19 · 5 days

TaskSpawner webhook cutover

EDIT
Flip every label entry in triggers.yaml from polling to webhook. Start with 5xx-error; roll the rest after 24h of observation.
File tenants/graph8-eng/triggers.yaml
Day 19–20 · 1 day
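What a flipped entry might look like. The field names below are illustrative only; the real schema is whatever TaskSpawner already consumes from triggers.yaml.

```yaml
# tenants/graph8-eng/triggers.yaml — illustrative shape, not the real schema
- name: 5xx-error
  label: 5xx-error
  agent: g8_5xx_fixer
  repo: graph8-com/g8
  dispatch: webhook   # was: polling (2h interval)
```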

knowledge_compactor agent

NEW
Reads context_fetches table. Files fetched ≥10 times this week → propose CLAUDE.md addition via PR. The system writes its own docs.
File tenants/graph8-eng/agents/knowledge_compactor/prompt.md · axon runner edit for fetch logging
Day 21–23 · 3 days

Cross-repo contracts + runner

NEW
JSON schemas for the top 5 cross-repo interfaces (g8 ↔ g8-eda-server first). contract_test_runner agent validates schema changes on every relevant PR.
File tenants/graph8-eng/contracts/*.json · agents/contract_test_runner/prompt.md
Day 24–28 · 5 days

bug_predictor agent

NEW
Reads each PR diff + the last 30 days of Sentry stack traces, flags lines whose patterns historically broke things. Comment-only for first 2 weeks (tune false-positive rate).
File tenants/graph8-eng/agents/bug_predictor/prompt.md + TaskSpawner
Day 24–28 · 5 days

onboarding agent

NEW
Reads joined field from engineer-domains.json. Daily nudges in Slack for engineers in their first 28 days: missing skill usage, unused conventions, relevant feature-level CLAUDE.md.
File tenants/graph8-eng/agents/onboarding/prompt.md · edit engineer-domains.json
Day 25–28 · 4 days

Skill consolidation · 22 → 14

EDIT
One merge per week across .claude/commands/ in both repos. Week 3: /commit → /ship. Week 4: /assign-prds + /capacity-check → /capacity, then /review + /security-review, then /start branches.
File .claude/commands/ in both repos
Day 15–30 · rolling

3 Data model

Three tables + two API endpoints + one webhook format.

The schema is intentionally tiny. Everything downstream — dashboards, loops, alerts — is a query against these tables. Add a column when you have a reason. Don't add tables.

skill_invocations NEW

One row per skill fire. Token / cost columns are NULL for Max-plan local runs (no metering possible); populated for K8s-agent runs (OAuth pool).

CREATE TABLE skill_invocations (
  id              BIGSERIAL PRIMARY KEY,
  ts              TIMESTAMPTZ NOT NULL DEFAULT now(),
  engineer_id     TEXT NOT NULL,         -- github username
  skill_name      TEXT NOT NULL,         -- e.g. /start, /ship, agent name for K8s runs
  repo            TEXT,                  -- graph8-com/g8, graph8-com/infra
  runtime         TEXT NOT NULL,         -- claude-code-local | codex-local | k8s-job | modal
  model           TEXT,                  -- claude-opus-4-7, claude-haiku-4-5, etc.
  latency_ms      INT,
  status          TEXT NOT NULL,         -- success | abandoned | error
  input_tokens    INT,                   -- NULL for max-plan local runs
  output_tokens   INT,                   -- NULL for max-plan local runs
  cost_usd_cents  INT,                   -- NULL for max-plan local runs
  arg_hash        TEXT                   -- anonymized input fingerprint
);
CREATE INDEX ON skill_invocations (engineer_id, ts);
CREATE INDEX ON skill_invocations (skill_name, ts);
CREATE INDEX ON skill_invocations (runtime, ts);

context_fetches NEW · day 21

Drives the knowledge-compounding loop. Agents POST a row every time they read a file. Daily cron looks for files fetched ≥10 times this week and proposes them for CLAUDE.md.

CREATE TABLE context_fetches (
  id          BIGSERIAL PRIMARY KEY,
  ts          TIMESTAMPTZ NOT NULL DEFAULT now(),
  agent_name  TEXT NOT NULL,
  file_path   TEXT NOT NULL,
  repo        TEXT
);
CREATE INDEX ON context_fetches (file_path, ts);

agent_health_stats NEW · day 7

Daily snapshot, computed by agent_health cron. Source for the agent-economics dashboard and for regression alerts.

CREATE TABLE agent_health_stats (
  snapshot_date    DATE NOT NULL,
  agent_name       TEXT NOT NULL,
  runs_30d         INT NOT NULL,
  prs_opened_30d   INT NOT NULL,
  prs_merged_30d   INT NOT NULL,
  prs_reverted_30d INT NOT NULL,
  survival_rate    NUMERIC(5,3),         -- merged-not-reverted / opened
  token_cost_cents INT,
  PRIMARY KEY (snapshot_date, agent_name)
);
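survival_rate is derived, not measured; a minimal Go sketch of the computation the agent_health cron would write into this column (function name illustrative), with a guard so agents that opened zero PRs in the window report no rate rather than dividing by zero.

```go
package main

import "fmt"

// survivalRate mirrors the column comment above: merged-not-reverted /
// opened. The bool is false when opened == 0 (no rate to report).
func survivalRate(opened, merged, reverted int) (float64, bool) {
	if opened == 0 {
		return 0, false
	}
	return float64(merged-reverted) / float64(opened), true
}

func main() {
	if r, ok := survivalRate(10, 8, 1); ok {
		fmt.Printf("%.3f\n", r) // 0.700, the NUMERIC(5,3) precision above
	}
}
```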

API · POST /v1/trace NEW · day 2

The single ingest endpoint. Every capture path POSTs to this. Authenticated by an org-issued API token (per-machine or per-pod).

Request:
{
  "engineer_id":   "thomas-c",
  "skill_name":    "/start",
  "repo":          "graph8-com/g8",
  "runtime":       "claude-code-local",
  "model":         "claude-opus-4-7",
  "latency_ms":    1240,
  "status":        "success",
  "input_tokens":  null,                  -- max-plan, no metering
  "output_tokens": null,
  "arg_hash":      "sha256:8f3a..."
}

Response 202 Accepted:
{ "id": 847132 }

API · POST /v1/grade/engineer NEW · day 8

Replaces the prompt-derived algorithm in /assign-prds. Same algorithm, but now testable and consistent. Skill calls this endpoint instead of re-deriving.

Request:
{
  "prd": {
    "slug":       "ai-inbox-meetings-admin-settings",
    "domains":    ["ai_inbox", "frontend/features/inbox"],
    "complexity": "medium",
    "gtm_score":  12
  }
}

Response 200 OK:
{
  "scores": [
    { "engineer_id": "hassan-b", "score": 7, "reason": "primary: ai_inbox(+3), primary: inbox(+3), under target(+1)" },
    { "engineer_id": "hamza-n", "score": 4, "reason": "secondary: ai_inbox(+1), 2 PRDs assigned(+3)" }
  ]
}

Webhook · GitHub → /tenants/<id>/webhook NEW · day 15

Cloudflare Worker entry point. Replaces 2h polling. Validates HMAC, translates to Task CR.

GitHub webhook (issues.labeled, pull_request, etc.) →
POST /tenants/graph8-eng/webhook
X-Hub-Signature-256: sha256=...
Content-Type: application/json

Worker validates signature, extracts label, queries triggers.yaml:
{
  "name":   "infra-issues",
  "agent":  "infra",
  "label":  "infra",
  "repo":   "graph8-com/infra"
}

→ POST to MBM creates Task CR → axon controller spawns Job → pod runs.
Target end-to-end p50 latency: under 60 seconds (was ~1 hour with polling).

4 File tree

Every file that gets touched, with status and target day.

Grep this section. If a file isn't here, the spec isn't asking you to touch it. If a file is here, the badge tells you whether to create, edit, or delete it; the day tells you when.

graph8-com/
├── infra/
│   ├── .github/workflows/
│   │   └── infra-ci.yaml                     EDIT  re-enable deploy-k8s post aws-cp · later
│   ├── tenants/graph8-eng/
│   │   ├── agents/
│   │   │   ├── pr_fixer/prompt.md            EDIT  day 1 · attempt-4 rule
│   │   │   ├── pr_cleanup/                   DELETE  day 1 (unless wired)
│   │   │   ├── rage_click_detector/          DELETE  day 1 (unless wired)
│   │   │   ├── agent_health/prompt.md        NEW  day 7
│   │   │   ├── mbm_critic/prompt.md          NEW  day 13–14
│   │   │   ├── knowledge_compactor/prompt.md NEW  day 21–23
│   │   │   ├── contract_test_runner/prompt.md NEW  day 24–28
│   │   │   ├── bug_predictor/prompt.md       NEW  day 24–28
│   │   │   └── onboarding/prompt.md          NEW  day 25–28
│   │   ├── contracts/                        NEW  day 24 · 5 schema files
│   │   │   ├── g8-eda-events.json            NEW
│   │   │   └── ...four more...               NEW
│   │   ├── crons.yaml                        EDIT  +4 entries · days 7, 14, 21, 28
│   │   ├── triggers.yaml                     EDIT  polling → webhook · day 19–20
│   │   └── tenant.yaml                       EDIT  register new agents · rolling
│   ├── k8s/monitoring/dashboards/
│   │   ├── lifecycle-engineer.json           NEW  day 5
│   │   └── lifecycle-agent-economics.json    NEW  day 6
│   ├── workers/tenant-webhook/               NEW  day 15–19
│   │   ├── src/index.ts                      NEW
│   │   ├── wrangler.toml                     NEW
│   │   └── package.json                      NEW
│   ├── services/
│   │   ├── mbm/
│   │   │   ├── migrations/
│   │   │   │   ├── 20260517_skill_invocations.sql   NEW  day 2
│   │   │   │   ├── 20260518_context_fetches.sql     NEW  day 21
│   │   │   │   └── 20260519_agent_health_stats.sql  NEW  day 7
│   │   │   └── internal/
│   │   │       ├── trace/
│   │   │       │   └── handler.go            NEW  day 2–3 · POST /v1/trace
│   │   │       ├── grader/
│   │   │       │   ├── engineer_scoring.go       NEW  day 8
│   │   │       │   └── engineer_scoring_test.go  NEW  day 8 · 8 cases
│   │   │       └── jobs/
│   │   │           └── skill_mortality.go    NEW  day 14
│   │   └── lifecycle-trace-mcp/              NEW  day 3–4
│   │       ├── src/server.ts                 NEW
│   │       └── package.json                  NEW
│   └── .claude/
│       ├── mcp.json                          EDIT  day 3 · register trace MCP
│       └── commands/
│           └── assign-prds.md                EDIT  day 8 · call /v1/grade/engineer
├── g8/
│   ├── .github/workflows/
│   │   ├── normalize-mbm-trigger.yml         NEW  day 1
│   │   ├── sync-main-to-qa.yml               EDIT  day 1 · token guard
│   │   ├── test-writer-on-pr.yml             NEW  day 6
│   │   ├── auto-promote-qa-to-main.yml       NEW  day 10
│   │   ├── code-quality.yml                  EDIT  day 11+ · remove continue-on-error
│   │   └── mbm-changes-requested-fixer.yaml  EDIT  day 1 · doc attempt-4 rule
│   ├── .claude/
│   │   ├── mcp.json                          EDIT  day 3 · register trace MCP
│   │   └── commands/
│   │       ├── start.md                      EDIT  day 30 · merge in analyse-system + investigate-bug
│   │       ├── ship.md                       EDIT  day 15 · merge in commit
│   │       ├── analyse-system.md             DELETE  day 30
│   │       ├── investigate-bug.md            DELETE  day 30
│   │       ├── commit.md                     DELETE  day 15
│   │       ├── security-review.md            DELETE  day 26 · merged into /review
│   │       ├── capacity-check.md             DELETE  day 22 · merged into /capacity
│   │       └── ...legacy article/changelog skills...  DELETE  day 30
│   └── .claude/engineer-domains.json         EDIT  day 25 · add joined:YYYY-MM-DD per engineer
└── g8-eda-server/
    └── events/
        └── schemas/                          EDIT  day 24 · publish contract schemas
Totals: NEW 28 files · EDIT 14 files · DELETE ~10 files (skills + 2 agents) · SETTINGS 2 GitHub branch-protection rules.

5 Build order

The exact sequence, with the command or code you'll write at each step.

Every step has a verification line. If the verification doesn't pass, don't move on. Order matters — later steps depend on earlier ones.

1

Enable branch protection · both repos

Day 1 · 20 min total · human-only · no PR needed
Closes the biggest "anyone can ship anything" hole today. Two clicks per repo. Doesn't slow the autonomous shipping — MBM bot review still satisfies the approval requirement.
# GitHub UI:
graph8-com/infra/settings/branches → Add classic rule for `main`
  ✓ Require a pull request before merging
  ✓ Require approvals (1)
  ✓ Require review from Code Owners
  ✓ Require status checks: terraform, validate-k8s
  ✓ Require linear history

graph8-com/g8/settings/branches → same rule for `main` and `qa`
  ✓ Require status checks: code-quality jobs
Verify: from your laptop, git push origin main fails with "Protected branch."
2

Ship the ledger table

Day 2 · half-day · in mbm-pg
Foundation of everything. One table, three indexes. Lives in the existing mbm-pg deployment.
# Create migration file:
mkdir -p services/mbm/migrations
cat > services/mbm/migrations/20260517_skill_invocations.sql <<'SQL'
-- (paste the CREATE TABLE from §3 Data model)
SQL

# Apply via your existing migration runner (Atlas or Alembic):
make migrate
Verify: psql $DATABASE_URL -c "\d skill_invocations" shows the table.
3

POST /v1/trace endpoint

Day 2–3 · 1 day · in MBM service
The single ingest endpoint. Authenticated by an org-issued API token. Idempotent on (engineer_id, ts, skill_name).
# services/mbm/internal/trace/handler.go
package trace

import (
	"database/sql"
	"encoding/json"
	"net/http"
)

type TraceRow struct {
	EngineerID   string `json:"engineer_id"`
	SkillName    string `json:"skill_name"`
	Repo         string `json:"repo,omitempty"`
	Runtime      string `json:"runtime"`
	Model        string `json:"model,omitempty"`
	LatencyMS    int    `json:"latency_ms,omitempty"`
	Status       string `json:"status"`
	InputTokens  *int   `json:"input_tokens,omitempty"`
	OutputTokens *int   `json:"output_tokens,omitempty"`
	CostUSDCents *int   `json:"cost_usd_cents,omitempty"`
	ArgHash      string `json:"arg_hash,omitempty"`
}

const insertSQL = `INSERT INTO skill_invocations
  (engineer_id, skill_name, repo, runtime, model, latency_ms, status,
   input_tokens, output_tokens, cost_usd_cents, arg_hash)
  VALUES ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11) RETURNING id`

func Handler(db *sql.DB, apiToken string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		// 1. validate API token from header
		if r.Header.Get("Authorization") != "Bearer "+apiToken {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		// 2. decode JSON
		var row TraceRow
		if err := json.NewDecoder(r.Body).Decode(&row); err != nil {
			http.Error(w, "bad json", http.StatusBadRequest)
			return
		}
		// 3. INSERT row → return 202 with id
		var id int64
		if err := db.QueryRow(insertSQL, row.EngineerID, row.SkillName, row.Repo,
			row.Runtime, row.Model, row.LatencyMS, row.Status, row.InputTokens,
			row.OutputTokens, row.CostUSDCents, row.ArgHash).Scan(&id); err != nil {
			http.Error(w, "insert failed", http.StatusInternalServerError)
			return
		}
		w.WriteHeader(http.StatusAccepted)
		json.NewEncoder(w).Encode(map[string]int64{"id": id})
	}
}
Verify: curl -X POST $MBM_URL/v1/trace -H "Authorization: Bearer ..." -d '{...}' returns 202 and the row exists.
4

MCP wrapper + Claude Code hook

Day 3–4 · 1 day · distribute to every engineer
Every local Claude Code session POSTs to /v1/trace on every tool call. Plus a settings.json hook covers slash-command fires.
# services/lifecycle-trace-mcp/src/server.ts (minimal MCP server)
import { Server } from "@modelcontextprotocol/sdk/server/index.js";

const server = new Server({ name: "lifecycle-trace", version: "0.1.0" });

// Schematic: the real SDK keys handlers on a request schema
// (CallToolRequestSchema) and has no delegate helper; treat the wrapper
// below as the shape of the interception, not copy-paste code.
server.setRequestHandler("tools/call", async (req, ctx) => {
  const t0 = Date.now();
  try {
    return await ctx.delegate(req);
  } finally {
    fetch(`${process.env.MBM_URL}/v1/trace`, {
      method: "POST",
      headers: { "Authorization": `Bearer ${process.env.MBM_TOKEN}` },
      body: JSON.stringify({
        engineer_id: process.env.GITHUB_USER,
        skill_name: req.params.name,
        runtime: "claude-code-local",
        latency_ms: Date.now() - t0,
        status: "success",
      }),
    }).catch(() => {}); // fire-and-forget; never block the user
  }
});
# Register in both repos: .claude/mcp.json
{
  "mcpServers": {
    "lifecycle-trace": {
      "command": "npx",
      "args": ["-y", "@graph8/lifecycle-trace-mcp"],
      "env": { "MBM_URL": "https://mbm.graph8.com" }
    }
  }
}
Verify: run any skill locally, then SELECT count(*) FROM skill_invocations WHERE engineer_id='your-username' AND ts > now() - interval '5 min' returns a row.
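The settings.json hook mentioned in this step isn't shown elsewhere in the spec; one plausible shape, assuming Claude Code's UserPromptSubmit hook event and a small local script (~/.claude/hooks/trace-slash.sh is hypothetical) that reads the prompt JSON from stdin and POSTs a /v1/trace row when the prompt starts with a slash:

```json
{
  "hooks": {
    "UserPromptSubmit": [
      {
        "hooks": [
          { "type": "command", "command": "~/.claude/hooks/trace-slash.sh" }
        ]
      }
    ]
  }
}
```

Verify the event name and payload against the current Claude Code hooks documentation before distributing.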
5

Engineer-utilization Grafana dashboard

Day 5 · 2 hr
First user-visible artifact. One SQL query, one Grafana panel. The thing leadership opens daily.
-- The query that powers the panel:
SELECT engineer_id,
       count(*)                                        AS fires_30d,
       count(*) FILTER (WHERE status = 'success')      AS successful,
       count(DISTINCT skill_name)                      AS unique_skills,
       (SELECT skill_name FROM skill_invocations s2
        WHERE s2.engineer_id = s1.engineer_id
          AND s2.ts > now() - interval '30 days'
        GROUP BY skill_name ORDER BY count(*) DESC LIMIT 1) AS top_skill,
       CASE
         WHEN count(*) > 300 THEN 'excellent'
         WHEN count(*) > 150 THEN 'strong'
         WHEN count(*) > 50  THEN 'developing'
         WHEN count(*) > 20  THEN 'untapped'
         ELSE 'spinning?'
       END AS utilization
FROM skill_invocations s1
WHERE ts > now() - interval '30 days'
GROUP BY engineer_id
ORDER BY fires_30d DESC;
Verify: Grafana panel renders with at least 5 engineers and grade column matches the heuristic.
6

Agent-economics Grafana dashboard

Day 6 · 2 hr
Other half of the visibility pair. For K8s OAuth-pool runs only — real $/merged-PR per agent.
-- The query:
SELECT skill_name AS agent,
       count(*)                                       AS runs,
       sum(cost_usd_cents) / 100.0                    AS cost_usd,
       (SELECT count(*) FROM github_prs p
        WHERE p.head_branch LIKE 'axon/' || s.skill_name || '/%'
          AND p.merged_at > now() - interval '30 days') AS prs_merged
FROM skill_invocations s
WHERE runtime = 'k8s-job'
  AND ts > now() - interval '30 days'
GROUP BY skill_name
ORDER BY cost_usd DESC;
Verify: g8_5xx_fixer's row visible with $/merged-PR computed. Conditional-format red >$30, yellow >$15, green ≤$15.
7–10

The four day-1 cleanups (parallel to ledger kick-off)

Day 1 · 4 hrs total · parallel to steps 1–2
Four small landmines closed in one morning by whoever isn't building the ledger. Each is a small workflow or prompt edit.
7. Lowercase normalizer · g8/.github/workflows/normalize-mbm-trigger.yml
   Trigger: issue_comment.created
   If body matches /@mbm review/i but not the exact-lowercase /@mbm review/ →
   re-post the comment as lowercase

8. Sync token guard · g8/.github/workflows/sync-main-to-qa.yml
   Add step 1: if [ -z "${{ secrets.SYNC_BOT_TOKEN }}" ]; then exit 1; fi

9. pr_fixer attempt-4 · tenants/graph8-eng/agents/pr_fixer/prompt.md
   Append to the Retry-cap section: on attempt 4, label mbm/needs-input,
   post summary comment tagging the author, stop.

10. Retire pr_cleanup + rage_click_detector
    Decide: add TaskSpawner entries to triggers.yaml OR rm -rf the agent dirs.
Verify: all four PRs merged on day 1. Agent count is either 25-fully-wired or 23-clean.
11

Wire test-writer on PR open

Day 4 · 3 hr
Closes the test-coverage loop in one morning by wiring an agent that already exists. Detects new public functions without coverage; dispatches Claude Code k8s job; agent opens follow-up PR.
# g8/.github/workflows/test-writer-on-pr.yml
name: Test-writer on PR
on:
  pull_request:
    types: [opened, synchronize]
    paths: ['**/*.py']
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Find new public funcs without tests
        id: scan
        run: ./scripts/find-untested.sh  # must write untested=<list> to $GITHUB_OUTPUT
      - name: Dispatch test-writer if needed
        if: steps.scan.outputs.untested != ''
        run: gh api repos/${{ github.repository }}/dispatches \
             -f event_type='spawn-test-writer' \
             -f client_payload[pr]='${{ github.event.number }}'
Verify: open a PR adding def public_foo(): with no test. Follow-up test PR appears within 15 min.
12

agent_health cron + dashboard panel

Day 4 · 4 hr · parallel to step 11
First agent that reads the ledger and writes back. Daily snapshot of each axon agent's PR-survival rate. Surfaces unwired agents automatically.
# tenants/graph8-eng/agents/agent_health/prompt.md (excerpt)
Every day at 05:00 UTC:
1. Query the trace ledger + GitHub: yesterday's runs, PRs, and reverts per axon agent.
2. Compute survival_rate = merged_not_reverted / opened.
3. Compute 7-day moving average.
4. If any agent < 80%, flag in Roam digest.
5. If any agent < 50%, page on-call.
6. Write today's row to agent_health_stats.

# tenants/graph8-eng/crons.yaml — add:
- name: agent-health-daily
  schedule: "0 5 * * *"
  agent: agent_health
Verify: after 2 days, query SELECT * FROM agent_health_stats ORDER BY snapshot_date DESC LIMIT 25 returns one row per agent per day.
13–16

Day 5 batch · scoring, soak SLO, bandit, first skill merge

Day 5 · all four in parallel · ~half-day each
By Friday, the foundation is live and the visibility layer works. Today: four parallel agent dispatches. Engineer reviews each PR as it lands. Bandit flip (task 15) needs your judgement on findings — block ~1 hr for that.
13. engineer_scoring.go (half-day): Go file + 8 tests, POST /v1/grade/engineer.
    Update infra/.claude/commands/assign-prds.md to call the endpoint.

14. auto-promote-qa-to-main.yml (3 hr): hourly check, opens PR if QA green ≥48h.

15. Remove continue-on-error from bandit (half-day, parallel):
    Run bandit on main, fix findings, flip the flag.

16. Skill consolidation round 1 (1 hr, parallel): /commit folded into /ship.
    Skill count: 22 → 21.
Verify: by EOD Friday — scoring API tested, auto-promote workflow exists and has been triggered, bandit blocks at least one PR with a new finding, skill count = 21.
17–20

Day 6 batch · mbm_critic, mortality, Worker scaffold, skill merge 2

Day 6 · 4 tasks parallel · second loop closes
Saturday is the "agent-watching-agents" day. First agent that reads MBM's own decisions ships. Skill-mortality auto-issues open. Cloudflare Worker scaffolded for tomorrow's cutover.
17. mbm_critic agent (4 hr): daily cron classifies override patterns.
    Already-scaffolded prompt at tenants/graph8-eng/agents/mbm_critic/.

18. skill_mortality.go (2 hr, parallel): weekly SQL, auto-open deprecation issues.

19. Cloudflare Worker (5 hr): /tenants/<id>/webhook accepts GH webhooks, HMAC-validates,
    creates Task CRs via MBM API. Don't cut over yet — that's tomorrow.

20. Skill consolidation round 2 (1 hr, parallel):
    /assign-prds + /capacity-check → /capacity. Skill count: 21 → 20.
Verify: mbm_critic completes first daily run; one deprecation issue auto-opens; Worker accepts a staging webhook and returns 200; skill count = 20.
21–23

Day 7 batch · cutover, acceptance, demo

Day 7 · the proof point + the recording
Sunday: prove the latency win on the 5xx-error cutover. Run the 8-point acceptance check. Record a 10-minute demo for board / hiring / Anthropic.
21. Cut over 5xx-error label to webhook (2 hr).
    Edit tenants/graph8-eng/triggers.yaml; measure p50 dispatch latency.
    Expected: ~1 hour → under 60 seconds.

22. Run the 8-metric acceptance grid (1 hr).
    Walk every metric, run every verification command. Target: 6 / 8 green.

23. Record the 10-minute demo (2 hr).
    Loom walking through: vision page → click stat → drill into skill →
    live engineer-utilization dashboard → webhook latency proof →
    one closed loop in action.
Verify: 5xx-error test issue dispatches in <60s; 6+ acceptance metrics green; demo posted to Roam + #engineering. Week 1 shipped.

Week 2 (days 8–14) · the slower-burn loops

After acceptance passes · ~5 components
These need the ledger + webhooks in place first. Roll remaining labels to webhooks. Ship knowledge_compactor. Scaffold the top-5 cross-repo contracts + contract_test_runner. Ship bug_predictor (comment-only mode, to tune false-positive rate). Ship onboarding agent. Skill consolidation rounds 3+4. By day 14: 6/8 loops live, skill count = 14, label→pickup <60s across the board.
Day 8–9   · Roll remaining labels to webhooks (1 day, agent-driven).
Day 9–11  · knowledge_compactor (context_fetches table + agent + daily cron).
Day 10–11 · Ruff blocking flip (next CI ratchet step).
Day 11–13 · Top-5 cross-repo contracts + contract_test_runner agent.
Day 12–14 · bug_predictor agent (comment-only at first).
Day 13–14 · onboarding agent · add joined field to engineer-domains.json.
Day 13    · Skill consolidation round 3 (/review + /security-review).
Day 14    · Skill consolidation round 4 (/start branches).
Day 14 verify: 6/8 loops have a run in the last 24h. Skill count = 14. Label→pickup p50 <60s for all labels.

6 Acceptance

Eight metrics that say "shipped in week 1." Six green = done.

Run this checklist Sunday afternoon (day 7). Each metric has a verification command. If six or more are green, declare success and start week-2 horizon items (knowledge_compactor, contracts, bug_predictor, onboarding).

Day 7 metric 1
Branch protection on both repos
Verify: git push origin main from any non-MBM machine fails with "Protected branch" on infra and g8.
Day 7 metric 2
Ledger ingestion ≥ 200 rows/day
Verify: SELECT count(*) FROM skill_invocations WHERE ts > now() - interval '1 day';
Day 7 metric 3
Your own utilization visible
Verify: your engineer_id appears in the dashboard with a grade. Local /start fires land in the ledger within 5 min.
Day 7 metric 4
5xx-error pickup p50 < 60 s
Verify: webhook live for that label; create a test issue, watch Task CR appear in <60 s.
Day 7 metric 5
Skill count = 20 (was 22)
Verify: rounds 1+2 of consolidation merged · /commit + /assign-prds + /capacity-check gone.
Day 7 metric 6
qa → main auto-promote live
Verify: workflow exists and has fired at least once.
Day 7 metric 7
Bandit blocking
Verify: continue-on-error removed from bandit job in code-quality.yml.
Day 7 metric 8
3 loops live
Verify: test-writer + agent_health + mbm_critic each have a recent run in the ledger or stats table.
Principle #3 · ongoing SLO
CI build time < 60 s
Verify: gh run list --repo graph8-com/g8 --limit 20 --json conclusion,createdAt,updatedAt · median duration under 60 seconds. Rebuild build tooling whenever it slips.
Principle #1 · cultural
Skill fires / engineer / day > 10
Verify: SELECT engineer_id, count(*) FROM skill_invocations WHERE ts > now()-interval '1 day' GROUP BY engineer_id · everyone in the team > 10. Anything less means someone is typing code by hand.
Principle #7 · ongoing
Lint registry shipped
Verify: tenants/graph8-eng/lints/ exists with ≥ 5 bespoke lint rules; each error message is a remediation prompt (names canonical alternative + links to CLAUDE.md). Auto-applied to every PR.
After day 7 · week 2 (days 8–14)

Once the foundation is live, week 2 closes the remaining loops: knowledge_compactor, cross-repo contracts + contract_test_runner, bug_predictor (comment-only mode for tuning), onboarding agent, plus skill-consolidation rounds 3+4. By day 14: 6/8 loops live, skill count = 14, all labels on webhooks. See the original 8 loops →
