Companion to the change plan, the 7-day setup plan, and graph8 after.
This page is for the engineer who's actually going to ship it — at graph8, that's one
engineer dispatching agents. Setup only: the system wiring between
graph8-com/infra (axon agents, MBM service, crons, Postgres, k8s manifests, dashboards)
and graph8-com/g8 (workflows, .claude config, MCP wrapper distribution). It covers the
architecture, the components, the data model, every file path with a status badge, and every SQL
query for the 7-day setup sprint. Improvements (the 2–3× lift) ship afterward, against the foundation this builds.
Five planes: Capture (every skill fire, everywhere) · Store (one Postgres table) · Visualize (two Grafana dashboards) · Govern (org policies that bind regardless of where the agent runs) · Close loops (agent A's output is agent B's input).
Every component has a status badge (NEW · EDIT · DELETE · SETTINGS), a target day, the file path, and a one-line "what" and "depends on."
Branch protection · infra main; g8 main and qa. MBM bot review counts as the required approval for non-migration paths.
@mbm review normalizer.
Sync token guard · sync-main-to-qa.yml. If SYNC_BOT_TOKEN is missing, fail with a Slack ping. No more silent skipping.
pr_fixer attempt-4 rule · label mbm/needs-input, post a summary comment tagging the author, stop. Removes "agent stuck forever" mode.
Retire pr_cleanup + rage_click_detector · either wire them up (triggers.yaml) or delete the folder. No "in-between."
test-writer agent · exists at g8/.claude/agents/test-writer.md but isn't dispatched. New workflow triggers it on PR open, opens a follow-up PR with the missing tests.
agent_health cron.
Engineer scoring endpoint · replaces the prompt-derived algorithm in /assign-prds. Skill calls POST /v1/grade/engineer instead of re-deriving.
Bandit blocking · remove continue-on-error: true from the bandit job. First in the CI ratchet — smallest blast radius, clearest fixes.
mbm_critic agent · reviewer/prompt.md. The first agent that reads MBM's output.
Webhook dispatch · move triggers.yaml from polling to webhook. Start with 5xx-error; roll the rest after 24h of observation.
knowledge_compactor agent · context_fetches table. Files fetched ≥10 times this week → propose CLAUDE.md addition via PR. The system writes its own docs.
contract_test_runner agent · validates schema changes on every relevant PR.
bug_predictor agent.
onboarding agent · joined field from engineer-domains.json. Daily nudges in Slack for engineers in their first 28 days: missing skill usage, unused conventions, relevant feature-level CLAUDE.md.
Skill consolidation · .claude/commands/ in both repos. Week 3: /commit→/ship. Week 4: /assign-prds+/capacity-check→/capacity, then /review+/security-review, then /start branches.

The schema is intentionally tiny. Everything downstream — dashboards, loops, alerts — is a query against these tables. Add a column when you have a reason. Don't add tables.
skill_invocations NEW · One row per skill fire. Token / cost columns are NULL for Max-plan local runs (no metering possible); populated for K8s-agent runs (OAuth pool).
```sql
CREATE TABLE skill_invocations (
  id             BIGSERIAL PRIMARY KEY,
  ts             TIMESTAMPTZ NOT NULL DEFAULT now(),
  engineer_id    TEXT NOT NULL,   -- github username
  skill_name     TEXT NOT NULL,   -- e.g. /start, /ship, agent name for K8s runs
  repo           TEXT,            -- graph8-com/g8, graph8-com/infra
  runtime        TEXT NOT NULL,   -- claude-code-local | codex-local | k8s-job | modal
  model          TEXT,            -- claude-opus-4-7, claude-haiku-4-5, etc.
  latency_ms     INT,
  status         TEXT NOT NULL,   -- success | abandoned | error
  input_tokens   INT,             -- NULL for max-plan local runs
  output_tokens  INT,             -- NULL for max-plan local runs
  cost_usd_cents INT,             -- NULL for max-plan local runs
  arg_hash       TEXT             -- anonymized input fingerprint
);
CREATE INDEX ON skill_invocations (engineer_id, ts);
CREATE INDEX ON skill_invocations (skill_name, ts);
CREATE INDEX ON skill_invocations (runtime, ts);
```
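The schema describes arg_hash only as an "anonymized input fingerprint." One plausible way to produce it — a sketch, not the spec's definition; the normalization rules here are an assumption — is a SHA-256 over the normalized argument string:

```python
import hashlib


def arg_hash(raw_args: str) -> str:
    """Anonymized fingerprint for skill arguments (illustrative sketch).

    Lowercasing and collapsing whitespace are assumptions; the spec only
    says the column holds an anonymized input fingerprint.
    """
    normalized = " ".join(raw_args.lower().split())
    return "sha256:" + hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```

Fingerprinting rather than storing raw arguments keeps the ledger queryable ("same input, different outcome?") without leaking prompt contents.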
context_fetches NEW · day 21 · Drives the knowledge-compounding loop. Agents POST a row every time they read a file. Daily cron looks for files fetched ≥10 times this week and proposes them for CLAUDE.md.
```sql
CREATE TABLE context_fetches (
  id         BIGSERIAL PRIMARY KEY,
  ts         TIMESTAMPTZ NOT NULL DEFAULT now(),
  agent_name TEXT NOT NULL,
  file_path  TEXT NOT NULL,
  repo       TEXT
);
CREATE INDEX ON context_fetches (file_path, ts);
```
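The cron's threshold rule ("fetched ≥10 times this week") in runnable form — a sketch that assumes in-memory (ts, file_path) tuples standing in for context_fetches rows; the real cron would run the equivalent SQL:

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def compaction_candidates(fetches, now=None, min_fetches=10):
    """Files fetched >= min_fetches times in the trailing 7 days.

    `fetches` is an iterable of (ts, file_path) tuples mirroring the
    context_fetches rows; names and shape are illustrative, not the cron's API.
    """
    now = now or datetime.now(timezone.utc)
    week_ago = now - timedelta(days=7)
    counts = Counter(path for ts, path in fetches if ts >= week_ago)
    return sorted(path for path, n in counts.items() if n >= min_fetches)
```

Each returned path becomes a proposed CLAUDE.md addition, opened as a PR for human review.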
agent_health_stats NEW · day 7 · Daily snapshot, computed by agent_health cron. Source for the agent-economics dashboard and for regression alerts.
```sql
CREATE TABLE agent_health_stats (
  snapshot_date    DATE NOT NULL,
  agent_name       TEXT NOT NULL,
  runs_30d         INT NOT NULL,
  prs_opened_30d   INT NOT NULL,
  prs_merged_30d   INT NOT NULL,
  prs_reverted_30d INT NOT NULL,
  survival_rate    NUMERIC(5,3),  -- merged-not-reverted / opened
  token_cost_cents INT,
  PRIMARY KEY (snapshot_date, agent_name)
);
```
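The two derived numbers the agent_health cron writes — survival_rate and its 7-day moving average — can be sketched as plain functions (how the cron treats NULL days is unspecified; skipping them here is an assumption):

```python
def survival_rate(merged: int, reverted: int, opened: int):
    """merged-not-reverted / opened, mirroring the survival_rate column.

    Returns None (-> SQL NULL) when no PRs were opened that day.
    """
    if opened == 0:
        return None
    return round((merged - reverted) / opened, 3)


def moving_average(daily_rates, window=7):
    """Trailing average over the last `window` snapshots, skipping None days."""
    recent = [r for r in daily_rates[-window:] if r is not None]
    return round(sum(recent) / len(recent), 3) if recent else None
```

The moving average smooths out single bad days so the <80% and <50% alert thresholds fire on trends, not blips.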
POST /v1/trace NEW · day 2 · The single ingest endpoint. Every capture path POSTs to this. Authenticated by an org-issued API token (per-machine or per-pod).
```
Request:
{
  "engineer_id": "thomas-c",
  "skill_name": "/start",
  "repo": "graph8-com/g8",
  "runtime": "claude-code-local",
  "model": "claude-opus-4-7",
  "latency_ms": 1240,
  "status": "success",
  "input_tokens": null,    -- max-plan, no metering
  "output_tokens": null,
  "arg_hash": "sha256:8f3a..."
}

Response 202 Accepted:
{ "id": 847132 }
```
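A capture path needs to assemble that body before POSTing. A sketch of a payload builder (names are illustrative; it drops unset optional fields, whereas the example request sends explicit nulls — equivalent only if the endpoint treats absent and null alike, which is an assumption):

```python
def trace_payload(engineer_id, skill_name, runtime, status,
                  latency_ms=None, repo=None, model=None,
                  input_tokens=None, output_tokens=None,
                  cost_usd_cents=None, arg_hash=None):
    """Build a /v1/trace request body mirroring the example request.

    Token fields stay None for Max-plan local runs, matching the schema
    note that no metering is possible there.
    """
    payload = {
        "engineer_id": engineer_id,
        "skill_name": skill_name,
        "runtime": runtime,
        "status": status,
        "latency_ms": latency_ms,
        "repo": repo,
        "model": model,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "cost_usd_cents": cost_usd_cents,
        "arg_hash": arg_hash,
    }
    # Drop keys the caller left unset; the endpoint treats them as optional.
    return {k: v for k, v in payload.items() if v is not None}
```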
POST /v1/grade/engineer NEW · day 8 · Replaces the prompt-derived algorithm in /assign-prds. Same algorithm, but now testable and consistent. Skill calls this endpoint instead of re-deriving.
```
Request:
{
  "prd": {
    "slug": "ai-inbox-meetings-admin-settings",
    "domains": ["ai_inbox", "frontend/features/inbox"],
    "complexity": "medium",
    "gtm_score": 12
  }
}

Response 200 OK:
{
  "scores": [
    { "engineer_id": "hassan-b", "score": 7,
      "reason": "primary: ai_inbox(+3), primary: inbox(+3), under target(+1)" },
    { "engineer_id": "hamza-n", "score": 4,
      "reason": "secondary: ai_inbox(+1), 2 PRDs assigned(+3)" }
  ]
}
```
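The weights aren't spelled out on this page; the reason strings suggest primary domain +3, secondary +1, under target +1. A toy scorer under those assumed weights — the real algorithm ships in engineer_scoring.go and may differ:

```python
def grade_engineer(prd_domains, engineer):
    """Toy scorer reverse-engineered from the sample reason strings.

    Assumed weights: primary domain +3, secondary +1, under target +1.
    `engineer` is an illustrative dict, not the service's real shape.
    """
    score, reasons = 0, []
    for domain in prd_domains:
        name = domain.split("/")[-1]  # "frontend/features/inbox" -> "inbox"
        if name in engineer["primary"]:
            score += 3
            reasons.append(f"primary: {name}(+3)")
        elif name in engineer["secondary"]:
            score += 1
            reasons.append(f"secondary: {name}(+1)")
    if engineer["assigned_prds"] < engineer["target_prds"]:
        score += 1
        reasons.append("under target(+1)")
    return {"score": score, "reason": ", ".join(reasons)}
```

For the example PRD above, an engineer whose primary domains are ai_inbox and inbox and who is under target scores 7, matching hassan-b's row in the sample response.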
/tenants/<id>/webhook NEW · day 15 · Cloudflare Worker entry point. Replaces 2h polling. Validates HMAC, translates to Task CR.
```
GitHub webhook (issues.labeled, pull_request, etc.)
  → POST /tenants/graph8-eng/webhook
    X-Hub-Signature-256: sha256=...
    Content-Type: application/json

Worker validates signature, extracts label, queries triggers.yaml:
  { "name": "infra-issues", "agent": "infra", "label": "infra", "repo": "graph8-com/infra" }
  → POST to MBM creates Task CR → axon controller spawns Job → pod runs.

Target end-to-end p50 latency: under 60 seconds (was ~1 hour with polling).
```
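The signature check follows GitHub's documented X-Hub-Signature-256 scheme: HMAC-SHA256 over the raw request body, hex-encoded, prefixed with "sha256=". The production code is a TypeScript Worker, so this Python sketch is illustrative only:

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """The value GitHub sends in X-Hub-Signature-256 for this body."""
    return "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, header: str) -> bool:
    """Constant-time comparison, as the Worker must do before parsing."""
    return hmac.compare_digest(sign(secret, body), header)
```

compare_digest matters here: a naive `==` leaks timing information an attacker can use to forge signatures byte by byte.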
Grep this section. If a file isn't here, the spec isn't asking you to touch it. If a file is here, the badge tells you whether to create, edit, or delete it; the day tells you when.
Every step has a verification line. If the verification doesn't pass, don't move on. Order matters — later steps depend on earlier ones.
```
# GitHub UI: graph8-com/infra/settings/branches → Add classic rule for `main`
  ✓ Require a pull request before merging
  ✓ Require approvals (1)
  ✓ Require review from Code Owners
  ✓ Require status checks: terraform, validate-k8s
  ✓ Require linear history

# graph8-com/g8/settings/branches → same rule for `main` and `qa`
  ✓ Require status checks: code-quality jobs
```
Verify: git push origin main fails with "Protected branch."

```shell
# Create migration file:
mkdir -p services/mbm/migrations
cat > services/mbm/migrations/20260517_skill_invocations.sql <<'SQL'
-- (paste the CREATE TABLE from §3 Data model)
SQL
# Apply via your existing migration runner (Atlas or Alembic):
make migrate
```
Verify: psql $DATABASE_URL -c "\d skill_invocations" shows the table.

```go
// services/mbm/internal/trace/handler.go
package trace

import "net/http"

type TraceRow struct {
	EngineerID   string `json:"engineer_id"`
	SkillName    string `json:"skill_name"`
	Repo         string `json:"repo,omitempty"`
	Runtime      string `json:"runtime"`
	Model        string `json:"model,omitempty"`
	LatencyMS    int    `json:"latency_ms,omitempty"`
	Status       string `json:"status"`
	InputTokens  *int   `json:"input_tokens,omitempty"`
	OutputTokens *int   `json:"output_tokens,omitempty"`
	CostUSDCents *int   `json:"cost_usd_cents,omitempty"`
	ArgHash      string `json:"arg_hash,omitempty"`
}

func Handler(w http.ResponseWriter, r *http.Request) {
	// 1. validate API token from header
	// 2. decode JSON
	// 3. INSERT row → return 202 with id
}
```
Verify: curl -X POST $MBM_URL/v1/trace -H "Authorization: Bearer ..." -d '{...}' returns 202 and the row exists.

```typescript
// services/lifecycle-trace-mcp/src/server.ts (minimal MCP server)
import { Server } from "@modelcontextprotocol/sdk/server";

const server = new Server({ name: "lifecycle-trace", version: "0.1.0" });

server.setRequestHandler("tools/call", async (req, ctx) => {
  const t0 = Date.now();
  try {
    return await ctx.delegate(req);
  } finally {
    fetch(`${process.env.MBM_URL}/v1/trace`, {
      method: "POST",
      headers: { Authorization: `Bearer ${process.env.MBM_TOKEN}` },
      body: JSON.stringify({
        engineer_id: process.env.GITHUB_USER,
        skill_name: req.params.name,
        runtime: "claude-code-local",
        latency_ms: Date.now() - t0,
        status: "success",
      }),
    }).catch(() => {}); // fire-and-forget; never block the user
  }
});
```
Register in both repos: .claude/mcp.json

```json
{
  "mcpServers": {
    "lifecycle-trace": {
      "command": "npx",
      "args": ["-y", "@graph8/lifecycle-trace-mcp"],
      "env": { "MBM_URL": "https://mbm.graph8.com" }
    }
  }
}
```
Verify: SELECT count(*) FROM skill_invocations WHERE engineer_id='your-username' AND ts > now() - interval '5 min' returns a count ≥ 1.

```sql
-- The query that powers the panel:
SELECT
  engineer_id,
  count(*) AS fires_30d,
  count(*) FILTER (WHERE status = 'success') AS successful,
  count(DISTINCT skill_name) AS unique_skills,
  (SELECT skill_name FROM skill_invocations s2
     WHERE s2.engineer_id = s1.engineer_id
     GROUP BY skill_name ORDER BY count(*) DESC LIMIT 1) AS top_skill,
  CASE WHEN count(*) > 300 THEN 'excellent'
       WHEN count(*) > 150 THEN 'strong'
       WHEN count(*) > 50  THEN 'developing'
       WHEN count(*) > 20  THEN 'untapped'
       ELSE 'spinning?' END AS utilization
FROM skill_invocations s1
WHERE ts > now() - interval '30 days'
GROUP BY engineer_id
ORDER BY fires_30d DESC;
```
```sql
-- The query:
SELECT
  skill_name AS agent,
  count(*) AS runs,
  sum(cost_usd_cents) / 100.0 AS cost_usd,
  (SELECT count(*) FROM github_prs p
     WHERE p.head_branch LIKE 'axon/' || s.skill_name || '/%'
       AND p.merged_at > now() - interval '30 days') AS prs_merged
FROM skill_invocations s
WHERE runtime = 'k8s-job'
  AND ts > now() - interval '30 days'
GROUP BY skill_name
ORDER BY cost_usd DESC;
```
Verify: g8_5xx_fixer's row visible with $/merged-PR computed. Conditional-format red >$30, yellow >$15, green ≤$15.

7. Lowercase normalizer · g8/.github/workflows/normalize-mbm-trigger.yml
   Trigger: issue_comment.created. If body matches /@(?i)mbm\s+review/ but not /@mbm review/ → re-post lowercase.
8. Sync token guard · g8/.github/workflows/sync-main-to-qa.yml
   Add step 1: if [ -z "${{ secrets.SYNC_BOT_TOKEN }}" ]; then exit 1; fi
9. pr_fixer attempt-4 · tenants/graph8-eng/agents/pr_fixer/prompt.md
   Append to the Retry-cap section: on attempt 4, label mbm/needs-input, post summary comment tagging the author, stop.
10. Retire pr_cleanup + rage_click_detector
    Decide: add TaskSpawner entries to triggers.yaml OR rm -rf the agent dirs.
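Item 7's match-and-repost rule can be sketched in Python (the function name is illustrative; the workflow itself would run this logic in a shell or actions/github-script step):

```python
import re

# Any-case form of the trigger vs. the canonical lowercase form.
MBM_ANY_CASE = re.compile(r"@mbm\s+review", re.IGNORECASE)
MBM_CANONICAL = re.compile(r"@mbm review")


def normalized_command(comment_body: str):
    """Return the lowercase command to re-post, or None if no action needed.

    Sketch of the normalize-mbm-trigger.yml decision: fire only when the
    comment matches case-insensitively but NOT in the canonical form.
    """
    if MBM_ANY_CASE.search(comment_body) and not MBM_CANONICAL.search(comment_body):
        return "@mbm review"
    return None
```

Checking the canonical form separately prevents an infinite loop: the bot's own re-posted lowercase comment matches canonically and triggers nothing.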
test-writer on PR open

```yaml
# g8/.github/workflows/test-writer-on-pr.yml
name: Test-writer on PR
on:
  pull_request:
    types: [opened, synchronize]
    paths: ['**/*.py']
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Find new public funcs without tests
        id: scan   # needed so steps.scan.outputs resolves below
        run: ./scripts/find-untested.sh
      - name: Dispatch test-writer if needed
        if: steps.scan.outputs.untested != ''
        run: |
          gh api repos/${{ github.repository }}/dispatches \
            -f event_type='spawn-test-writer' \
            -f client_payload[pr]='${{ github.event.number }}'
```
Verify: a PR adding def public_foo(): with no test gets a follow-up test PR within 15 min.

```
# tenants/graph8-eng/agents/agent_health/prompt.md (excerpt)
Every day at 05:00 UTC:
1. Query agent_health_stats: yesterday's runs, PRs, reverts per axon agent.
2. Compute survival_rate = merged_not_reverted / opened.
3. Compute 7-day moving average.
4. If any agent < 80%, flag in Roam digest.
5. If any agent < 50%, page on-call.
6. Write today's row to agent_health_stats.
```

```yaml
# tenants/graph8-eng/crons.yaml — add:
- name: agent-health-daily
  schedule: "0 5 * * *"
  agent: agent_health
```
Verify: SELECT * FROM agent_health_stats ORDER BY snapshot_date DESC LIMIT 25 returns one row per agent per day.

13. engineer_scoring.go (half-day): Go file + 8 tests, POST /v1/grade/engineer. Update infra/.claude/commands/assign-prds.md to call the endpoint.
14. auto-promote-qa-to-main.yml (3 hr): hourly check, opens PR if QA green ≥48h.
15. Remove continue-on-error from bandit (half-day, parallel): run bandit on main, fix findings, flip the flag.
16. Skill consolidation round 1 (1 hr, parallel): /commit folded into /ship. Skill count: 22 → 21.
17. mbm_critic agent (4 hr): daily cron classifies override patterns. Already-scaffolded prompt at tenants/graph8-eng/agents/mbm_critic/.
18. skill_mortality.go (2 hr, parallel): weekly SQL, auto-open deprecation issues.
19. Cloudflare Worker (5 hr): /tenants/<id>/webhook accepts GH webhooks, HMAC-validates, creates Task CRs via MBM API. Don't cut over yet — that's tomorrow.
20. Skill consolidation round 2 (1 hr, parallel): /assign-prds + /capacity-check → /capacity. Skill count: 21 → 20.
21. Cut over 5xx-error label to webhook (2 hr). Edit tenants/graph8-eng/triggers.yaml; measure p50 dispatch latency. Expected: ~1 hour → under 60 seconds.
22. Run the 8-metric acceptance grid (1 hr). Walk every metric, run every verification command. Target: 6 / 8 green.
23. Record the 10-minute demo (2 hr). Loom walking through: vision page → click stat → drill into skill → live engineer-utilization dashboard → webhook latency proof → one closed loop in action.
Ship knowledge_compactor. Scaffold the top-5 cross-repo contracts + contract_test_runner. Ship bug_predictor (comment-only mode, to tune false-positive rate). Ship onboarding agent. Skill consolidation rounds 3+4. By day 14: 6/8 loops live, skill count = 14, label→pickup <60s across the board.

Day 8–9 · Roll remaining labels to webhooks (1 day, agent-driven).
Day 9–11 · knowledge_compactor (context_fetches table + agent + daily cron).
Day 10–11 · Ruff blocking flip (next CI ratchet step).
Day 11–13 · Top-5 cross-repo contracts + contract_test_runner agent.
Day 12–14 · bug_predictor agent (comment-only at first).
Day 13–14 · onboarding agent · add joined field to engineer-domains.json.
Day 13 · Skill consolidation round 3 (/review + /security-review).
Day 14 · Skill consolidation round 4 (/start branches).
Run this checklist Sunday afternoon (day 7). Each metric has a verification command. If six or more are green, declare success and start week-2 horizon items (knowledge_compactor, contracts, bug_predictor, onboarding).
1. git push origin main from any non-MBM machine fails with "Protected branch" on infra and g8.
2. SELECT count(*) FROM skill_invocations WHERE ts > now() - interval '1 day' · every engineer_id appears in the dashboard with a grade. Local /start fires land in the ledger within 5 min.
3. 5xx-error pickup p50 < 60 s.
4. /commit + /assign-prds + /capacity-check gone.
5. continue-on-error removed from bandit job in code-quality.yml.
6. gh run list --repo graph8-com/g8 --limit 20 --json conclusion,createdAt,updatedAt · median duration under 60 seconds. Rebuild build tooling whenever it slips.
7. SELECT engineer_id, count(*) FROM skill_invocations WHERE ts > now()-interval '1 day' GROUP BY engineer_id · everyone in the team > 10. Anything less means someone is typing code by hand.
8. tenants/graph8-eng/lints/ exists with ≥ 5 bespoke lint rules; each error message is a remediation prompt (names the canonical alternative + links to CLAUDE.md). Auto-applied to every PR.

Once the foundation is live, week 2 closes the remaining loops: knowledge_compactor, cross-repo contracts + contract_test_runner, bug_predictor (comment-only mode for tuning), onboarding agent, plus skill-consolidation rounds 3+4. By day 14: 6/8 loops live, skill count = 14, all labels on webhooks. See the original 8 loops →