Settings + cleanup. Two clicks close the biggest hole.
Dispatch four agents in parallel for the four file-edit tasks. None of them needs the ledger. The two GitHub settings tasks are yours, done in the browser, ~10 min each. By lunch: branch protection on, four PRs open. By EOD: all merged.
Branch protection · graph8-com/infra · main
Two clicks close the biggest "anyone can ship anything" hole. MBM bot review still satisfies the approval requirement — autonomous shipping not slowed.
github.com/graph8-com/infra/settings/branches · require PR · status checks · CODEOWNERS · linear history
Branch protection · graph8-com/g8 · main + qa
Same rule, bigger blast radius. MBM bot approval counts everywhere except migrations (already gated by CODEOWNERS).
github.com/graph8-com/g8/settings/branches
Lowercase normalizer for @mbm review
@MBM review (uppercase) is currently dropped silently. One small workflow re-posts the comment in lowercase and counts the dropped forms.
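A minimal sketch of the normalization rule, written in Python for illustration only (the real workflow is a GitHub Actions YAML file). It assumes the only honored trigger is the exact lowercase string `@mbm review`; the names `normalize_trigger`, `TRIGGER`, and `CANONICAL` are hypothetical.

```python
import re
from collections import Counter

# Assumption: any casing or spacing variant of "@mbm review" should be re-posted
# in the canonical lowercase form.
TRIGGER = re.compile(r"@mbm\s+review\b", re.IGNORECASE)
CANONICAL = "@mbm review"


def normalize_trigger(comment, dropped):
    """Return the follow-up comment to post, or None if no re-post is needed.

    `dropped` is a Counter that tallies every non-canonical form seen.
    """
    forms = [m.group(0) for m in TRIGGER.finditer(comment)]
    if not forms or CANONICAL in forms:
        # No trigger at all, or the canonical form is already present.
        return None
    dropped.update(forms)
    return CANONICAL
```

Calling `normalize_trigger("@MBM review please", Counter())` would return the lowercase trigger, while a comment that already contains `@mbm review` produces no follow-up.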
graph8-com/g8/.github/workflows/normalize-mbm-trigger.yml
@MBM review on a test PR produces an auto-followup @mbm review.
Retire pr_cleanup + rage_click_detector
Both have prompts but no TaskSpawner. Today: wire (add to triggers.yaml) or delete. No in-between. You decide which; agent executes the file changes.
tenants/graph8-eng/agents/{pr_cleanup,rage_click_detector}/
Sync token guard on sync-main-to-qa.yml
Today silently no-ops if SYNC_BOT_TOKEN is missing. Add a fail-loud check at the top.
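The fail-loud check could look roughly like the sketch below. It is in Python for illustration; in the workflow itself this would more likely be a short shell step. The helper name `require_env` is hypothetical, and the only detail taken from the plan is the `SYNC_BOT_TOKEN` variable name.

```python
import os


def require_env(name, env=None):
    """Return the value of `name`, or abort with a non-zero exit if it is missing.

    `env` defaults to the process environment; passing a dict makes the check testable.
    """
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        # SystemExit with a string prints the message to stderr and exits with status 1,
        # so the job fails visibly instead of silently doing nothing.
        raise SystemExit(f"{name} is not set; refusing to run the sync")
    return value

# Intended use as the first step of the job (hypothetical):
#   token = require_env("SYNC_BOT_TOKEN")
```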
graph8-com/g8/.github/workflows/sync-main-to-qa.yml
Document attempt-4 behavior in pr_fixer
Hard-capped at 3 today; attempt 4 is undefined. Append escalation rule: label mbm/needs-input, tag the author, stop.
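The escalation rule is prose in a prompt, but its logic can be sketched as a small decision function. This is an illustration in Python; `Action` and `next_action` are hypothetical names, and only the cap of 3, the `mbm/needs-input` label, and "tag the author, stop" come from the rule above.

```python
from dataclasses import dataclass, field

MAX_ATTEMPTS = 3  # the hard cap stated in the plan


@dataclass
class Action:
    kind: str                                   # "fix" or "escalate"
    labels: list = field(default_factory=list)  # labels to add to the PR
    mention: str = ""                           # who to tag, if anyone


def next_action(attempt, author):
    """Decide what pr_fixer does on a given 1-based attempt number."""
    if attempt <= MAX_ATTEMPTS:
        return Action(kind="fix")
    # Attempt 4 and beyond: label, tag the author, and stop retrying.
    return Action(kind="escalate", labels=["mbm/needs-input"], mention=f"@{author}")
```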
tenants/graph8-eng/agents/pr_fixer/prompt.md
Ledger schema + ingest API. Single biggest unlock of the week.
By EOD: a Postgres table exists, an authenticated POST /v1/trace endpoint accepts rows, and the MBM service is the single ingest point. Every loop after this depends on it.
services/mbm. Both target the same package — review for conflicts before merging. The third agent (task 9, scaffolding axon prompts) is already done in this PR — your job there is read + refine. EOD: the table exists, the endpoint accepts test rows.
Migration: skill_invocations table
Single table is the foundation of everything — utilization dashboard, agent economics, mortality loop. Token columns NULL for Max-plan local runs; populated for K8s OAuth-pool runs.
services/mbm/migrations/20260518_skill_invocations.sql (template scaffolded in this repo)
psql -c "\d skill_invocations" shows the table with 3 indexes.
POST /v1/trace endpoint
Single ingest endpoint, API-token authenticated. Idempotent on (engineer_id, ts, skill_name). Returns 202 with the inserted id.
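The idempotency rule can be illustrated with a uniqueness constraint on the three key columns. The sketch below uses Python with an in-memory SQLite database purely so it runs anywhere; the actual service is Go on Postgres, the table here has only the key columns, and returning the existing row's id on a replay is an assumption, since the plan only says the endpoint returns 202 with the inserted id.

```python
import sqlite3


def open_ledger():
    """Create an in-memory stand-in for the ledger with only the key columns."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE skill_invocations (
               id INTEGER PRIMARY KEY,
               engineer_id TEXT NOT NULL,
               ts TEXT NOT NULL,
               skill_name TEXT NOT NULL,
               UNIQUE (engineer_id, ts, skill_name)
           )"""
    )
    return db


def ingest(db, engineer_id, ts, skill_name):
    """Insert one trace row; replays of the same key are ignored.

    Returns (status, id). Assumption: a replay reports the id of the existing row.
    """
    db.execute(
        "INSERT OR IGNORE INTO skill_invocations (engineer_id, ts, skill_name) "
        "VALUES (?, ?, ?)",
        (engineer_id, ts, skill_name),
    )
    row = db.execute(
        "SELECT id FROM skill_invocations "
        "WHERE engineer_id = ? AND ts = ? AND skill_name = ?",
        (engineer_id, ts, skill_name),
    ).fetchone()
    return 202, row[0]
```

On Postgres the same idea would typically be expressed with a unique index plus `INSERT ... ON CONFLICT DO NOTHING`.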
services/mbm/internal/trace/handler.go
curl -X POST $MBM_URL/v1/trace -d '...' returns 202 and the row appears.
Scaffold the new axon agent prompts
Agent prompts can land before the agents are dispatched — they're just markdown. Land agent_health, mbm_critic, and knowledge_compactor today so they're ready to wire on day 4+. (Already scaffolded in this PR — review and refine.)
tenants/graph8-eng/agents/{agent_health,mbm_critic,knowledge_compactor}/prompt.md
MCP wrapper + Claude hook + first two dashboards.
By EOD: every local Claude Code session POSTs to the ledger automatically. Two Grafana dashboards (engineer utilization + agent economics) are live. The first numbers populate by Thursday morning.
Trace MCP wrapper
TypeScript MCP server wraps tool calls and POSTs trace rows. Settings.json SessionStart hook covers slash-command fires too.
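The wrapping pattern itself is simple: intercept the call, run it, and emit one row whether it succeeds or fails. The sketch below shows that shape in Python for illustration; the real server is TypeScript, and `traced`, the `post` callback, and the row fields other than `engineer_id`, `ts`, and `skill_name` are hypothetical.

```python
import functools
import time


def traced(skill_name, engineer_id, post):
    """Wrap a tool call so each invocation emits exactly one trace row via post(row)."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            started = time.time()
            ok = False
            try:
                result = fn(*args, **kwargs)
                ok = True
                return result
            finally:
                # Emit the row even when the wrapped call raises.
                post({
                    "engineer_id": engineer_id,
                    "skill_name": skill_name,
                    "ts": started,
                    "ok": ok,
                })
        return wrapper
    return decorate
```

In a real deployment `post` would send the row to the ingest endpoint; here it can be any callable, such as `list.append`, which keeps the wrapper easy to test.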
services/lifecycle-trace-mcp/src/server.ts · Edit: .claude/mcp.json in both infra + g8
SELECT count(*) FROM skill_invocations WHERE engineer_id='you' AND ts > now() - interval '5 min' returns > 0 after running any local skill.
Install MCP locally · commit to both .claude/mcp.json
You install on your own machine. Commit the MCP entry to .claude/mcp.json in both infra and g8 so any future engineer (or remote agent run that loads from .claude config) inherits it. One Slack post in case anyone is paying attention.
.claude/mcp.json in both infra + g8 · Local: your ~/.claude/mcp.json
/start.
Engineer-utilization Grafana dashboard
First user-visible artifact. SQL query + one Grafana panel. Lives at grafana.graph8.com/d/lifecycle-engineer.
k8s/monitoring/dashboards/lifecycle-engineer.json
K8s-agent economics Grafana dashboard
The other half — for K8s OAuth-pool runs only. Real $/merged-PR per agent.
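The headline metric is a plain ratio, sketched here in Python with hypothetical function names. The $30 threshold comes from the dashboard's conditional formatting; treating an agent with zero merged PRs as "no value" rather than infinite cost is an assumption.

```python
def cost_per_merged_pr(total_cost_usd, merged_prs):
    """Dollars spent per merged PR, or None when nothing has merged yet."""
    if merged_prs == 0:
        return None
    return total_cost_usd / merged_prs


def is_red(value, threshold_usd=30.0):
    """True when the metric exceeds the red threshold used on the dashboard."""
    return value is not None and value > threshold_usd
```

For example, an agent that spent $90 across 2 merged PRs sits at $45 per merged PR and would render red.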
k8s/monitoring/dashboards/lifecycle-agent-economics.json
g8_5xx_fixer's row visible with $/merged-PR. Red >$30 conditional formatting.
test-writer wired · agent_health daily · first agent-reads-agent.
The ledger has 24 hours of data. Wire two agents that close loops today. test-writer already exists at g8/.claude/agents/test-writer.md — just needs PR-trigger plumbing. agent_health is the first agent that reads ledger output.
agent_health's prompt is already in this PR — task 15 is wiring the cron, not writing the prompt. By EOD: the first agent that reads the ledger is alive.
Wire test-writer agent to PR open
Closing the test-coverage loop in one morning by wiring an existing agent. New workflow scans diffs for new public functions without tests; dispatches Claude Code k8s job to write them.
graph8-com/g8/.github/workflows/test-writer-on-pr.yml · Reads: existing g8/.claude/agents/test-writer.md
def public_foo(): with no test produces a follow-up test PR within 15 min.
agent_health cron live
Daily snapshot of each axon agent's PR survival rate. First agent that reads the ledger and writes back to it. Foundation for "agent A → agent B" pattern.
tenants/graph8-eng/crons.yaml (entry added in this PR) · prompt at tenants/graph8-eng/agents/agent_health/prompt.md (already scaffolded)
agent_health_stats table; agent-economics dashboard now updates from it instead of from raw queries.
Migration: agent_health_stats table
Storage for the daily agent-health snapshots. Schema lives in this repo template; engineer applies in mbm-pg.
services/mbm/migrations/20260520_agent_health_stats.sql (template scaffolded in this PR)
agent_health cron writes its first row.
Three medium tasks, one daring CI flip.
Get the PRD scoring out of the prompt and into Go (with tests). Open the first auto-promoted qa→main PR. Flip bandit to blocking. Four agents in parallel, you review as they land. Bandit needs your judgement on findings — block ~1 hr for that.
nosec-comment — block 1–2 hours in the afternoon to walk the findings list. The other three you mostly review. By EOD: first blocking CI gate, qa→main auto-promotes, scoring is in code.
Engineer-scoring service
Move /assign-prds algorithm out of the prompt into code. 8 test cases. Skill calls POST /v1/grade/engineer instead of re-deriving.
services/mbm/internal/grader/engineer_scoring{,_test}.go · Edit: infra/.claude/commands/assign-prds.md
/assign-prds on the same input produce identical output.
Auto-promote qa → main on soak SLO
Hourly workflow opens the qa→main PR automatically once QA has been green ≥48h. Closes the biggest cycle-time dark zone.
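The soak condition can be sketched as a pure function over the history of QA check runs. This is an illustration in Python; it assumes "green since" means the time of the first passing run after the most recent failure, and the names `ready_to_promote` and `SOAK` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

SOAK = timedelta(hours=48)  # the soak window stated in the plan


def ready_to_promote(runs, now):
    """Decide whether QA has been continuously green for at least SOAK.

    `runs` is a list of (finished_at, passed) tuples in any order.
    """
    green_since = None
    for finished_at, passed in sorted(runs):
        if not passed:
            green_since = None            # any failure resets the clock
        elif green_since is None:
            green_since = finished_at     # first pass after a failure
    return green_since is not None and now - green_since >= SOAK
```

The hourly workflow would call something like this and open the qa→main PR only when it returns true.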
graph8-com/g8/.github/workflows/auto-promote-qa-to-main.yml
Remove continue-on-error from bandit
First CI ratchet. Smallest blast radius. Agent runs bandit on main, files candidate fixes; you walk the list and accept/nosec each. Then agent flips the flag and opens the PR.
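For a sense of what the findings walk looks like, here is the kind of pattern Bandit reports once the job is blocking, next to a typical fix. The functions are hypothetical examples, not code from the repo; the child process is the current Python interpreter so the sketch stays portable.

```python
import subprocess
import sys

ECHO_SCRIPT = "import sys; print(sys.argv[1])"


def echo_unsafe(arg):
    # Bandit flags this call (check B602): with shell=True the whole string is
    # parsed by a shell, so metacharacters in `arg` can run extra commands.
    cmd = f'"{sys.executable}" -c "{ECHO_SCRIPT}" {arg}'
    return subprocess.run(cmd, shell=True, capture_output=True, text=True)


def echo_safe(arg):
    # Passing an argument list avoids the shell entirely; `arg` stays one argument.
    return subprocess.run(
        [sys.executable, "-c", ECHO_SCRIPT, arg],
        capture_output=True,
        text=True,
    )
```

Each finding is then either fixed along these lines or, where the call is judged acceptable, annotated with a `# nosec` comment.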
graph8-com/g8/.github/workflows/code-quality.yml
shell=True subprocess fails CI on bandit.
Skill consolidation · round 1 · /commit into /ship
First merge of the four. /commit becomes a sub-step of /ship. Removes one skill from the catalog.
g8/.claude/commands/ship.md · Delete: g8/.claude/commands/commit.md
mbm_critic + skill-mortality + Cloudflare Worker scaffold.
Two more loops close. The first agent that reads MBM's own decisions ships today. Skill-mortality auto-issues open for low-fire skills. Cloudflare Worker scaffolded; cutover lands Sunday.
mbm_critic agent live
Daily cron classifies PRs where MBM's CHANGES_REQUESTED was dismissed. Proposes rubric updates to reviewer/prompt.md. First agent that reads MBM's own output.
tenants/graph8-eng/agents/mbm_critic/prompt.md (this PR) · Add cron to: tenants/graph8-eng/crons.yaml (entry added)
Skill-mortality job
Weekly SQL + auto-issue. Skills with <5 fires/month in the ledger → deprecation issue with the consolidation suggestion from the merge table.
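The selection step might look like the sketch below, shown in Python although the job itself is Go. It assumes "month" means a rolling 30-day window and that the skill catalog is supplied separately, since a skill with zero fires never appears in the ledger at all; `low_fire_skills` is a hypothetical name.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def low_fire_skills(catalog, invocations, now, min_fires=5, window=timedelta(days=30)):
    """Return catalog skills with fewer than `min_fires` invocations inside the window.

    `invocations` is an iterable of (skill_name, ts) pairs read from the ledger.
    """
    cutoff = now - window
    fires = Counter(name for name, ts in invocations if ts > cutoff)
    # Counter returns 0 for skills that never fired, so they are included too.
    return sorted(skill for skill in catalog if fires[skill] < min_fires)
```

Each returned skill would then get a deprecation issue carrying its consolidation suggestion.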
services/mbm/internal/jobs/skill_mortality.go
Cloudflare Worker for tenant webhooks
Replaces 2h polling with sub-60s dispatch. Validates GitHub HMAC, translates events into Task CR creation.
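GitHub signs each webhook delivery with HMAC-SHA256 over the raw request body and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex digest>`. A minimal validation sketch follows, in Python for illustration (the Worker itself is TypeScript); the function name is hypothetical.

```python
import hashlib
import hmac


def verify_github_signature(secret, body, header):
    """Check an X-Hub-Signature-256 header value against the raw request body.

    `secret` and `body` are bytes; `header` is the header string, or None if absent.
    """
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature matched.
    return hmac.compare_digest(expected.encode(), (header or "").encode())
```

The check must run against the exact bytes received, before any JSON parsing or re-serialization, or valid deliveries will fail verification.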
workers/tenant-webhook/src/index.ts · workers/tenant-webhook/wrangler.toml
Skill consolidation · round 2 · /assign-prds + /capacity-check → /capacity
Second merge. Both read the same data; splitting forces two queries for one decision.
infra/.claude/commands/capacity.md · Delete: the two source skills
5xx-error label goes webhook · acceptance review · demo for the board.
Cut over the first label to the Worker; prove the latency win (~1h → <60s). Run the 8-point acceptance check. Record a 10-minute demo of the dashboards + a journey-walk for stakeholders.
Cut over 5xx-error label to webhook
First proof point. Flip the trigger from polling to webhook; observe the p50 dispatch latency drop from ~1h to sub-60s.
tenants/graph8-eng/triggers.yaml: change g8-issues-5xx-error entry from polling to webhook
5xx-error label produces a Task CR within 60 seconds.
Run the day-7 acceptance check (§ below)
8 metrics. 6 green = shipped. Walk the list, run each verification command, mark green/red.
Record the 10-minute demo
Loom or screen recording. Walk through: a journey on lifecycle.html → click a stat card → drill into a skill → open the engineer-utilization dashboard with real data → show the webhook cutover latency improvement → show a closed loop (test-writer or agent_health digest).
Did we ship it? 6 green = yes.
Run this checklist Sunday afternoon. If 6+ are green, declare success and start the week-2 horizon items (knowledge_compactor, contracts, bug_predictor, onboarding). If fewer, the next 7 days close the gap before adding new scope.
git push origin main from any non-MBM machine fails.
SELECT count(*) FROM skill_invocations WHERE ts > now() - interval '1 day'
engineer_id appears in the Grafana panel with a grade.
Local /start fires from your laptop land in the ledger within 5 min.
continue-on-error removed from bandit job in code-quality.yml.
Roll remaining labels to webhooks · ratchet ruff to blocking · ship knowledge_compactor · scaffold contract schemas for cross-repo (top 5) · ship contract_test_runner · ship bug_predictor (comment-only mode) · ship onboarding agent for first new engineer. Skill consolidation rounds 3+4. By day 14: 6/8 loops live, skill count = 14.