Add rebuild-skill

2026-06-02 13:25:50 -07:00
parent 4bd469fd17
commit 240304b302
2 changed files with 268 additions and 0 deletions
@@ -0,0 +1,74 @@
+---
+name: rebuild-skill
+description: Overhaul an existing skill by becoming an expert in what it does, then rewriting it from verified ground truth. Use whenever the user wants to rebuild, improve, fix, refresh, or modernize a specific skill — e.g. "/rebuild-skill billing-model", "this skill is stale, fix it", "make the deploy skill correct again", "the X skill's instructions are wrong". Kicks off a dynamic multi-agent workflow that researches the skill's domain before touching its prose, so improvements rest on what's actually true rather than a copyedit of what was already there.
+---
+
+# Rebuild a skill
+
+You are handed the name of an existing skill and asked to make it genuinely better. The argument is the skill name (e.g. `billing-model` from `/rebuild-skill billing-model`). If no name was given, ask which skill to rebuild before going further.
+
+The trap to avoid: treating this as a writing task. A skill encodes how to do something well, so you can only improve it as far as you actually understand the thing it does. A rewrite that polishes the prose but never checks the claims against reality usually makes the skill *worse* — more confident and more wrong. So the order here is deliberate: gain real expertise and gather ground truth first, rewrite from that position second, and verify every claim against its source third.
+
+The bulk of that work fans out across many agents, so this skill runs as a **dynamic Workflow** — "dynamic" because the prerequisite research is shaped by what the skill actually does, which you discover by reading it. You scout inline to find the skill and figure out *what expertise the rebuild needs*, then hand that work-list to a Workflow. Invoking the Workflow tool here is expected — these instructions are your opt-in.
+
+## Step 0 — Locate the skill
+
+Find the skill directory before anything else. Skills live under `.agents/skills`:
+
+```bash
+NAME="<the-name-argument>"
+find .agents/skills -maxdepth 2 -type d -name "$NAME" 2>/dev/null
+```
+
+If exactly one match, that's your target. If none, stop and ask the user — rebuilding the wrong copy wastes the whole run. Note the directory and the `name:` in its frontmatter; both are preserved unchanged through the rebuild.
+
+## Step 1 — Understand it (inline)
+
+Read the target `SKILL.md` end to end, plus every bundled resource (`scripts/`, `references/`, `assets/`). Build a clear picture of:
+
+- **Purpose** — what capability is this supposed to give Claude?
+- **Triggering** — when is it meant to fire? Is the description tuned for that, or does it over/under-trigger?
+- **Claims** — what factual or procedural assertions does it make? (API behaviors, commands, file layouts, domain rules, step sequences.) These are what you'll verify.
+- **Domain** — what does Claude need to be expert in to do this task well? This is the seed for the prerequisite work.
+
+## Step 2 — Scope the prerequisite work (inline)
+
+Decide *what gaining expertise looks like for this particular skill*, and break it into independent facets — one investigable chunk each. The right facets depend entirely on the skill's type:
+
+- A skill that **documents a codebase domain** (like `billing-model`) → trace the actual code paths it describes; confirm each documented rule still matches the implementation.
+- A skill that **runs a procedure** (deploy, release, migration) → walk the real procedure and its tools; find steps that are stale, reordered, or missing.
+- A skill that **relies on an external API or library** → fetch the authoritative docs and verify every behavior the skill claims. The user expects third-party behavior backed by docs, never asserted from memory.
+- A skill that **produces an artifact** (a doc, chart, config) → actually produce one with its current instructions and note where it breaks or falls short.
+
+Most skills are a blend. Aim for 3–6 facets that together cover the skill's claims and its domain. Each facet should be answerable by one agent reading real sources (code, docs, tools) — not by guessing.
+
+## Step 3 — Run the dynamic workflow
+
+Read [references/workflow-template.js](references/workflow-template.js), adapt the `facets` and prompts to the skill you scoped, and run it with the Workflow tool. Pass the skill path, name, and facets via the `args` field so the script stays generic.
+
+The workflow runs these phases:
+
+1. **Investigate** (fan-out, one agent per facet) — each agent reads real sources to gather ground truth: what is actually true, what the skill currently claims, and exactly where the two diverge. Every finding cites a file path or URL.
+2. **Synthesize** — one agent folds the findings and the current skill into a concrete rebuild plan (what to fix, add, cut, restructure), grounded in the verified facts and in skill-writing principles.
+3. **Draft** — one agent rewrites `SKILL.md` and any resources in place, preserving the directory and the `name`. It returns the list of claims a reviewer should check, each with its supporting evidence.
+4. **Verify** (fan-out, one agent per claim) — each agent adversarially tries to *refute* its claim by going back to the cited source. Anything unsupported, outdated, or contradicted is flagged.
+5. **Correct** (only if anything was refuted) — one agent fixes the flagged claims in the files: correct them to match ground truth, or cut them.
+
+The workflow returns the files written and the verification verdicts. This is "ensuring correctness" made concrete: nothing survives in the rebuilt skill that an independent skeptic couldn't confirm against its source.
+
+## Step 4 — Review with the user
+
+Snapshot the original before the workflow writes anything (`cp -r <skill-dir> /tmp/<name>-before`) so you can show a clean diff. After the workflow returns:
+
+- Show `git diff` (or a diff against the snapshot) of the skill files.
+- Summarize what changed and the ground truth behind each substantive change — especially anything the verify phase had to correct.
+- Surface any claim the workflow *couldn't* confirm rather than quietly keeping it.
+
+If the user wants quantitative confidence that the rebuild triggers and performs better, offer to hand off to the `skill-creator` skill, which runs the eval/benchmark loop. This skill's job is the correctness-grounded rewrite; `skill-creator` is the measurement loop.
+
+## Principles to preserve
+
+- **Expertise before prose.** Never edit a claim you haven't grounded in a real source. The investigate phase exists so the rewrite comes from knowledge, not from rephrasing the old text.
+- **Cite or cut.** Every technical claim in the rebuilt skill traces to a file path or URL. If verification can't confirm it, fix it or remove it — don't ship confident guesses.
+- **Preserve identity.** Keep the directory name and the frontmatter `name` so the skill stays the same skill. Improve the description for triggering; don't rename the thing.
+- **Apply skill-writing craft.** Progressive disclosure (lean `SKILL.md`, detail in `references/`), explain the *why* rather than piling on MUSTs, and a description that states both what it does and when to fire.
@@ -0,0 +1,194 @@
+// Dynamic workflow template for the rebuild-skill skill.
+//
+// Adapt `facets` to the skill being rebuilt (see SKILL.md Step 2), then run with the Workflow
+// tool. Keep the script generic and pass per-run values through `args`:
+//
+//   Workflow({
+//     scriptPath: "<this file, or a copy you edited>",
+//     args: {
+//       skillPath: "/abs/path/to/skills/<name>",
+//       skillName: "<name>",
+//       facets: [
+//         { key: "code-paths",  prompt: "Trace the billing event-replay code this skill describes ..." },
+//         { key: "api-docs",    prompt: "Fetch the Lightning/LNbits docs and verify the payment-cascade claims ..." },
+//         // 3–6 facets that together cover the skill's claims and its domain
+//       ],
+//     },
+//   })
+//
+// Phase shape: Investigate (fan-out) -> Synthesize -> Draft -> Verify (fan-out) -> Correct (conditional).
+// The point of the structure: gather ground truth before rewriting, then let independent skeptics
+// confirm every claim against its source. Nothing unverifiable survives.
+
+export const meta = {
+  name: 'rebuild-skill',
+  description: 'Investigate a skill\'s domain from real sources, rewrite it from verified ground truth, and adversarially verify every claim',
+  phases: [
+    { title: 'Investigate', detail: 'one agent per facet gathers ground truth from code/docs/tools' },
+    { title: 'Synthesize', detail: 'fold findings + current skill into a grounded rebuild plan' },
+    { title: 'Draft', detail: 'rewrite SKILL.md and resources in place' },
+    { title: 'Verify', detail: 'one skeptic per claim tries to refute it against its source' },
+    { title: 'Correct', detail: 'fix or cut any refuted claim (only if needed)' },
+  ],
+}
+
+const { skillPath, skillName, facets } = args
+
+const FINDINGS = {
+  type: 'object',
+  additionalProperties: false,
+  required: ['truths', 'divergences'],
+  properties: {
+    truths: {
+      type: 'array',
+      description: 'Ground-truth facts established by reading real sources.',
+      items: {
+        type: 'object',
+        additionalProperties: false,
+        required: ['fact', 'source'],
+        properties: {
+          fact: { type: 'string' },
+          source: { type: 'string', description: 'file path with line, or URL' },
+        },
+      },
+    },
+    divergences: {
+      type: 'array',
+      description: 'Places where the skill\'s current text is stale, vague, wrong, or missing.',
+      items: {
+        type: 'object',
+        additionalProperties: false,
+        required: ['kind', 'skillSays', 'realitySays', 'source'],
+        properties: {
+          kind: { type: 'string', enum: ['stale', 'vague', 'wrong', 'missing'] },
+          skillSays: { type: 'string' },
+          realitySays: { type: 'string' },
+          source: { type: 'string' },
+        },
+      },
+    },
+  },
+}
+
+const PLAN = {
+  type: 'object',
+  additionalProperties: false,
+  required: ['changes', 'descriptionAdvice'],
+  properties: {
+    changes: {
+      type: 'array',
+      items: {
+        type: 'object',
+        additionalProperties: false,
+        required: ['action', 'what', 'groundedOn'],
+        properties: {
+          action: { type: 'string', enum: ['fix', 'add', 'cut', 'restructure'] },
+          what: { type: 'string' },
+          groundedOn: { type: 'string', description: 'the verified fact + source this change rests on' },
+        },
+      },
+    },
+    descriptionAdvice: { type: 'string', description: 'how to tune the frontmatter description for correct triggering' },
+  },
+}
+
+const DRAFT = {
+  type: 'object',
+  additionalProperties: false,
+  required: ['files', 'claims'],
+  properties: {
+    files: { type: 'array', items: { type: 'string' }, description: 'absolute paths written' },
+    claims: {
+      type: 'array',
+      description: 'Every technical claim in the rebuilt skill a reviewer should check.',
+      items: {
+        type: 'object',
+        additionalProperties: false,
+        required: ['id', 'claim', 'evidence'],
+        properties: {
+          id: { type: 'string' },
+          claim: { type: 'string' },
+          evidence: { type: 'string', description: 'file path with line, or URL, supporting the claim' },
+        },
+      },
+    },
+  },
+}
+
+const VERDICT = {
+  type: 'object',
+  additionalProperties: false,
+  required: ['refuted', 'reason'],
+  properties: {
+    refuted: { type: 'boolean', description: 'true if the claim is unsupported, outdated, or contradicted by its source' },
+    reason: { type: 'string' },
+    shouldSayInstead: { type: 'string', description: 'the correct statement, if refuted' },
+  },
+}
+
+phase('Investigate')
+// Fan out: each agent gains the expertise for one facet by reading real sources, never guessing.
+const findings = (await parallel(
+  facets.map(f => () =>
+    agent(
+      `You are gaining the expertise needed to judge and rewrite the "${skillName}" skill at ${skillPath}.\n` +
+        `Facet: ${f.prompt}\n\n` +
+        `Read the ACTUAL code, docs, and tools — do not speculate or rely on memory for third-party behavior. ` +
+        `Gather ground truth: what is genuinely true, what the skill currently claims, and every place the two diverge ` +
+        `(stale, vague, wrong, or missing). Cite a file path (with line) or URL for every fact and every divergence.`,
+      { label: `investigate:${f.key}`, phase: 'Investigate', schema: FINDINGS },
+    ).then(r => ({ facet: f.key, ...r })),
+  ),
+)).filter(Boolean)
+
+log(`Investigated ${findings.length}/${facets.length} facets; ${findings.reduce((n, f) => n + f.divergences.length, 0)} divergences found.`)
+
+phase('Synthesize')
+// One agent needs ALL findings + the current skill together to plan a coherent rebuild — a real barrier.
+const plan = await agent(
+  `Read the current skill at ${skillPath} and these investigation findings:\n${JSON.stringify(findings)}\n\n` +
+    `Produce a concrete plan to rebuild the skill: what to fix, add, cut, and restructure, with the verified fact ` +
+    `(and its source) each change rests on. Apply skill-writing principles: progressive disclosure (lean SKILL.md, ` +
+    `detail in references/), explain the WHY instead of stacking MUSTs, and a description tuned for correct triggering. ` +
+    `Preserve the directory and the frontmatter \`name\`.`,
+  { phase: 'Synthesize', schema: PLAN },
+)
+
+phase('Draft')
+// Single writer, so no worktree isolation needed. It enumerates the claims it makes for verification.
+const draft = await agent(
+  `Rebuild the "${skillName}" skill at ${skillPath} per this plan:\n${JSON.stringify(plan)}\n\n` +
+    `Edit SKILL.md and any bundled resources in place; preserve the directory and the frontmatter \`name\`. ` +
+    `Every technical claim must trace to ground truth from the findings — cite or cut. ` +
+    `Return the files you wrote and the full set of factual claims a reviewer should check, each with its supporting evidence.`,
+  { phase: 'Draft', schema: DRAFT },
+)
+
+phase('Verify')
+// Fan out: each claim gets an independent skeptic prompted to refute it against its own source.
+const verdicts = (await parallel(
+  draft.claims.map(c => () =>
+    agent(
+      `Adversarially check this claim from the rebuilt "${skillName}" skill: "${c.claim}".\n` +
+        `Evidence cited: ${c.evidence}\n\n` +
+        `Go read the cited source yourself and TRY TO REFUTE the claim. Default to refuted=true if you cannot ` +
+        `independently confirm it from the source. If it is unsupported, outdated, or contradicted, say what the skill should say instead.`,
+      { label: `verify:${c.id}`, phase: 'Verify', schema: VERDICT },
+    ).then(v => ({ ...c, ...v })),
+  ),
+)).filter(Boolean)
+
+const broken = verdicts.filter(v => v.refuted)
+log(`Verified ${verdicts.length} claims; ${broken.length} refuted.`)
+
+if (broken.length) {
+  phase('Correct')
+  await agent(
+    `These claims in the rebuilt skill at ${skillPath} were refuted during verification:\n${JSON.stringify(broken)}\n\n` +
+      `Fix each one in the skill files: correct the statement to match ground truth (use \`shouldSayInstead\`), or cut it. ` +
+      `Do not introduce any new unverified claim.`,
+    { phase: 'Correct', schema: { type: 'object', additionalProperties: false, required: ['fixed'], properties: { fixed: { type: 'array', items: { type: 'string' } } } } },
+  )
+}
+
+return { skillPath, filesWritten: draft.files, claimsChecked: verdicts.length, refuted: broken }