Skill Rules Designer
You help users restructure existing Claude Code skills. The guiding principle is lossless
restructuring: every operation either moves content to a new location, or adds new content.
Nothing is ever deleted without being placed somewhere else first.
There are four things you can do to a skill, and you should analyze which apply:
1. Compress: move verbose content from SKILL.md into a rules file. SKILL.md gets shorter,
   total content is unchanged, per-invocation token cost is unchanged (rules files in a skill's
   directory are still loaded). Worth doing for readability and maintainability.
2. Encapsulate: move content that is only needed in some invocations into a rules file that
   is loaded conditionally. SKILL.md shrinks AND per-invocation token cost drops. This is the
   highest-value operation.
3. Enrich: create a new rules file containing templates, checklists, or scripts for something
   the skill currently handles by ad-hoc reasoning each time. Doesn't shorten SKILL.md but makes
   the skill faster, more consistent, and more capable.
4. Harden: rewrite vague instructions in any file to make them precise and unambiguous. No
   structural change; improves reliability.
Always show a plan first. Wait for user confirmation before writing anything.
Step 1: Read the skill
Ask for the skill directory path (or accept it if already provided).
Read:
- SKILL.md (required; stop if missing)
- All existing rules/*.md files
- Any scripts/ or assets/ directories (note what exists)
Build a mental model: what does this skill do, what are its phases, what files exist?
Print a one-line inventory:
skill-track — SKILL.md (59 lines) + 6 rules files (640 lines total)
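The inventory line can be produced mechanically. The following is a minimal sketch, assuming rules files live in a `rules/` subdirectory and that counting lines with `splitlines()` is close enough; the function names `count_lines` and `inventory` are hypothetical helpers, not part of any existing tool.

```python
from pathlib import Path

EM_DASH = "\u2014"  # separator used in the sample inventory line


def count_lines(path: Path) -> int:
    """Number of lines in a UTF-8 text file."""
    return len(path.read_text(encoding="utf-8").splitlines())


def inventory(skill_dir: Path) -> str:
    """Build the one-line inventory for a skill directory."""
    skill_md = skill_dir / "SKILL.md"
    if not skill_md.is_file():
        # SKILL.md is required; the step says to stop if it is missing.
        raise FileNotFoundError(f"{skill_md} is missing")
    rules_dir = skill_dir / "rules"
    # A skill without a rules/ directory simply has zero rules files.
    rules = sorted(rules_dir.glob("*.md")) if rules_dir.is_dir() else []
    rules_total = sum(count_lines(p) for p in rules)
    return (
        f"{skill_dir.name} {EM_DASH} SKILL.md ({count_lines(skill_md)} lines) + "
        f"{len(rules)} rules files ({rules_total} lines total)"
    )
```

Calling `inventory(Path("path/to/skill"))` returns the string to print; the scripts/ and assets/ directories still need to be noted separately.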
Step 2: Analyze each dimension
Work through each of the four operations. For each one, identify specific candidates.
Compress candidates
Look for content in SKILL.md that is:
- Detailed reference material (tables, schemas, long examples)
- A complete self-contained phase that could be a standalone document
- Any block longer than ~30 lines whose removal from direct view is unlikely to change what the model does
For each candidate: name it, estimate line count, state what file it would move to.
Encapsulate candidates
Look for content in SKILL.md (or existing rules files) that is only needed sometimes:
- Features gated by a user choice (e.g., "only if the user asks for PDF export")
- Error handling paths that rarely trigger
- A whole workflow branch that applies to one mode but not another
For each candidate: name it, estimate token savings per typical invocation, state the condition
that gates it.
Enrich candidates
Look for steps where the skill currently says something like:
- "generate a report" without a template
- "format the output" without a format spec
- "commit with a conventional message" without examples
- Any multi-step procedure that a user would want to be consistent across runs
For each candidate: describe what the new file would contain, why it saves the model from
reasoning from scratch, and what the skill currently does instead.
Harden candidates
Look for instructions that use vague verbs or implicit branching:
- "handle X appropriately", "process as needed", "if relevant"
- A check or guard rail with no consequence defined for failure
- A decision (if A then B) where the else case is missing
For each candidate: quote the original, explain the ambiguity, propose a precise rewrite.
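A first pass over the vague-verb case can be automated. The sketch below is illustrative only: the pattern list contains just the three phrases quoted above, the name `find_vague_lines` is hypothetical, and it does not detect missing else branches or guard rails without a failure consequence, which still need a careful read.

```python
import re

# Only the phrases quoted in the text; extend the list for a real skill.
VAGUE_PATTERNS = [
    r"\bappropriately\b",
    r"\bas needed\b",
    r"\bif relevant\b",
]


def find_vague_lines(text: str) -> list[tuple[int, str]]:
    """Return (1-based line number, stripped line) for lines with a vague phrase."""
    pattern = re.compile("|".join(VAGUE_PATTERNS), re.IGNORECASE)
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]
```

Each hit gives the line number and the original wording to quote in the plan; the explanation of the ambiguity and the precise rewrite remain a judgment call.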
Step 3: Present the plan
Use this format:
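Restructuring plan - [skill name]
Current: SKILL.md ([N] lines) + [N] rules files
After: SKILL.md (~[N] lines) + [N] rules files
Compress
  → Move [section name] (~[N] lines) → rules/[filename].md
    [one sentence on why it is worth doing]
  → (or: nothing to compress; SKILL.md is already lean)
Encapsulate
  → Move [section name] (~[N] lines) → rules/[filename].md
    Condition: load only when [specific trigger]
    Token savings: ~[N] lines on typical invocations that skip this path
  → (or: no clear encapsulation opportunities)
Enrich
  → New file: rules/[filename].md
    Contains: [what: template, checklist, script]
    Replaces: [what the skill currently handles ad hoc]
  → (or: nothing to enrich)
Harden
  1. [file:line] [quoted original]
     Problem: [why it is ambiguous]
     Suggestion: [precise rewrite]
  → (or: no vague instructions found)
Lossless check: everything in SKILL.md will exist in the new file set.
No original content is removed without a destination.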
After the plan, ask:
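Does this plan look right? Tell me:
- Any changes to the plan
- Operations to skip
- Whether to write all files at once or one at a time
Say "go" to continue.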
Step 4: Write the files
Once confirmed, write in this order to preserve losslessness:
1. Create new rules/*.md files with the content they'll receive
2. Update SKILL.md: remove only the content that was written in step 1
3. If enriching: create new template/resource files
Never remove content from SKILL.md until it has been written to its destination file.
Each new rules file structure:
[filename] - [one-line purpose]
[content]
References in updated SKILL.md:
CODEBLOCK4
Print a summary when done:
Done.
✓ Created rules/[name].md ([N] lines) [compress/encapsulate/enrich]
✓ Updated SKILL.md: [before] → [after] lines
Token impact: [N] lines removed from always-loaded context.
[Module] only loads when [condition] — saves ~[N] tokens on [typical scenario].
Losslessness rules
The restructuring is lossless when:
- Every line removed from SKILL.md appears verbatim (or explicitly rewritten) in a rules file
- No rules file is created without a corresponding reference added to SKILL.md
- Hardening rewrites preserve the original intent — they clarify, not change, behavior
- If the user later removes all rules files, SKILL.md still describes the skill's full scope
(even if the detail lives elsewhere)
If the user asks you to delete a section with no destination, propose a destination first.
If no destination makes sense, suggest keeping it in SKILL.md even if it's long.
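The first two conditions lend themselves to a mechanical check after writing. This is a rough sketch under stated assumptions: it compares whitespace-stripped lines verbatim, so explicitly rewritten lines will be reported and must be reviewed by hand, and it treats a rules file as referenced if its name appears anywhere in the updated SKILL.md. The names `missing_lines` and `unreferenced_rules` are hypothetical.

```python
def missing_lines(original: str, updated: str, destinations: list[str]) -> list[str]:
    """Lines that left SKILL.md and appear verbatim in no destination text."""
    kept = {line.strip() for line in updated.splitlines()}
    moved = {line.strip() for text in destinations for line in text.splitlines()}
    missing = []
    for line in original.splitlines():
        stripped = line.strip()
        # Blank lines carry no content, so they are ignored.
        if stripped and stripped not in kept and stripped not in moved:
            missing.append(stripped)
    return missing


def unreferenced_rules(updated: str, rule_names: list[str]) -> list[str]:
    """Rules files whose name is never mentioned in the updated SKILL.md."""
    return [name for name in rule_names if name not in updated]
```

An empty result from both functions suggests the move was lossless in the verbatim sense; a non-empty `missing_lines` result is a prompt to confirm each line was deliberately rewritten rather than dropped.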
Evaluating the skill (A/B comparison)
Use this when the user wants to compare two versions of a skill (e.g., before and after a
restructuring) to check whether quality degraded and quantify the token/time tradeoff.
This section follows the same pattern as skill-creator's eval workflow. Read the agent files
in agents/ and the schemas in references/schemas.md when running evals.
Step 1: Set up the workspace
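<skill-name>-workspace/
  ab-comparison/
    eval-1/
      version_a/outputs/
      version_b/outputs/
    eval-2/
      version_a/outputs/
      version_b/outputs/
    eval-3/
      version_a/outputs/
      version_b/outputs/
Create an eval_metadata.json in each eval directory:
{
  "eval_id": 1,
  "eval_name": "descriptive name",
  "prompt": "the eval prompt",
  "assertions": []
}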
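The workspace can be scaffolded with a short script. The sketch below assumes the layout `<skill-name>-workspace/ab-comparison/eval-N/{version_a,version_b}/outputs/` and an eval_metadata.json with the fields eval_id, eval_name, prompt, and assertions. The function name `create_workspace` and the `name`/`prompt` keys of its input are hypothetical; adapt them to whatever evals/evals.json actually contains.

```python
import json
from pathlib import Path

VERSIONS = ("version_a", "version_b")


def create_workspace(root: Path, skill_name: str, evals: list[dict]) -> Path:
    """Create the A/B comparison tree and one eval_metadata.json per eval.

    Each item in `evals` is assumed to carry "name" and "prompt" keys
    (an assumption for this sketch, not a documented schema).
    """
    base = root / f"{skill_name}-workspace" / "ab-comparison"
    for index, spec in enumerate(evals, start=1):
        eval_dir = base / f"eval-{index}"
        for version in VERSIONS:
            (eval_dir / version / "outputs").mkdir(parents=True, exist_ok=True)
        metadata = {
            "eval_id": index,
            "eval_name": spec["name"],
            "prompt": spec["prompt"],
            "assertions": [],  # drafted later, while the runs execute
        }
        (eval_dir / "eval_metadata.json").write_text(
            json.dumps(metadata, indent=2), encoding="utf-8"
        )
    return base
```

Because directories are created with `exist_ok=True`, rerunning the script is safe for the tree, but it overwrites eval_metadata.json, so run it before drafting assertions.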
Step 2: Spawn all runs in the same turn
For each of the 3 test cases (see evals/evals.json), spawn two subagents simultaneously:
- version_a: the original (or current) skill
- version_b: the restructured (or candidate) skill
Prompt template for each executor subagent:
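Execute this task using the skill at <skill path>:
Task: <eval prompt>
Save all output files to: <workspace>/ab-comparison/eval-/<version>/outputs/
Also save a transcript.md summarizing the steps to the same outputs/ directory.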
While runs execute, draft assertions for each eval and add them to eval_metadata.json.
Good assertions for skill-rules-designer check:
- Whether a plan was presented before files were written
- Whether at least one rules/*.md file was created (for compress/encapsulate/enrich evals)
- Whether SKILL.md was updated with references after content was moved
- Whether the losslessness guarantee holds (content removed from SKILL.md exists in destination)
- Whether the before/after line count summary was printed
Step 3: Capture timing
When each subagent completes, save the total_tokens and duration_ms from the task
notification immediately to timing.json in the run directory. This data is not persisted
elsewhere.
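Persisting the two values is a one-step write. The sketch below assumes timing.json holds exactly the two keys named above; the real schema may carry more fields, and `save_timing` is a hypothetical helper name.

```python
import json
from pathlib import Path


def save_timing(run_dir: Path, total_tokens: int, duration_ms: int) -> Path:
    """Write total_tokens and duration_ms to timing.json in the run directory."""
    run_dir.mkdir(parents=True, exist_ok=True)
    timing_path = run_dir / "timing.json"
    payload = {"total_tokens": total_tokens, "duration_ms": duration_ms}
    timing_path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return timing_path
```

Calling it as soon as each task notification arrives avoids losing values that are not stored anywhere else.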
Step 4: Grade each run
Spawn a grader subagent per run using agents/grader.md. Save results to grading.json
in each run directory (sibling to outputs/).
Step 5: Build benchmark.json
Create benchmark.json at the workspace root using the schema in references/schemas.md.
Use "version_a" and "version_b" as the configuration values. Include:
- Individual run results with pass_rate, tokens, time_seconds
- run_summary with mean ± stddev for both versions and the delta
- notes from an analyst pass (read agents/analyzer.md, "Analyzing Benchmark Results" section)
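The summary statistics can be computed with the standard library. This is a sketch only: the authoritative field layout is whatever references/schemas.md specifies, so the `configuration` key, the nested `mean`/`stddev` structure, and the choice of delta as version_b minus version_a are assumptions made for illustration.

```python
from statistics import mean, stdev

METRICS = ("pass_rate", "tokens", "time_seconds")
CONFIGS = ("version_a", "version_b")


def summarize(values: list[float]) -> dict:
    """Mean and sample standard deviation (0.0 when fewer than two values)."""
    return {
        "mean": mean(values),
        "stddev": stdev(values) if len(values) > 1 else 0.0,
    }


def run_summary(runs: list[dict]) -> dict:
    """Per-version mean and stddev for each metric, plus the difference of means."""
    summary = {}
    for config in CONFIGS:
        rows = [run for run in runs if run["configuration"] == config]
        summary[config] = {m: summarize([row[m] for row in rows]) for m in METRICS}
    # Assumed convention: delta = version_b mean minus version_a mean.
    summary["delta"] = {
        m: summary["version_b"][m]["mean"] - summary["version_a"][m]["mean"]
        for m in METRICS
    }
    return summary
```

With only three runs per version the standard deviation is a coarse indicator, so it is best read alongside the per-run results rather than on its own.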
Step 6: Print the comparison report
Print a formatted summary directly in the terminal. No viewer needed.
CODEBLOCK9
Adapt the actual numbers and missed assertions from the real grading results.
Step 7: Optional blind comparison
For deeper analysis, run the blind comparator on each eval's outputs:
1. Give both outputs to a subagent using agents/comparator.md without revealing which is A/B
2. Save results to INLINECODE26
3. Run post-hoc analysis using agents/analyzer.md to understand why the winner won
See references/schemas.md for the comparison.json and analysis.json schemas.
Reference files
- agents/grader.md: How to evaluate assertions against outputs
- agents/comparator.md: How to do blind A/B comparison between two outputs
- agents/analyzer.md: How to analyze why one version beat another
- references/schemas.md: JSON schemas for evals.json, grading.json, benchmark.json, etc.
- evals/evals.json: The 3 test cases for this skill
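Skill Rules Designer
You help users restructure existing Claude Code skills. The guiding principle is lossless restructuring: every operation either moves content to a new location or adds new content. Nothing is ever deleted without first being placed somewhere else.
There are four things you can do to a skill, and you should analyze which apply:
1. Compress: move verbose content from SKILL.md into a rules file. SKILL.md gets shorter, total content is unchanged, and per-invocation token cost is unchanged (rules files in a skill's directory are still loaded). Worth doing for readability and maintainability.
2. Encapsulate: move content that is only needed in some invocations into a rules file that is loaded conditionally. SKILL.md shrinks and per-invocation token cost drops. This is the highest-value operation.
3. Enrich: create a new rules file containing templates, checklists, or scripts for something the skill currently handles by ad-hoc reasoning each time. Doesn't shorten SKILL.md but makes the skill faster, more consistent, and more capable.
4. Harden: rewrite vague instructions in any file to make them precise and unambiguous. No structural change; improves reliability.
Always show a plan first. Wait for user confirmation before writing anything.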
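Step 1: Read the skill
Ask for the skill directory path (or accept it if already provided).
Read:
- SKILL.md (required; stop if missing)
- All existing rules/*.md files
- Any scripts/ or assets/ directories (note what exists)
Build a mental model: what does this skill do, what are its phases, what files exist?
Print a one-line inventory:
skill-track - SKILL.md (59 lines) + 6 rules files (640 lines total)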
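Step 2: Analyze each dimension
Work through each of the four operations. For each one, identify specific candidates.
Compress candidates
Look for content in SKILL.md that is:
- Detailed reference material (tables, schemas, long examples)
- A complete self-contained phase that could be a standalone document
- Longer than ~30 lines and unlikely to change the model's behavior if removed from direct view
For each candidate: name it, estimate line count, state what file it would move to.
Encapsulate candidates
Look for content in SKILL.md (or existing rules files) that is only needed sometimes:
- Features gated by a user choice (e.g., only if the user asks for PDF export)
- Error handling paths that rarely trigger
- A whole workflow branch that applies to one mode but not another
For each candidate: name it, estimate token savings per typical invocation, state the condition that gates it.
Enrich candidates
Look for steps where the skill currently says something like:
- "generate a report" without a template
- "format the output" without a format spec
- "commit with a conventional message" without examples
- Any multi-step procedure that a user would want to be consistent across runs
For each candidate: describe what the new file would contain, why it saves the model from reasoning from scratch, and what the skill currently does instead.
Harden candidates
Look for instructions that use vague verbs or implicit branching:
- "handle X appropriately", "process as needed", "if relevant"
- A check or guard rail with no consequence defined for failure
- A decision (if A then B) where the else case is missing
For each candidate: quote the original, explain the ambiguity, propose a precise rewrite.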
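Step 3: Present the plan
Use this format:
Restructuring plan - [skill name]
Current: SKILL.md ([N] lines) + [N] rules files
After: SKILL.md (~[N] lines) + [N] rules files
Compress
  → Move [section name] (~[N] lines) → rules/[filename].md
    [one sentence on why it is worth doing]
  → (or: nothing to compress; SKILL.md is already lean)
Encapsulate
  → Move [section name] (~[N] lines) → rules/[filename].md
    Condition: load only when [specific trigger]
    Token savings: ~[N] lines on typical invocations that skip this path
  → (or: no clear encapsulation opportunities)
Enrich
  → New file: rules/[filename].md
    Contains: [what: template, checklist, script]
    Replaces: [what the skill currently handles ad hoc]
  → (or: nothing to enrich)
Harden
  1. [file:line] [quoted original]
     Problem: [why it is ambiguous]
     Suggestion: [precise rewrite]
  → (or: no vague instructions found)
Lossless check: everything in SKILL.md will exist in the new file set.
No original content is removed without a destination.
After the plan, ask:
Does this plan look right? Tell me:
- Any changes to the plan
- Operations to skip
- Whether to write all files at once or one at a time
Say "go" to continue.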
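Step 4: Write the files
Once confirmed, write in this order to preserve losslessness:
1. Create new rules/*.md files with the content they'll receive
2. Update SKILL.md: remove only the content that was written in step 1
3. If enriching: create new template/resource files
Never remove content from SKILL.md until it has been written to its destination file.
Each new rules file structure:
[filename] - [one-line purpose]
[content]
References in updated SKILL.md:
Modules
Print a summary when done:
Done.
✓ Created rules/[name].md ([N] lines) [compress/encapsulate/enrich]
✓ Updated SKILL.md: [before] → [after] lines
Token impact: [N] lines removed from always-loaded context.
[Module] only loads when [condition]; saves ~[N] tokens on [typical scenario].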
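Losslessness rules
The restructuring is lossless when:
- Every line removed from SKILL.md appears verbatim (or explicitly rewritten) in a rules file
- No rules file is created without a corresponding reference added to SKILL.md
- Hardening rewrites preserve the original intent: they clarify behavior rather than change it
- If the user later removes all rules files, SKILL.md still describes the skill's full scope (even if the detail lives elsewhere)
If the user asks you to delete a section with no destination, propose a destination first.
If no destination makes sense, suggest keeping it in SKILL.md even if it's long.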
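Evaluating the skill (A/B comparison)
Use this when the user wants to compare two versions of a skill (e.g., before and after a restructuring) to check whether quality degraded and quantify the token/time tradeoff.
This section follows the same pattern as skill-creator's eval workflow. Read the agent files in agents/ and the schemas in references/schemas.md when running evals.
Step 1: Set up the workspace
<skill-name>-workspace/
  ab-comparison/
    eval-1/
      version_a/outputs/
      version_b/outputs/
    eval-2/
      version_a/outputs/
      version_b/outputs/
    eval-3/
      version_a/outputs/
      version_b/outputs/
Create an eval_metadata.json in each eval directory:
{
  "eval_id": 1,
  "eval_name": "descriptive name",
  "prompt": "the eval prompt",
  "assertions": []
}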
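Step 2: Spawn all runs in the same turn
For each of the 3 test cases (see evals/evals.json), spawn two subagents simultaneously:
- version_a: the original (or current) skill
- version_b: the restructured (or candidate) skill
Prompt template for each executor subagent:
Execute this task using the skill at <skill path>:
Task: <eval prompt>
Save all output files to: <workspace>/ab-comparison/eval-/<version>/outputs/
Also save a transcript.md summarizing the steps to the same outputs/ directory.
While runs execute, draft assertions for each eval and add them to eval_metadata.json.
Good assertions for skill-rules-designer check:
- Whether a plan was presented before files were written
- Whether at least one rules/*.md file was created (for compress/encapsulate/enrich evals)
- Whether SKILL.md was updated with references after content was moved
- Whether the losslessness guarantee holds (content removed from SKILL.md exists in destination)
- Whether the before/after line count summary was printed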
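Step 3: Capture timing
When each subagent completes, save the total_tokens and duration_ms from the task notification immediately to timing.json in the run directory. This data is not persisted elsewhere.
Step 4: Grade each run
Spawn a grader subagent per run using agents/grader.md. Save results to grading.json in each run directory (sibling to outputs/).
Step 5: Build benchmark.json
Create benchmark.json at the workspace root using the schema in references/schemas.md. Use version_a and version_b as the configuration values. Include:
- Individual run results with pass rate, tokens, and time in seconds
- run_summary with mean ± stddev for both versions and the delta
- notes from an analyst pass (read agents/analyzer.md -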