TokenKiller (Universal Throttling)

Goal

Systematically reduce token consumption without noticeably lowering success rate, applicable to agents with multiple capabilities (search/coding/debugging/testing/docs).

Task Complexity Assessment

Before setting budgets, assess task complexity:

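| Complexity | Criteria | Tool Budget | Output Budget |
| --- | --- | --- | --- |
| Simple | Single-file modification, single-point localization, clear requirements | ≤3 calls | ≤50 lines |
| Medium | Involves 2-3 files, needs simple exploration, relatively clear requirements | ≤6 calls | ≤120 lines |
| Complex | Cross-module refactoring, multi-step debugging, unclear requirements | ≤10 calls | ≤200 lines |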

Extension Mechanism (Soft Warning): When budget is about to run out but task is incomplete:

1. Output a warning: [TokenKiller] Budget is about to run out, current progress X/Y, remaining work: ...
2. Continue execution, but switch to a more conservative strategy.
3. The user can interrupt or request more detailed output at any time.
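
As a rough illustration, the budgets and the soft warning could be tracked as below. This is a minimal sketch, not part of the Skill itself: `BudgetTracker`, `BUDGETS`, and the 80% warning threshold (`warn_ratio`) are hypothetical names and values chosen for the example, and the warning wording is illustrative.

```python
from dataclasses import dataclass
from typing import Optional

# (tool calls, output lines) per complexity level:
# Simple <=3 / <=50, Medium <=6 / <=120, Complex <=10 / <=200.
BUDGETS = {
    "simple": (3, 50),
    "medium": (6, 120),
    "complex": (10, 200),
}


@dataclass
class BudgetTracker:
    complexity: str
    warn_ratio: float = 0.8  # assumption: warn once 80% of the tool budget is used
    calls_used: int = 0

    @property
    def tool_budget(self) -> int:
        return BUDGETS[self.complexity][0]

    @property
    def output_budget(self) -> int:
        return BUDGETS[self.complexity][1]

    def record_call(self, remaining_work: str = "...") -> Optional[str]:
        """Count one tool call; return a soft warning when the budget is nearly spent."""
        self.calls_used += 1
        if self.calls_used >= self.warn_ratio * self.tool_budget:
            return (
                "[TokenKiller] Budget is about to run out, current progress "
                f"{self.calls_used}/{self.tool_budget}, remaining work: {remaining_work}"
            )
        return None
```

The tracker only warns; it never blocks a call, which matches the "soft" nature of the extension mechanism.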

Default Working Mode (Balanced)

Global Hard Rules (Must Follow)

- Goal First, Evidence Later: State the goal in one sentence (L0) first, then decide if evidence is needed (L2/L3).
- Three-Question Limit: When clarification is needed, ask at most 3 questions at a time; otherwise proceed with "default assumptions" and mark replaceable points.
- Progressive Disclosure: By default, only fetch "minimum necessary information"; never dump large files/full logs directly into context.
- Diff-First: Prioritize outputting patches/changes/command and result summaries; avoid reposting entire files.
- Deduped References: Information already seen should only be briefly referenced, not pasted again.
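
The Diff-First rule can be illustrated with Python's standard `difflib` module. The sketch below is one possible way to emit a compact patch instead of a whole file; the helper name `as_patch` and the choice of one context line are assumptions made for the example.

```python
import difflib


def as_patch(old: str, new: str, path: str) -> str:
    """Return a unified diff between two versions of a file, with 1 line of context."""
    diff = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
        n=1,  # assumption: one context line keeps the patch short
    )
    return "".join(diff)
```

An unchanged file yields an empty string, so nothing is output when there is nothing to report.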

Budget Gate (Budget + Gate)

At the start of each task, assess complexity and set corresponding budget (see above "Task Complexity Assessment"), then execute gates:

- Tool Call Budget: Set by complexity (Simple ≤3, Medium ≤6, Complex ≤10).
- Read Budget: Read a single file in full by default; for large files (>200 lines), read only the hit segments or read in sections.
- Output Budget: Set by complexity (Simple ≤50 lines, Medium ≤120 lines, Complex ≤200 lines).

If any gate is exceeded:

- First narrow scope (path/file/module) → Then switch search strategy → Finally expand reading and output.
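
The read budget could be applied with a small planning helper like the one below. This is a sketch under stated assumptions: `plan_read` and the 100-line `section_size` are hypothetical, and the caller is expected to read the returned sections one at a time and stop as soon as it has enough information.

```python
FULL_READ_LIMIT = 200  # lines; files above this size are not read in full


def plan_read(total_lines, hit_ranges=None, section_size=100):
    """Return 1-based inclusive (start, end) line ranges to read for one file."""
    if total_lines <= FULL_READ_LIMIT:
        return [(1, total_lines)]  # small file: read in full
    if hit_ranges:
        return list(hit_ranges)  # large file with search hits: read only those
    # Large file without hits: split into sections to be read on demand.
    return [
        (start, min(start + section_size - 1, total_lines))
        for start in range(1, total_lines + 1, section_size)
    ]
```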

Token Consumption Self-Check

High-Consumption Behaviors (Avoid)

- Reading >500 line files in full
- Outputting complete file contents (should output diff)
- Repeatedly pasting the same code/log
- Listing entire directory trees
- Outputting lengthy explanatory text

Self-Check Timing

After every 3 tool calls, quickly self-check:

- Am I currently at L0-L2 level?
- Is there duplicate information?
- Is output exceeding necessary length?

Information Layers (L0-L3)

- L0: One-sentence goal (required)
- L1: At most 3 hard constraints (required)
- L2: Evidence summary (file path + line number / key command output lines / key config items)
- L3: Full long content (only pull in specific scenarios, see below "L3 Pull Scenarios")

Default output and context stay at L0-L2.

L3 Pull Scenarios (Explicit)

Only pull L3 (full content) in these scenarios:

1. Code Modification: When exact indentation/format matching is needed, read the target function's complete code.
2. Config Debugging: When config items are interdependent, read the complete config block.
3. Error Analysis: When the error message is incomplete, read the complete stack trace or context.
4. User Explicit Request: The user asks to see the full content.

Decision Flow:
L2 Evidence → Attempt to proceed → Fail → Determine if L3 is needed → Pull minimum necessary range
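
The decision flow can be sketched as a small predicate. The names `L3Reason` and `should_pull_l3` are hypothetical, and letting an explicit user request bypass the "L2 attempt failed" check is an assumption of this example rather than something the flow states.

```python
from enum import Enum
from typing import Optional


class L3Reason(Enum):
    CODE_MODIFICATION = "exact indentation/format matching needed"
    CONFIG_DEBUGGING = "interdependent config items"
    ERROR_ANALYSIS = "incomplete error message"
    USER_REQUEST = "user asked for full content"


def should_pull_l3(l2_attempt_failed: bool, reason: Optional[L3Reason]) -> bool:
    """Decide whether full (L3) content should be pulled."""
    if reason is L3Reason.USER_REQUEST:
        return True  # assumption: an explicit user request needs no failed L2 attempt
    # Otherwise pull L3 only after an L2-based attempt failed AND a listed scenario applies.
    return l2_attempt_failed and reason is not None
```

Even when the predicate returns `True`, only the minimum necessary range should be pulled, for example one function rather than the whole file.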

Multi-Skill Collaboration

When this Skill is activated alongside other Skills:

Priority Rules

- Functional Skills First: Specific rules of functional skills like pdf, xlsx take precedence
- TokenKiller as Constraint Layer: During other skill execution, continuously apply budget and layer rules
- User Priority on Conflict: User's explicitly requested output format/content takes precedence over throttling rules

Collaboration Mode

[User Request] → [Functional Skill Processing] → [TokenKiller Constrained Output]

Workflow (General)

1) Task Entry (Any Domain)

1. Produce L0 + L1 (quickly infer if user didn't provide)
2. Choose strategy (search/direct modification/verify first)
3. Execute minimal action
4. Immediately verify (cheapest verification first)
5. Summarize: only key conclusion + 1 next step

2) Search/Exploration (Priority Domain)

Priority:

1. Filename/Path (Glob)
2. Exact String (Grep)
3. Semantic Search (SemanticSearch)
4. Read File (Read, by sections/line ranges)

Rules:

- Only read near hit points (±20 lines) or target function/component related paragraphs
- Don't read through the entire repository before localizing the target
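
Reading only near hit points can be sketched as follows: take each hit line, expand it by ±20 lines, and merge overlapping windows so that no line is read twice. The function name `windows_around_hits` is hypothetical.

```python
def windows_around_hits(hit_lines, total_lines, context=20):
    """Merge +/-context windows around 1-based hit lines into inclusive ranges."""
    ranges = []
    for hit in sorted(set(hit_lines)):
        start = max(1, hit - context)
        end = min(total_lines, hit + context)
        if ranges and start <= ranges[-1][1] + 1:
            # Overlaps or touches the previous window: extend it.
            ranges[-1] = (ranges[-1][0], max(ranges[-1][1], end))
        else:
            ranges.append((start, end))
    return ranges
```

For hits at lines 50, 60, and 300 in a 1000-line file, this gives two ranges, 30-80 and 280-320, instead of a full read.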

3) Coding/Refactoring

Rules:

- Minimal change surface first: if 1 file can be changed, don't change 5
- Avoid "rewrite everything"; prioritize reusing existing structure
- After modification, immediately run cheapest verification (tsc/build/lint)
- Only show key diffs (at most 1-3 code references)

4) Debugging/Troubleshooting

Rules:

- First list 3 highest probability hypotheses (sorted by information gain)
- Verify only 1 hypothesis at a time, and collect only the evidence it needs
- From logs, keep only: error line, stack top, related config, reproduction command (summarize the rest)
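
The log rule can be approximated with a small filter. This is a sketch, assuming error lines can be recognized by the keywords `error`, `exception`, or `fatal`, and that the stack top is the few lines directly after an error line; both are simplifications, and `trim_log` is a hypothetical helper.

```python
import re

# Assumption: these keywords are enough to spot error lines in the log at hand.
ERROR_PATTERN = re.compile(r"error|exception|fatal", re.IGNORECASE)


def trim_log(log: str, stack_lines: int = 3) -> str:
    """Keep error lines plus the next few lines; summarize everything else."""
    lines = log.splitlines()
    keep = set()
    for i, line in enumerate(lines):
        if ERROR_PATTERN.search(line):
            keep.update(range(i, min(len(lines), i + 1 + stack_lines)))
    kept = [lines[i] for i in sorted(keep)]
    omitted = len(lines) - len(kept)
    if omitted:
        kept.append(f"[{omitted} other log lines omitted]")
    return "\n".join(kept)
```

For runtimes that print the most recent frame last, such as Python tracebacks, the window would need to extend before the error line instead of after it.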

5) Testing/Verification

Priority (from cheap to expensive):

1. lint / typecheck
2. build
3. unit test
4. e2e / browser automation

When a check fails, append only the "diff information" (what changed since the previous run); don't repost the full output.
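
A minimal sketch of the cheapest-first order and of reporting only new failure output is shown below. Both helpers are hypothetical: each check is assumed to be a callable returning `(ok, output)`, and "diff information" is interpreted here as the lines that did not appear in the previous failure output.

```python
def run_cheapest_first(checks):
    """Run (name, fn) checks ordered cheap to expensive; stop at the first failure.

    Each fn returns (ok, output). Returns (failed_check_name, output),
    or (None, "") when every check passes.
    """
    for name, fn in checks:
        ok, output = fn()
        if not ok:
            return name, output
    return None, ""


def new_failure_lines(previous: str, current: str):
    """Return only the lines of `current` that were not in `previous`."""
    seen = set(previous.splitlines())
    return [line for line in current.splitlines() if line not in seen]
```

Stopping at the first failure means an expensive e2e run is never started while lint or build is still red.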

6) Docs/Summary

Rules:

- Default to "short summary + next steps"
- Don't restate user's original words; use structured point references
- When docs are needed, use progressive disclosure: outline/points first, then expand details

Output Template (Default)

Use the following structure, unless user explicitly requests other format:

- Conclusion: One sentence
- Evidence: 2-5 items (path/line number/key command output)
- Changes/Actions: What was done (at most 5 items)
- Next Step: 1 item (most valuable next step)
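
The template could be rendered by a helper such as the one below. This is only a sketch: `render_summary` is a hypothetical name, and silently truncating evidence and actions to 5 items is an assumption about how the item limits might be enforced.

```python
def render_summary(conclusion, evidence, actions, next_step):
    """Render the default output template, capping evidence and actions at 5 items."""
    lines = [
        f"- Conclusion: {conclusion}",
        "- Evidence: " + "; ".join(evidence[:5]),
        "- Changes/Actions: " + "; ".join(actions[:5]),
        f"- Next Step: {next_step}",
    ]
    return "\n".join(lines)
```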

Trigger Words (Recommended Auto-Enable)

Force enable this Skill when user mentions any of the following keywords/scenarios:

- "waste token / save token / cost / context too long / log too long / repo too large / multi-step / agent"
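
Trigger detection could be as simple as a case-insensitive substring check over the keywords above. The sketch below is an assumption about how a host might implement it; `should_enable` is a hypothetical name, and plain substring matching will also fire on unrelated words that contain a keyword (for example "costume" contains "cost").

```python
TRIGGERS = (
    "waste token",
    "save token",
    "cost",
    "context too long",
    "log too long",
    "repo too large",
    "multi-step",
    "agent",
)


def should_enable(message: str) -> bool:
    """Return True when the message mentions any trigger keyword."""
    text = message.lower()
    return any(trigger in text for trigger in TRIGGERS)
```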

