TokenSaver

A token cost optimization skill that aims to cut AI token usage by 50-80% while preserving response quality (aggressive mode accepts a slight quality trade-off).

When to Use

Use TokenSaver when:

- You have long conversations that consume many tokens
- You want to reduce AI API costs
- You're working with technical discussions that accumulate context
- You notice token usage growing rapidly in long sessions

Core Capabilities

1. Smart Context Compression

Automatically compresses conversation history based on message importance.

How it works:

- Recent messages (last 3-5) kept fully intact
- Older messages summarized based on importance score
- Code blocks and critical decisions never compressed

Savings: 50-70% reduction in context tokens
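The compression rules above can be sketched roughly as follows. This is a minimal illustration, not TokenSaver's actual implementation: the `Message` type, the `importance` score, the `is_decision` flag, the word budgets, and the truncating `summarize` placeholder are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str
    text: str
    importance: float = 0.5    # hypothetical score in [0, 1]
    is_decision: bool = False  # hypothetical marker for a critical decision


def summarize(text: str, max_words: int) -> str:
    """Placeholder summarizer: keeps only the first max_words words."""
    words = text.split()
    if len(words) <= max_words:
        return text
    return " ".join(words[:max_words]) + " ..."


def compress_history(messages, keep_recent=4):
    """Keep the last keep_recent messages intact and shorten older ones."""
    cutoff = max(len(messages) - keep_recent, 0)
    result = []
    for index, msg in enumerate(messages):
        # Code blocks and critical decisions are never compressed.
        protected = "```" in msg.text or msg.is_decision
        if index >= cutoff or protected:
            result.append(msg)
            continue
        # Assumed scaling: more important messages keep more words.
        budget = 5 + int(msg.importance * 25)
        result.append(
            Message(msg.role, summarize(msg.text, budget), msg.importance, msg.is_decision)
        )
    return result
```

A real summarizer would likely call a model rather than truncate; truncation is used here only to keep the sketch self-contained.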

2. Semantic Cache

Caches responses to similar queries to avoid reprocessing.

How it works:

- L1: Exact query match → 100% savings
- L2: Semantic similarity > 85% → 80% savings
- L3: Pattern match → 50% savings
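A tiered lookup along these lines might look like the sketch below. It is illustrative only: the `TieredCache` class is hypothetical, and character-level similarity from Python's `difflib.SequenceMatcher` stands in for real semantic similarity (which would normally use embeddings). The L3 pattern tier is left out because the source does not define what a pattern match is.

```python
from difflib import SequenceMatcher


class TieredCache:
    """Hypothetical two-tier cache: exact match (L1), then similarity (L2)."""

    def __init__(self, threshold=0.85):
        self.threshold = threshold
        self.entries = {}  # normalized query -> cached response

    @staticmethod
    def normalize(query):
        # Lowercase and collapse whitespace so trivial variants hit L1.
        return " ".join(query.lower().split())

    def put(self, query, response):
        self.entries[self.normalize(query)] = response

    def get(self, query):
        """Return (tier, response); tier is 'L1', 'L2', or None on a miss."""
        key = self.normalize(query)
        if key in self.entries:
            return "L1", self.entries[key]
        best_key, best_score = None, 0.0
        for cached in self.entries:
            # Stand-in for semantic similarity: character-level ratio.
            score = SequenceMatcher(None, key, cached).ratio()
            if score > best_score:
                best_key, best_score = cached, score
        if best_key is not None and best_score > self.threshold:
            return "L2", self.entries[best_key]
        return None, None
```

Checking the exact tier first keeps the common case cheap; the similarity scan only runs on an L1 miss.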

3. Adaptive Optimization

Automatically adjusts compression based on token pressure.

Stages:

- < 3K tokens: No compression
- 3-6K tokens: Light compression
- 6-10K tokens: Medium compression
- > 10K tokens: Heavy compression + suggest new chat
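The stage thresholds can be expressed as a small function. This is a sketch under one stated assumption: the source does not say which stage owns the exact boundaries, so here 3,000 counts as light, 6,000 as medium, and 10,000 as medium. The function name and return labels are illustrative.

```python
def compression_stage(token_count: int) -> str:
    """Map a context size in tokens to a compression stage."""
    if token_count < 3_000:
        return "none"
    if token_count < 6_000:
        return "light"
    if token_count <= 10_000:
        return "medium"
    # Above 10K the skill also suggests starting a new chat.
    return "heavy"
```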

Natural Language Commands

When the user asks about TokenSaver in natural language, interpret the request and run the matching action:

Settings & Configuration

User says: "Configure TokenSaver" / "TokenSaver settings" / "Setup TokenSaver"
Action: Show current configuration and available options
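Current TokenSaver settings:

- Mode: Adaptive (adjusts automatically based on token pressure)
- Compression: Balanced
- Cache: Enabled
- Quality threshold: 85%

You can change the mode:

- /tokensave - Aggressive mode (maximum savings)
- /tokenbalance - Balanced mode (default)
- /tokenquality - Quality priority (minimal compression)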

User says: "Use aggressive mode" / "Maximize savings" / "Set to save mode"
Action: Execute /tokensave command
Response: "✅ TokenSaver switched to aggressive save mode. This provides maximum token savings (up to 80%) with slight quality trade-off."

User says: "Use balanced mode" / "Default settings" / "Set to balanced"
Action: Execute /tokenbalance command
Response: "✅ TokenSaver switched to balanced mode. Good savings (50-70%) with quality preserved."

User says: "Prioritize quality" / "Keep full context" / "Set to quality mode"
Action: Execute /tokenquality command
Response: "✅ TokenSaver switched to quality priority mode. Minimal compression, maximum context retention for precision-critical tasks."

User says: "Disable TokenSaver" / "Turn off token saving" / "Stop TokenSaver"
Action: Execute /tokenoff command
Response: "⏸️ TokenSaver temporarily disabled. Re-enable anytime with /tokens or by asking me to turn it back on."

Usage Reports & Analytics

User says: "Show me token usage report" / "TokenSaver statistics" / "Token report"
Action: Execute /tokenreport command
Response format:
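📈 TokenSaver Usage Report
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
This session:
• Total original tokens: 12,450
• Total optimized tokens: 4,358
• Tokens saved: 8,092 (65%)
• Estimated cost savings: $0.016
• Cache hits: 5
• Compressions applied: 12

Top savings sources:

1. Context compression: -6,200 tokens
2. Semantic cache: -1,500 tokens
3. Response optimization: -392 tokens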
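The headline numbers in such a report follow from simple arithmetic. The sketch below uses sample figures of 12,450 original and 4,358 optimized tokens; the price of $2 per million tokens is an assumption chosen for illustration, not a rate stated by TokenSaver, and `savings_summary` is a hypothetical helper.

```python
def savings_summary(original: int, optimized: int, usd_per_million: float):
    """Return (tokens saved, percent saved, estimated USD saved)."""
    if original <= 0:
        raise ValueError("original token count must be positive")
    saved = original - optimized
    percent = round(100 * saved / original)
    cost = saved * usd_per_million / 1_000_000  # assumed flat per-token price
    return saved, percent, cost


saved, percent, cost = savings_summary(12_450, 4_358, usd_per_million=2.0)
print(f"Tokens saved: {saved:,} ({percent}%), about ${cost:.3f}")
```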

User says: "TokenSaver status" / "How much am I saving?" / "Token status"
Action: Execute /tokens command
Response format:
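💰 TokenSaver Status
━━━━━━━━━━━━━━━━━━━━
Current session:
• Savings rate: 65%
• Original tokens: 12,450
• Optimized tokens: 4,358
• Cache hits: 5

Current mode: Balanced (adaptive)
• Stage 3 compression active
• Quality score: 95%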

User says: "Clear token cache" / "Reset cache" / "Clear TokenSaver cache"
Action: Execute /tokencache clear command
Response: "✅ TokenSaver cache cleared. Fresh responses will be generated for subsequent queries."
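The phrase-to-command mapping described above could be prototyped as a lookup table. This is a minimal sketch, assuming simple case-insensitive substring matching; a real skill would more likely rely on the model's own intent recognition. `COMMAND_PHRASES` and `route` are hypothetical names.

```python
# Trigger phrases copied from the command descriptions, lowercased.
COMMAND_PHRASES = {
    "/tokensave": ["use aggressive mode", "maximize savings", "set to save mode"],
    "/tokenbalance": ["use balanced mode", "default settings", "set to balanced"],
    "/tokenquality": ["prioritize quality", "keep full context", "set to quality mode"],
    "/tokenoff": ["disable tokensaver", "turn off token saving", "stop tokensaver"],
    "/tokenreport": ["show me token usage report", "tokensaver statistics", "token report"],
    "/tokens": ["tokensaver status", "how much am i saving", "token status"],
    "/tokencache clear": ["clear token cache", "reset cache", "clear tokensaver cache"],
}


def route(utterance: str):
    """Return the slash command for an utterance, or None if nothing matches."""
    text = utterance.lower()
    for command, phrases in COMMAND_PHRASES.items():
        if any(phrase in text for phrase in phrases):
            return command
    return None
```

For example, `route("Please use aggressive mode")` returns `/tokensave`, while an unrelated request returns `None` and falls through to normal handling.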

Explanation & Help

User says: "What is TokenSaver" / "How does token saving work?" / "Explain TokenSaver"
Response:
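TokenSaver is a smart token cost optimization skill that helps you save 50-80% on AI token usage.

Key features:

1. Smart Context Compression - summarizes older conversation while keeping recent messages intact
2. Semantic Cache - reuses responses to similar questions
3. Adaptive Optimization - adjusts automatically based on conversation length
4. Quality Guard - never compresses code, errors, or important messages

Benefits:
• Save 50-80% on token costs
• Faster responses through caching
• No quality loss on critical content
• Runs automatically in the background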

User says: "TokenSaver not working" / "Why no savings?" / "TokenSaver not saving"
Response:
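Possible reasons TokenSaver shows no savings:

1. Conversation too short (< 3,000 tokens) - compression starts at 3K
2. Quality priority mode is active - switch with /tokensave
3. First-time queries - the cache needs time to build up
4. Code-heavy conversation - code blocks are never compressed

Try a longer technical discussion to see significant savings!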

Best Practices

User says: "TokenSaver best practices" / "How to use TokenSaver effectively?" / "TokenSaver tips"
Response:
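TokenSaver best practices:

1. Debugging / code review:
   Use /tokenquality mode to keep full context

2. Research / exploration:
   Use /tokensave mode for maximum savings (up to 80%)

3. Everyday chat:
   Balanced mode (default) works best

4. When to start a new chat:
   When switching major topics, starting fresh gives better compression

5. Monitoring:
   /tokens - quick status check
   /tokenreport - detailed analysis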

Slash Commands

For direct command access:

/tokens

Show current status and statistics

Session savings: 65%
Original tokens: 12,450
Optimized tokens: 4,358
Cache hits: 3

/tokensave

Enable aggressive save mode

- Maximum compression
- Best for very long technical discussions
- Slight quality trade-off possible

/tokenbalance

Balanced mode (default)

- Good savings with quality preserved
- Recommended for most use cases

/tokenquality

Quality priority mode

- Minimal compression
- Maximum context retention
- Use when precision is critical

/tokenreport

Generate detailed usage report

Total tokens saved: 8,092
Estimated cost savings: $0.016
Compressions applied: 12
Cache hits: 5

/tokencache clear

Clear all cached responses

/tokenoff

Temporarily disable optimization

Usage Examples

Example 1: Long coding session
User: [20 rounds of Python discussion]
TokenSaver: Optimized 15K → 4.5K tokens (70% saved)

Example 2: Repeated questions
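User: How do I write to a file in Python?
User: Python file writing methods?
TokenSaver: L2 cache hit - instant response, 0 tokens consumed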

Example 3: Topic switching
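User: Switching from Python to JavaScript...
TokenSaver: Topic change detected. Start a new chat to keep the context clean?
[Yes] [No]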

Safety Features

TokenSaver never compresses:

- Code blocks (always kept intact)
- Error messages and stack traces
- User-marked important messages
- Messages that are heavily cross-referenced

Quality Guard:

- Auto-rollback if quality drops > 15%
- One-click restore to uncompressed version
- Snapshots for every compression
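One way to picture the Quality Guard is as a wrapper that snapshots the history, compresses it, and rolls back when a quality score falls too far. The sketch below is speculative: the source does not say how quality is measured or whether the 15% drop is relative or absolute, so a caller-supplied `score` function and a relative drop are assumed. `QualityGuard` is a hypothetical class.

```python
class QualityGuard:
    """Hypothetical snapshot-and-rollback wrapper around a compressor."""

    def __init__(self, max_drop=0.15):
        self.max_drop = max_drop
        self.snapshots = []

    def apply(self, history, compress, score):
        """Compress history; return the snapshot instead if quality drops too far."""
        self.snapshots.append(list(history))  # snapshot before every compression
        baseline = score(history)
        candidate = compress(history)
        # Assumed definition: relative drop against the uncompressed score.
        drop = (baseline - score(candidate)) / baseline if baseline else 0.0
        if drop > self.max_drop:
            return self.restore()
        return candidate

    def restore(self):
        """Return a copy of the most recent uncompressed snapshot."""
        return list(self.snapshots[-1])
```

Keeping the snapshot list also makes a manual restore possible, which matches the one-click restore described above.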

Configuration

Default configuration:
{
  "mode": "adaptive",
  "compression": "balanced",
  "cache": true,
  "qualityThreshold": 0.85
}
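A loader for a configuration with the keys `mode`, `compression`, `cache`, and `qualityThreshold` might look like the sketch below. How TokenSaver actually reads its configuration is not documented here; `TokenSaverConfig` and `load_config` are hypothetical, and the range check on the threshold is an assumption.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenSaverConfig:
    mode: str = "adaptive"
    compression: str = "balanced"
    cache: bool = True
    quality_threshold: float = 0.85


def load_config(raw: str) -> TokenSaverConfig:
    """Parse a JSON config string, falling back to the defaults per key."""
    data = json.loads(raw) if raw.strip() else {}
    threshold = float(data.get("qualityThreshold", 0.85))
    if not 0.0 <= threshold <= 1.0:
        # Assumed constraint: the threshold is a fraction between 0 and 1.
        raise ValueError("qualityThreshold must be between 0 and 1")
    return TokenSaverConfig(
        mode=data.get("mode", "adaptive"),
        compression=data.get("compression", "balanced"),
        cache=bool(data.get("cache", True)),
        quality_threshold=threshold,
    )
```

Calling `load_config("")` yields the defaults, and a partial document such as `{"cache": false}` overrides only the keys it names.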

Expected Results

Conversation Type	Tokens Saved	Quality Impact
Technical discussion (50 rounds)	70%	Minimal
Code review	80%	None
Daily chat	75%	None
Quick Q&A	30-50%	None

Limitations

- Requires conversation to exceed 3K tokens before compression starts
- First-time queries cannot be cached
- Very short conversations (< 10 messages) see minimal benefit
- Code-heavy conversations benefit most from smart referencing

Related Skills

- shieldclaw: For security scanning
- browservisible: For web browsing
- filereader: For reading local files
