ContextSlim Context Window Profiler & Optimizer
See exactly what's eating your context window. Analyzes prompts, conversations, and system instructions to show where every token goes. Actionable compression suggestions. All local.
Stop guessing why your AI forgot something. See exactly what's eating your context window.
ContextSlim analyzes your prompts, conversations, and system instructions to show you where every token goes. Get actionable compression suggestions and visual breakdowns — all without sending anything to external APIs.
The Problem
You're talking to an AI and suddenly it forgets critical information. Or your carefully crafted system prompt keeps getting cut off. Why? Because context windows aren't infinite, and most people have no idea how much space they're actually using.
Token counting is confusing. Different providers use different models. You don't want to install heavyweight tokenizer libraries just to get a ballpark estimate. And even if you could count tokens, you still don't know where they're being wasted.
What ContextSlim Does
1. Token Estimation (context_slim.py)
Estimates token usage using word-based heuristics. No external dependencies, no API calls, no tokenizer libraries. Accurate within 10-15% for most English text.
- - Provider-specific estimation (OpenAI, Anthropic, Google, or generic)
- Per-section breakdown (system prompt vs. user messages vs. tool definitions)
- Real context limits for major models (GPT-4: 128k, Claude: 200k, etc.)
- Truncation risk assessment (none/low/medium/high/critical)
- "Why did it forget?" diagnostic mode
2. Compression Suggestions (context_compress.py)
Analyzes your text and tells you exactly what you can cut, tighten, or simplify.
- - Finds redundant phrases ("in order to" → "to")
- Identifies verbose language ("has the ability to" → "can")
- Detects excessive examples (5 examples → suggest 2-3)
- Spots formatting inefficiencies (excessive newlines, long separator lines)
- Flags repetitive instructions
- Estimates tokens saved per suggestion
- Confidence ratings (high/medium/low)
3. Visual Reports (context_report.py)
Generates beautiful HTML reports with CSS-based bar charts (zero JavaScript).
- - Token usage breakdown by section
- Color-coded utilization meters
- Before/after compression comparisons
- Risk indicators (green → red)
- Works offline, mobile-friendly
Quick Start
CODEBLOCK0
Basic Usage
CODEBLOCK1
Analyze Conversations
ContextSlim understands conversation JSON format:
CODEBLOCK2
CODEBLOCK3
Use Cases
1. Prompt Engineering
Before you deploy that 10,000-word system prompt, see how much space it actually takes. Find what you can cut without losing functionality.
2. Debugging "Forgetting" Issues
AI stopped following your instructions? See if your prompt is getting truncated. ContextSlim shows you exactly where the cutoff happens.
3. Cost Optimization
Tokens = money. Compress your prompts, reduce costs. See exactly how many tokens each compression saves.
4. Multi-Provider Workflows
Switching between GPT-4 (128k) and Claude (200k)? See how your prompts fit in each context window.
5. Agent System Optimization
Running an AI agent with tons of tools and memory? Profile which components are eating the most tokens.
6. Team Standardization
Enforce context budgets across your team. "System prompts must be under 5k tokens" — now you can actually measure it.
How It Works
Token Estimation Strategy
ContextSlim uses word-based heuristics instead of external tokenizers:
- - GPT models: ~0.75 tokens per word
- Claude models: ~0.80 tokens per word
- Generic average: ~0.77 tokens per word
Plus adjustments for:
- - Newlines (add ~0.3 tokens each)
- Code blocks (add ~2 tokens per block marker)
- Special formatting
Why not use real tokenizers?
They require heavyweight dependencies (transformers, tiktoken) and still vary between models. Word-based estimation is "good enough" for profiling and costs zero dependencies.
Compression Detection
ContextSlim scans for:
- - Redundant phrases: "in order to", "due to the fact that", etc.
- Verbose constructions: "is able to" → "can"
- Excessive examples: More than 3-4 examples in one list
- Formatting bloat: Too many newlines, overly long separators
- Repetitive instructions: Similar sentences that could be consolidated
Each suggestion includes:
- - What to change
- Estimated tokens saved
- Confidence level (high/medium/low)
Configuration
Copy config_example.py to config.py and customize:
CODEBLOCK4
See config_example.py for full options.
Examples
Example 1: System Prompt Analysis
CODEBLOCK5
Example 2: Compression Suggestions
CODEBLOCK6
Example 3: Full HTML Report
CODEBLOCK7
Open report.html in a browser to see:
- - Total tokens and utilization
- Visual breakdown by message
- All compression suggestions with before/after
- Color-coded risk indicators
What's Included
| File | Purpose |
|---|
| INLINECODE7 | Main analysis engine (CLI + library) |
| INLINECODE8 |
Compression suggestion engine |
|
context_report.py | HTML report generator |
|
config_example.py | Configuration template |
|
README.md | This file |
|
LIMITATIONS.md | Honest limitations |
|
LICENSE | MIT License |
Requirements
- - Python 3.7+
- Zero external dependencies (stdlib only)
- Works on Linux, macOS, Windows
Python API
Use ContextSlim in your own scripts:
CODEBLOCK8
quality-verified
FAQ
Q: How accurate is the token estimation?
A: Within 10-15% for English text. Good enough for profiling, not perfect. If you need exact counts, use the provider's official tokenizer.
Q: Does it work for non-English text?
A: Estimation accuracy drops for non-English. Word-to-token ratios vary by language. You can adjust ratios in config.py.
Q: Does it send my data anywhere?
A: No. Everything runs locally. Zero network calls, zero external APIs.
Q: Can I use it for code?
A: Yes, but code has different token patterns than prose. Estimates may be less accurate for heavily formatted code.
Q: What about multimodal contexts (images, audio)?
A: Text-only for now. See LIMITATIONS.md.
License
MIT — See LICENSE file.
Author
Shadow Rose
Built for AI users who want to understand and optimize their context windows without needing a PhD in tokenization.
⚠️ Disclaimer
This software is provided "AS IS", without warranty of any kind, express or implied.
USE AT YOUR OWN RISK.
- - The author(s) are NOT liable for any damages, losses, or consequences arising from
the use or misuse of this software — including but not limited to financial loss,
data loss, security breaches, business interruption, or any indirect/consequential damages.
- - This software does NOT constitute financial, legal, trading, or professional advice.
- Users are solely responsible for evaluating whether this software is suitable for
their use case, environment, and risk tolerance.
- - No guarantee is made regarding accuracy, reliability, completeness, or fitness
for any particular purpose.
- - The author(s) are not responsible for how third parties use, modify, or distribute
this software after purchase.
By downloading, installing, or using this software, you acknowledge that you have read
this disclaimer and agree to use the software entirely at your own risk.
DATA DISCLAIMER: This software processes and stores data locally on your system.
The author(s) are not responsible for data loss, corruption, or unauthorized access
resulting from software bugs, system failures, or user error. Always maintain
independent backups of important data. This software does not transmit data externally
unless explicitly configured by the user.
Support & Links
| |
|---|
| 🐛 Bug Reports | TheShadowyRose@proton.me |
| ☕ Ko-fi |
ko-fi.com/theshadowrose |
| 🛒
Gumroad |
shadowyrose.gumroad.com |
| 🐦
Twitter |
@TheShadowyRose |
| 🐙
GitHub |
github.com/TheShadowRose |
| 🧠
PromptBase |
promptbase.com/profile/shadowrose |
Built with OpenClaw — thank you for making this possible.
🛠️
Need something custom? Custom OpenClaw agents & skills starting at $500. If you can describe it, I can build it. →
Hire me on Fiverr
ContextSlim 上下文窗口分析器与优化器
精确查看是什么占用了你的上下文窗口。分析提示词、对话和系统指令,显示每个token的去向。提供可操作的压缩建议。全部本地运行。
别再猜测AI为什么忘记信息了。精确查看是什么占用了你的上下文窗口。
ContextSlim分析你的提示词、对话和系统指令,显示每个token的去向。获取可操作的压缩建议和可视化分解——全程无需向外部API发送任何数据。
问题
你正在与AI对话,突然它忘记了关键信息。或者你精心设计的系统提示词不断被截断。为什么?因为上下文窗口不是无限的,而大多数人根本不知道他们实际使用了多少空间。
Token计数令人困惑。不同的提供商使用不同的模型。你不想仅仅为了获得粗略估算就安装庞大的分词器库。即使你能计算token,你仍然不知道它们在哪里被浪费了。
ContextSlim的功能
1. Token估算 (context_slim.py)
使用基于单词的启发式方法估算token使用量。无需外部依赖、无需API调用、无需分词器库。对于大多数英文文本,准确率在10-15%以内。
- - 特定提供商的估算(OpenAI、Anthropic、Google或通用)
- 按部分分解(系统提示词 vs. 用户消息 vs. 工具定义)
- 主要模型的真实上下文限制(GPT-4:128k、Claude:200k等)
- 截断风险评估(无/低/中/高/严重)
- 为什么会忘记?诊断模式
2. 压缩建议 (context_compress.py)
分析你的文本,精确告诉你哪些内容可以删减、精简或简化。
- - 查找冗余短语(in order to → to)
- 识别冗长表达(has the ability to → can)
- 检测过多示例(5个示例 → 建议2-3个)
- 发现格式低效(过多换行、过长的分隔线)
- 标记重复指令
- 估算每条建议节省的token数
- 置信度评级(高/中/低)
3. 可视化报告 (context_report.py)
生成基于CSS柱状图的精美HTML报告(零JavaScript)。
- - 按部分的token使用分解
- 颜色编码的使用率仪表
- 压缩前后的对比
- 风险指示器(绿色→红色)
- 离线可用,移动端友好
快速开始
bash
分析文本文件
python3 context
slim.py myprompt.txt
获取压缩建议
python3 context
compress.py myprompt.txt
生成包含建议的完整HTML报告
python3 context
report.py myprompt.txt --compress --output report.html
基本用法
bash
针对特定提供商进行分析
python3 context_slim.py --provider anthropic --model claude-3-opus prompt.txt
仅获取高置信度建议
python3 context_compress.py --min-confidence high prompt.txt
从标准输入读取
cat system
prompt.txt | python3 contextslim.py
输出JSON格式供脚本使用
python3 context_slim.py --output json prompt.txt > analysis.json
分析对话
ContextSlim理解对话JSON格式:
json
[
{role: system, content: 你是一个有用的助手...},
{role: user, content: 告诉我关于...},
{role: assistant, content: 当然!这是...}
]
bash
python3 context_slim.py conversation.json
使用场景
1. 提示词工程
在部署那个10000字的系统提示词之前,看看它实际占用多少空间。找出哪些内容可以删减而不影响功能。
2. 调试遗忘问题
AI不再遵循你的指令?检查你的提示词是否被截断。ContextSlim精确显示截断发生的位置。
3. 成本优化
Token = 金钱。压缩提示词,降低成本。精确查看每条压缩节省了多少token。
4. 多提供商工作流
在GPT-4(128k)和Claude(200k)之间切换?查看你的提示词在每个上下文窗口中的适配情况。
5. 代理系统优化
运行一个带有大量工具和内存的AI代理?分析哪些组件消耗了最多的token。
6. 团队标准化
在团队中强制执行上下文预算。系统提示词必须控制在5k token以内——现在你可以实际测量了。
工作原理
Token估算策略
ContextSlim使用基于单词的启发式方法,而非外部分词器:
- - GPT模型: 约0.75 token/单词
- Claude模型: 约0.80 token/单词
- 通用平均值: 约0.77 token/单词
加上以下调整:
- - 换行(每个增加约0.3 token)
- 代码块(每个块标记增加约2 token)
- 特殊格式
为什么不使用真正的分词器?
它们需要庞大的依赖(transformers、tiktoken),而且在不同模型之间仍有差异。基于单词的估算对于分析来说足够好,且零依赖成本。
压缩检测
ContextSlim扫描以下内容:
- - 冗余短语: in order to、due to the fact that等
- 冗长结构: is able to → can
- 过多示例: 一个列表中超过3-4个示例
- 格式臃肿: 过多换行、过长的分隔符
- 重复指令: 可以合并的相似句子
每条建议包括:
- - 要更改的内容
- 估算节省的token数
- 置信度级别(高/中/低)
配置
将config_example.py复制为config.py并进行自定义:
python
设置默认提供商
PROVIDER = anthropic
MODEL = claude-3-opus
调整截断风险阈值
TRUNCATION_THRESHOLDS = {
none: 50,
low: 70,
medium: 85,
high: 95,
critical: 100
}
控制压缩建议
MIN_CONFIDENCE = medium
MAX
SUGGESTIONSPER_CATEGORY = 5
完整选项请参见config_example.py。
示例
示例1:系统提示词分析
bash
$ python3 contextslim.py systemprompt.txt --provider openai --model gpt-4
=== ContextSlim分析 ===
提供商:openai(限制:128,000 token)
总token数:8,432
使用率:6.59%
截断风险:无
部分分解:
[文件] 8,432 token(11,234单词)
示例2:压缩建议
bash
$ python3 contextcompress.py systemprompt.txt --min-confidence high
=== ContextSlim压缩分析 ===
发现7条建议
潜在节省:127 token
- 1. [冗余] 将in order to替换为to
置信度:高 | 节省:约3 token
原文:...in order to provide accurate responses...
建议:...to provide accurate responses...
- 2. [冗长] 简化冗长短语
置信度:高 | 节省:约5 token
原文:...has the ability to process...
建议:...can process...
示例3:完整HTML报告
bash
$ python3 context_report.py conversation.json --compress --output report.html
✅ 报告已生成:report.html
在浏览器中打开report.html查看:
- - 总token数和使用率
- 按消息的可视化分解
- 所有压缩建议及前后对比
- 颜色编码的风险指示器
包含文件
| 文件 | 用途 |
|---|
| contextslim.py | 主分析引擎(CLI + 库) |
| contextcompress.py |
压缩建议引擎 |
| context_report.py | HTML报告生成器 |
| config_example.py | 配置模板 |
| README.md | 本文件 |
| LIMITATIONS.md | 诚实说明的限制 |
| LICENSE | MIT许可证 |
系统要求
- - Python 3.7+
- 零外部依赖(仅标准库)
- 支持Linux、macOS、Windows
Python API
在你自己的脚本中使用ContextSlim:
python
from context_slim import ContextAnalyzer, TokenEstimator
from context_compress import CompressionAnalyzer
from context_report import ReportGenerator
分析文本
analyzer = ContextAnalyzer(provider=anthropic, model=claude-3-opus)
profile = analyzer.analyze_text(你的提示词...)
print(fToken数:{profile.total_tokens})
print(f风险:{profile.truncation_risk})
获取压缩建议
compressor = CompressionAnalyzer(provider=anthropic)
suggestions = compressor.analy