ContextSlim Context Window Profiler & Optimizer

See exactly what's eating your context window. Analyzes prompts, conversations, and system instructions to show where every token goes. Actionable compression suggestions. All local.

Stop guessing why your AI forgot something. See exactly what's eating your context window.

ContextSlim analyzes your prompts, conversations, and system instructions to show you where every token goes. Get actionable compression suggestions and visual breakdowns — all without sending anything to external APIs.

The Problem

You're talking to an AI and suddenly it forgets critical information. Or your carefully crafted system prompt keeps getting cut off. Why? Because context windows aren't infinite, and most people have no idea how much space they're actually using.

Token counting is confusing. Different providers use different models. You don't want to install heavyweight tokenizer libraries just to get a ballpark estimate. And even if you could count tokens, you still don't know where they're being wasted.

What ContextSlim Does

1. Token Estimation (`context_slim.py`)

Estimates token usage using word-based heuristics. No external dependencies, no API calls, no tokenizer libraries. Accurate within 10-15% for most English text.

- Provider-specific estimation (OpenAI, Anthropic, Google, or generic)
Per-section breakdown (system prompt vs. user messages vs. tool definitions)
Real context limits for major models (GPT-4: 128k, Claude: 200k, etc.)
Truncation risk assessment (none/low/medium/high/critical)
"Why did it forget?" diagnostic mode

2. Compression Suggestions (`context_compress.py`)

Analyzes your text and tells you exactly what you can cut, tighten, or simplify.

- Finds redundant phrases ("in order to" → "to")
Identifies verbose language ("has the ability to" → "can")
Detects excessive examples (5 examples → suggest 2-3)
Spots formatting inefficiencies (excessive newlines, long separator lines)
Flags repetitive instructions
Estimates tokens saved per suggestion
Confidence ratings (high/medium/low)

3. Visual Reports (`context_report.py`)

Generates beautiful HTML reports with CSS-based bar charts (zero JavaScript).

- Token usage breakdown by section
Color-coded utilization meters
Before/after compression comparisons
Risk indicators (green → red)
Works offline, mobile-friendly

Quick Start

CODEBLOCK0

Basic Usage

CODEBLOCK1

Analyze Conversations

ContextSlim understands conversation JSON format:

CODEBLOCK2

CODEBLOCK3

Use Cases

1. Prompt Engineering

Before you deploy that 10,000-word system prompt, see how much space it actually takes. Find what you can cut without losing functionality.

2. Debugging "Forgetting" Issues

AI stopped following your instructions? See if your prompt is getting truncated. ContextSlim shows you exactly where the cutoff happens.

3. Cost Optimization

Tokens = money. Compress your prompts, reduce costs. See exactly how many tokens each compression saves.

4. Multi-Provider Workflows

Switching between GPT-4 (128k) and Claude (200k)? See how your prompts fit in each context window.

5. Agent System Optimization

Running an AI agent with tons of tools and memory? Profile which components are eating the most tokens.

6. Team Standardization

Enforce context budgets across your team. "System prompts must be under 5k tokens" — now you can actually measure it.

How It Works

Token Estimation Strategy

ContextSlim uses word-based heuristics instead of external tokenizers:

- GPT models: ~0.75 tokens per word
Claude models: ~0.80 tokens per word
Generic average: ~0.77 tokens per word

Plus adjustments for:

- Newlines (add ~0.3 tokens each)
Code blocks (add ~2 tokens per block marker)
Special formatting

Why not use real tokenizers?
They require heavyweight dependencies (transformers, tiktoken) and still vary between models. Word-based estimation is "good enough" for profiling and costs zero dependencies.

Compression Detection

ContextSlim scans for:

- Redundant phrases: "in order to", "due to the fact that", etc.
Verbose constructions: "is able to" → "can"
Excessive examples: More than 3-4 examples in one list
Formatting bloat: Too many newlines, overly long separators
Repetitive instructions: Similar sentences that could be consolidated

Each suggestion includes:

- What to change
Estimated tokens saved
Confidence level (high/medium/low)

Configuration

Copy config_example.py to config.py and customize:

CODEBLOCK4

See config_example.py for full options.

Examples

Example 1: System Prompt Analysis

CODEBLOCK5

Example 2: Compression Suggestions

CODEBLOCK6

Example 3: Full HTML Report

CODEBLOCK7

Open report.html in a browser to see:

- Total tokens and utilization
Visual breakdown by message
All compression suggestions with before/after
Color-coded risk indicators

What's Included

File	Purpose
INLINECODE7	Main analysis engine (CLI + library)
INLINECODE8

Requirements

- Python 3.7+
Zero external dependencies (stdlib only)
Works on Linux, macOS, Windows

Python API

Use ContextSlim in your own scripts:

CODEBLOCK8

quality-verified

FAQ

Q: How accurate is the token estimation?
A: Within 10-15% for English text. Good enough for profiling, not perfect. If you need exact counts, use the provider's official tokenizer.

Q: Does it work for non-English text?
A: Estimation accuracy drops for non-English. Word-to-token ratios vary by language. You can adjust ratios in config.py.

Q: Does it send my data anywhere?
A: No. Everything runs locally. Zero network calls, zero external APIs.

Q: Can I use it for code?
A: Yes, but code has different token patterns than prose. Estimates may be less accurate for heavily formatted code.

Q: What about multimodal contexts (images, audio)?
A: Text-only for now. See LIMITATIONS.md.

License

MIT — See LICENSE file.

Author

Shadow Rose

Built for AI users who want to understand and optimize their context windows without needing a PhD in tokenization.

⚠️ Disclaimer

This software is provided "AS IS", without warranty of any kind, express or implied.

USE AT YOUR OWN RISK.

- The author(s) are NOT liable for any damages, losses, or consequences arising from

the use or misuse of this software — including but not limited to financial loss, data loss, security breaches, business interruption, or any indirect/consequential damages.

- This software does NOT constitute financial, legal, trading, or professional advice.
Users are solely responsible for evaluating whether this software is suitable for

their use case, environment, and risk tolerance.

- No guarantee is made regarding accuracy, reliability, completeness, or fitness

for any particular purpose.

- The author(s) are not responsible for how third parties use, modify, or distribute

this software after purchase.

By downloading, installing, or using this software, you acknowledge that you have read
this disclaimer and agree to use the software entirely at your own risk.

DATA DISCLAIMER: This software processes and stores data locally on your system.
The author(s) are not responsible for data loss, corruption, or unauthorized access
resulting from software bugs, system failures, or user error. Always maintain
independent backups of important data. This software does not transmit data externally
unless explicitly configured by the user.

Support & Links


🐛 Bug Reports	TheShadowyRose@proton.me
☕ Ko-fi

Built with OpenClaw — thank you for making this possible.

🛠️ Need something custom? Custom OpenClaw agents & skills starting at $500. If you can describe it, I can build it. → Hire me on Fiverr

ContextSlim 上下文窗口分析器与优化器

精确查看是什么占用了你的上下文窗口。分析提示词、对话和系统指令，显示每个token的去向。提供可操作的压缩建议。全部本地运行。

别再猜测AI为什么忘记信息了。精确查看是什么占用了你的上下文窗口。

ContextSlim分析你的提示词、对话和系统指令，显示每个token的去向。获取可操作的压缩建议和可视化分解——全程无需向外部API发送任何数据。

问题

你正在与AI对话，突然它忘记了关键信息。或者你精心设计的系统提示词不断被截断。为什么？因为上下文窗口不是无限的，而大多数人根本不知道他们实际使用了多少空间。

Token计数令人困惑。不同的提供商使用不同的模型。你不想仅仅为了获得粗略估算就安装庞大的分词器库。即使你能计算token，你仍然不知道它们在哪里被浪费了。

ContextSlim的功能

1. Token估算 (context_slim.py)

使用基于单词的启发式方法估算token使用量。无需外部依赖、无需API调用、无需分词器库。对于大多数英文文本，准确率在10-15%以内。

- 特定提供商的估算（OpenAI、Anthropic、Google或通用）
按部分分解（系统提示词 vs. 用户消息 vs. 工具定义）
主要模型的真实上下文限制（GPT-4：128k、Claude：200k等）
截断风险评估（无/低/中/高/严重）
为什么会忘记？诊断模式

2. 压缩建议 (context_compress.py)

分析你的文本，精确告诉你哪些内容可以删减、精简或简化。

- 查找冗余短语（in order to → to）
识别冗长表达（has the ability to → can）
检测过多示例（5个示例 → 建议2-3个）
发现格式低效（过多换行、过长的分隔线）
标记重复指令
估算每条建议节省的token数
置信度评级（高/中/低）

3. 可视化报告 (context_report.py)

生成基于CSS柱状图的精美HTML报告（零JavaScript）。

- 按部分的token使用分解
颜色编码的使用率仪表
压缩前后的对比
风险指示器（绿色→红色）
离线可用，移动端友好

快速开始

bash

分析文本文件

python3 contextslim.py myprompt.txt

获取压缩建议

python3 contextcompress.py myprompt.txt

生成包含建议的完整HTML报告

python3 contextreport.py myprompt.txt --compress --output report.html

基本用法

bash

针对特定提供商进行分析

python3 context_slim.py --provider anthropic --model claude-3-opus prompt.txt

仅获取高置信度建议

python3 context_compress.py --min-confidence high prompt.txt

从标准输入读取

cat systemprompt.txt | python3 contextslim.py

输出JSON格式供脚本使用

python3 context_slim.py --output json prompt.txt > analysis.json

分析对话

ContextSlim理解对话JSON格式：

json
[
{role: system, content: 你是一个有用的助手...},
{role: user, content: 告诉我关于...},
{role: assistant, content: 当然！这是...}
]

bash
python3 context_slim.py conversation.json

使用场景

1. 提示词工程

在部署那个10000字的系统提示词之前，看看它实际占用多少空间。找出哪些内容可以删减而不影响功能。

2. 调试遗忘问题

AI不再遵循你的指令？检查你的提示词是否被截断。ContextSlim精确显示截断发生的位置。

3. 成本优化

Token = 金钱。压缩提示词，降低成本。精确查看每条压缩节省了多少token。

4. 多提供商工作流

在GPT-4（128k）和Claude（200k）之间切换？查看你的提示词在每个上下文窗口中的适配情况。

5. 代理系统优化

运行一个带有大量工具和内存的AI代理？分析哪些组件消耗了最多的token。

6. 团队标准化

在团队中强制执行上下文预算。系统提示词必须控制在5k token以内——现在你可以实际测量了。

工作原理

Token估算策略

ContextSlim使用基于单词的启发式方法，而非外部分词器：

- GPT模型： 约0.75 token/单词
Claude模型： 约0.80 token/单词
通用平均值： 约0.77 token/单词

加上以下调整：

- 换行（每个增加约0.3 token）
代码块（每个块标记增加约2 token）
特殊格式

为什么不使用真正的分词器？
它们需要庞大的依赖（transformers、tiktoken），而且在不同模型之间仍有差异。基于单词的估算对于分析来说足够好，且零依赖成本。

压缩检测

ContextSlim扫描以下内容：

- 冗余短语： in order to、due to the fact that等
冗长结构： is able to → can
过多示例： 一个列表中超过3-4个示例
格式臃肿： 过多换行、过长的分隔符
重复指令： 可以合并的相似句子

每条建议包括：

- 要更改的内容
估算节省的token数
置信度级别（高/中/低）

配置

将config_example.py复制为config.py并进行自定义：

python

设置默认提供商

PROVIDER = anthropic
MODEL = claude-3-opus

调整截断风险阈值

TRUNCATION_THRESHOLDS = { none: 50, low: 70, medium: 85, high: 95, critical: 100 }

控制压缩建议

MIN_CONFIDENCE = medium MAXSUGGESTIONSPER_CATEGORY = 5

完整选项请参见config_example.py。

示例

示例1：系统提示词分析

bash
$ python3 contextslim.py systemprompt.txt --provider openai --model gpt-4

=== ContextSlim分析 ===
提供商：openai（限制：128,000 token）
总token数：8,432
使用率：6.59%
截断风险：无

部分分解：
[文件] 8,432 token（11,234单词）

示例2：压缩建议

bash
$ python3 contextcompress.py systemprompt.txt --min-confidence high

=== ContextSlim压缩分析 ===
发现7条建议
潜在节省：127 token

1. [冗余] 将in order to替换为to

置信度：高 | 节省：约3 token 原文：...in order to provide accurate responses... 建议：...to provide accurate responses...

2. [冗长] 简化冗长短语

置信度：高 | 节省：约5 token 原文：...has the ability to process... 建议：...can process...

示例3：完整HTML报告

bash
$ python3 context_report.py conversation.json --compress --output report.html

✅ 报告已生成：report.html

在浏览器中打开report.html查看：

- 总token数和使用率
按消息的可视化分解
所有压缩建议及前后对比
颜色编码的风险指示器

包含文件

文件	用途
contextslim.py	主分析引擎（CLI + 库）
contextcompress.py

系统要求

- Python 3.7+
零外部依赖（仅标准库）
支持Linux、macOS、Windows

Python API

在你自己的脚本中使用ContextSlim：

python
from context_slim import ContextAnalyzer, TokenEstimator
from context_compress import CompressionAnalyzer
from context_report import ReportGenerator

分析文本

analyzer = ContextAnalyzer(provider=anthropic, model=claude-3-opus) profile = analyzer.analyze_text(你的提示词...)

print(fToken数：{profile.total_tokens})
print(f风险：{profile.truncation_risk})

获取压缩建议

compressor = CompressionAnalyzer(provider=anthropic) suggestions = compressor.analy

ContextSlim Context Window Profiler & Optimizer上下文窗口优化器