# Context Compactor
Automatic context compaction for OpenClaw when using local models that don't properly report token limits or context overflow errors.
## The Problem
Cloud APIs (Anthropic, OpenAI) report context overflow errors, allowing OpenClaw's built-in compaction to trigger. Local models (MLX, llama.cpp, Ollama) often:
- Silently truncate context
- Return garbage when context is exceeded
- Don't report accurate token counts
This leaves you with broken conversations when context gets too long.
## The Solution
Context Compactor estimates tokens client-side and proactively summarizes older messages before hitting the model's limit.
## How It Works
```
┌────────────────────────────────────────────────┐
│ 1. Message arrives                             │
│ 2. before_agent_start hook fires               │
│ 3. Plugin estimates total context tokens       │
│ 4. If over maxTokens:                          │
│    a. Split into "old" and "recent" messages   │
│    b. Summarize old messages (LLM or fallback) │
│    c. Inject summary as compacted context      │
│ 5. Agent sees: summary + recent + new message  │
└────────────────────────────────────────────────┘
```
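Here is a minimal TypeScript sketch of that flow. The hook signature, the `Message` shape, and the `summarize` stand-in are illustrative assumptions, not OpenClaw's actual plugin API:

```typescript
// Sketch of the compaction flow; shapes and names are illustrative assumptions.
interface Message {
  role: "user" | "assistant" | "system";
  content: string;
}

const config = { maxTokens: 8000, keepRecentTokens: 2000, charsPerToken: 4 };

// Client-side estimate: total characters divided by charsPerToken.
const estimateTokens = (msgs: Message[]): number =>
  Math.ceil(msgs.reduce((n, m) => n + m.content.length, 0) / config.charsPerToken);

// Truncation stand-in for the LLM summarization step (see "How Summarization
// Works" below); cap roughly matches the summaryMaxTokens default of 1000.
const summarize = (msgs: Message[]): string =>
  msgs.map((m) => m.content).join("\n").slice(0, 1000 * config.charsPerToken);

function beforeAgentStart(messages: Message[]): Message[] {
  if (estimateTokens(messages) <= config.maxTokens) return messages; // under limit, no-op

  // Walk backwards, keeping recent messages until the keepRecentTokens budget is spent.
  let budget = config.keepRecentTokens * config.charsPerToken; // budget in characters
  let cut = messages.length;
  while (cut > 0 && budget >= messages[cut - 1].content.length) {
    budget -= messages[cut - 1].content.length;
    cut--;
  }

  const old = messages.slice(0, cut);
  const recent = messages.slice(cut);
  return [
    { role: "system", content: `Summary of earlier conversation:\n${summarize(old)}` },
    ...recent,
  ];
}
```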
## Installation
```bash
# One command setup (recommended)
npx jasper-context-compactor setup
# Restart gateway
openclaw gateway restart
```
The setup command automatically:
- Copies plugin files to `~/.openclaw/extensions/context-compactor/`
- Adds plugin config to `openclaw.json` with sensible defaults
## Configuration
Add to `openclaw.json`:
```json
{
  "plugins": {
    "entries": {
      "context-compactor": {
        "enabled": true,
        "config": {
          "maxTokens": 8000,
          "keepRecentTokens": 2000,
          "summaryMaxTokens": 1000,
          "charsPerToken": 4
        }
      }
    }
  }
}
```
### Options
| Option | Default | Description |
|--------|---------|-------------|
| `enabled` | `true` | Enable/disable the plugin |
| `maxTokens` | `8000` | Max context tokens before compaction |
| `keepRecentTokens` | `2000` | Tokens to preserve from recent messages |
| `summaryMaxTokens` | `1000` | Max tokens for the summary |
| `charsPerToken` | `4` | Token estimation ratio |
| `summaryModel` | (session model) | Model to use for summarization |
### Tuning for Your Model
Set `maxTokens` below the model's actual context window so there is headroom left for the system prompt and the model's reply.
**MLX (8K context models):**
```json
{
  "maxTokens": 6000,
  "keepRecentTokens": 1500,
  "charsPerToken": 4
}
```
**Larger context (32K models):**
```json
{
  "maxTokens": 28000,
  "keepRecentTokens": 4000,
  "charsPerToken": 4
}
```
**Small context (4K models):**
```json
{
  "maxTokens": 3000,
  "keepRecentTokens": 800,
  "charsPerToken": 4
}
```
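If your model's context window isn't listed above, a helper like the following derives comparable numbers. The ratios are our assumption extrapolated from the presets above, not a documented rule:

```typescript
// Rough starting point for deriving settings from a model's context window.
// The 0.75 / 0.25 / 0.125 ratios are assumptions, not the plugin's own logic.
function suggestConfig(contextWindow: number) {
  const maxTokens = Math.floor(contextWindow * 0.75);     // leave headroom for the reply
  const keepRecentTokens = Math.floor(maxTokens * 0.25);  // preserve recent turns verbatim
  const summaryMaxTokens = Math.floor(maxTokens * 0.125); // cap the injected summary
  return { maxTokens, keepRecentTokens, summaryMaxTokens, charsPerToken: 4 };
}

console.log(suggestConfig(8192));
// → { maxTokens: 6144, keepRecentTokens: 1536, summaryMaxTokens: 768, charsPerToken: 4 }
```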
## Commands
### `/compact-now`
Clears the summary cache and triggers fresh compaction on the next message.
```
/compact-now
```
### `/context-stats`
Shows current context token usage and whether compaction would trigger.
```
/context-stats
```
Example output (`Usage` is the token estimate divided by `maxTokens`):
```
📊 Context Stats
Messages: 47 total
  - User: 23
  - Assistant: 24
  - System: 0
Estimated Tokens: ~6,234
Limit: 8,000
Usage: 77.9%
✅ Within limits
```
## How Summarization Works
When compaction triggers:
1. **Split messages** into "old" (to summarize) and "recent" (to keep)
2. **Generate summary** using the session model (or configured `summaryModel`)
3. **Cache the summary** to avoid regenerating for the same content
4. **Inject context** with the summary prepended
If the LLM runtime isn't available (e.g., during startup), a fallback truncation-based summary is used.
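A hedged sketch of steps 2–4 in TypeScript. The `callModel` callback and the cache-key scheme are illustrative assumptions; the plugin's actual internals may differ:

```typescript
// Summarize-with-cache sketch: hash the old messages so identical content
// reuses a cached summary, and fall back to truncation when no LLM is available.
import { createHash } from "node:crypto";

const summaryCache = new Map<string, string>();

async function summarizeOld(
  oldMessages: { role: string; content: string }[],
  callModel?: (prompt: string) => Promise<string>, // undefined during startup
): Promise<string> {
  const text = oldMessages.map((m) => `${m.role}: ${m.content}`).join("\n");
  const key = createHash("sha256").update(text).digest("hex");

  const cached = summaryCache.get(key);
  if (cached) return cached; // same content, no regeneration

  // Fallback: truncation-based summary when the LLM runtime isn't available.
  const summary = callModel
    ? await callModel(`Summarize this conversation concisely:\n${text}`)
    : text.slice(0, 4000); // ~1000 tokens at 4 chars/token

  summaryCache.set(key, summary);
  return summary;
}
```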
## Differences from Built-in Compaction
| Feature | Built-in | Context Compactor |
|---------|----------|-------------------|
| Trigger | Model reports overflow | Token estimate threshold |
| Works with local models | ❌ (needs overflow error) | ✅ |
| Persists to transcript | ✅ | ❌ (session-only) |
| Summarization | Pi runtime | Plugin LLM call |
Context Compactor is **complementary** — it catches cases before they hit the model's hard limit.
## Troubleshooting
**Summary quality is poor:**
- Try a better `summaryModel`
- Increase `summaryMaxTokens`
- The fallback truncation is used if LLM runtime isn't available
**Compaction triggers too often:**
- Increase `maxTokens`
- Decrease `keepRecentTokens` (keeps fewer recent messages verbatim, so each compaction frees more room)
**Not compacting when expected:**
- Check `/context-stats` to see current usage
- Verify `enabled: true` in config
- Check logs for `[context-compactor]` messages
**Token estimates seem off (`charsPerToken`):**
- Default of 4 works for English
- Try 3 for CJK languages
- Try 5 for highly technical content
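If your local runtime reports real token counts (many servers include a usage field in responses), you can back the ratio out directly. The sample values below are made up for illustration:

```typescript
// Calibrate charsPerToken against a real token count from your runtime.
// sampleText and reportedTokens are placeholder values; substitute a
// representative chunk of your own transcript and its reported usage.
const sampleText = "paste a representative chunk of your conversation here";
const reportedTokens = 14; // e.g. from your local server's usage field

const ratio = sampleText.length / reportedTokens;
console.log(`charsPerToken ≈ ${ratio.toFixed(1)}; round to an integer for the config`);
```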
## Logs
Enable debug logging:
```json
{
  "plugins": {
    "entries": {
      "context-compactor": {
        "config": {
          "logLevel": "debug"
        }
      }
    }
  }
}
```
Look for:
- `[context-compactor] Current context: ~XXXX tokens`
- `[context-compactor] Compacted X messages → summary`
## Links
- **GitHub**: https://github.com/E-x-O-Entertainment-Studios-Inc/openclaw-context-compactor
- **OpenClaw Docs**: https://docs.openclaw.ai/concepts/compaction