
context-compactor

Token-based context compaction for local models (MLX, llama.cpp, Ollama) that don't report context limits.

Author: admin | Source: ClawHub
Version: v0.3.8
Security check: passed
Downloads: 1,559
Favorites: 0

# Context Compactor

Automatic context compaction for OpenClaw when using local models that don't properly report token limits or context overflow errors.

## The Problem

Cloud APIs (Anthropic, OpenAI) report context overflow errors, allowing OpenClaw's built-in compaction to trigger. Local models (MLX, llama.cpp, Ollama) often:

- Silently truncate context
- Return garbage when context is exceeded
- Don't report accurate token counts

This leaves you with broken conversations when the context gets too long.

## The Solution

Context Compactor estimates tokens client-side and proactively summarizes older messages before hitting the model's limit.

## How It Works

```
┌─────────────────────────────────────────────────────────────┐
│ 1. Message arrives                                          │
│ 2. before_agent_start hook fires                            │
│ 3. Plugin estimates total context tokens                    │
│ 4. If over maxTokens:                                       │
│    a. Split into "old" and "recent" messages                │
│    b. Summarize old messages (LLM or fallback)              │
│    c. Inject summary as compacted context                   │
│ 5. Agent sees: summary + recent + new message               │
└─────────────────────────────────────────────────────────────┘
```

## Installation

```bash
# One-command setup (recommended)
npx jasper-context-compactor setup

# Restart gateway
openclaw gateway restart
```

The setup command automatically:

- Copies plugin files to `~/.openclaw/extensions/context-compactor/`
- Adds plugin config to `openclaw.json` with sensible defaults

## Configuration

Add to `openclaw.json`:

```json
{
  "plugins": {
    "entries": {
      "context-compactor": {
        "enabled": true,
        "config": {
          "maxTokens": 8000,
          "keepRecentTokens": 2000,
          "summaryMaxTokens": 1000,
          "charsPerToken": 4
        }
      }
    }
  }
}
```

### Options

| Option | Default | Description |
|--------|---------|-------------|
| `enabled` | `true` | Enable/disable the plugin |
| `maxTokens` | `8000` | Max context tokens before compaction |
| `keepRecentTokens` | `2000` | Tokens to preserve from recent messages |
| `summaryMaxTokens` | `1000` | Max tokens for the summary |
| `charsPerToken` | `4` | Token estimation ratio |
| `summaryModel` | (session model) | Model to use for summarization |

### Tuning for Your Model

**MLX (8K context models):**

```json
{ "maxTokens": 6000, "keepRecentTokens": 1500, "charsPerToken": 4 }
```

**Larger context (32K models):**

```json
{ "maxTokens": 28000, "keepRecentTokens": 4000, "charsPerToken": 4 }
```

**Small context (4K models):**

```json
{ "maxTokens": 3000, "keepRecentTokens": 800, "charsPerToken": 4 }
```

## Commands

### `/compact-now`

Force-clear the summary cache and trigger fresh compaction on the next message.

```
/compact-now
```

### `/context-stats`

Show current context token usage and whether compaction would trigger.

```
/context-stats
```

Output:

```
📊 Context Stats
Messages: 47 total
- User: 23
- Assistant: 24
- System: 0
Estimated Tokens: ~6,234
Limit: 8,000
Usage: 77.9%
✅ Within limits
```

## How Summarization Works

When compaction triggers:

1. **Split messages** into "old" (to summarize) and "recent" (to keep)
2. **Generate summary** using the session model (or the configured `summaryModel`)
3. **Cache the summary** to avoid regenerating it for the same content
4. **Inject context** with the summary prepended

If the LLM runtime isn't available (e.g., during startup), a fallback truncation-based summary is used.

## Differences from Built-in Compaction

| Feature | Built-in | Context Compactor |
|---------|----------|-------------------|
| Trigger | Model reports overflow | Token estimate threshold |
| Works with local models | ❌ (needs overflow error) | ✅ |
| Persists to transcript | ✅ | ❌ (session-only) |
| Summarization | Pi runtime | Plugin LLM call |

Context Compactor is **complementary**: it catches cases before they hit the model's hard limit.

## Troubleshooting

**Summary quality is poor:**

- Try a better `summaryModel`
- Increase `summaryMaxTokens`
- The fallback truncation is used if the LLM runtime isn't available

**Compaction triggers too often:**

- Increase `maxTokens`
- Decrease `keepRecentTokens` (keeps less, summarizes earlier)

**Not compacting when expected:**

- Check `/context-stats` to see current usage
- Verify `enabled: true` in config
- Check logs for `[context-compactor]` messages

**Characters per token wrong:**

- The default of 4 works for English
- Try 3 for CJK languages
- Try 5 for highly technical content

## Logs

Enable debug logging:

```json
{
  "plugins": {
    "entries": {
      "context-compactor": {
        "config": { "logLevel": "debug" }
      }
    }
  }
}
```

Look for:

- `[context-compactor] Current context: ~XXXX tokens`
- `[context-compactor] Compacted X messages → summary`

## Links

- **GitHub**: https://github.com/E-x-O-Entertainment-Studios-Inc/openclaw-context-compactor
- **OpenClaw Docs**: https://docs.openclaw.ai/concepts/compaction
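The client-side token estimate described above is a simple characters-per-token ratio. A minimal sketch of that heuristic follows; the helper names are illustrative, not the plugin's actual internals:

```typescript
// Rough token estimation mirroring the charsPerToken heuristic.
// Names are illustrative; the plugin's real API may differ.
interface Message {
  role: "user" | "assistant" | "system";
  content: string;
}

function estimateTokens(text: string, charsPerToken = 4): number {
  return Math.ceil(text.length / charsPerToken);
}

function estimateContextTokens(
  messages: Message[],
  charsPerToken = 4
): number {
  return messages.reduce(
    (total, m) => total + estimateTokens(m.content, charsPerToken),
    0
  );
}

const history: Message[] = [
  { role: "user", content: "x".repeat(12000) },
  { role: "assistant", content: "y".repeat(8000) },
];

console.log(estimateContextTokens(history)); // → 5000 at the default ratio
```

A fixed ratio of 4 drifts for CJK or dense technical text, which is why `charsPerToken` is configurable rather than hard-coded.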

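The "split into old and recent" step from the README can be pictured as a backwards walk over the history under a `keepRecentTokens` budget. This is a sketch under that assumption; the function name and exact boundary handling are hypothetical:

```typescript
// Walk backwards from the newest message, keeping messages until the
// keepRecentTokens budget is spent; everything earlier becomes "old"
// and is handed to the summarizer. Hypothetical sketch, not plugin code.
interface Msg {
  role: string;
  content: string;
}

function splitForCompaction(
  messages: Msg[],
  keepRecentTokens: number,
  charsPerToken = 4
): { old: Msg[]; recent: Msg[] } {
  let budget = keepRecentTokens;
  let cut = messages.length;
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = Math.ceil(messages[i].content.length / charsPerToken);
    if (cost > budget) break; // this message no longer fits the budget
    budget -= cost;
    cut = i;
  }
  return { old: messages.slice(0, cut), recent: messages.slice(cut) };
}

// Five messages of ~1000 tokens each, with a 2000-token recent budget:
const history: Msg[] = Array.from({ length: 5 }, () => ({
  role: "user",
  content: "x".repeat(4000),
}));
const { old, recent } = splitForCompaction(history, 2000);
console.log(old.length, recent.length); // → 3 2
```

Splitting on whole-message boundaries like this keeps the recent messages verbatim, which matches the plugin's promise that only older messages are summarized.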
Tags

skill ai

Install via Conversation

This skill can be installed through a conversation on the following platforms:

OpenClaw WorkBuddy QClaw Kimi Claude

Option 1: Install SkillHub and the skill

Help me install SkillHub and the context-compactor-1776419983 skill

Option 2: Set SkillHub as the preferred skill source

Set SkillHub as my preferred skill installation source, then help me install the context-compactor-1776419983 skill

Install via Command Line

skillhub install context-compactor-1776419983

Download Zip Package

⬇ Download context-compactor v0.3.8

File size: 14.91 KB | Released: 2026-4-17 19:37

v0.3.8 (latest) 2026-4-17 19:37
v0.3.8: Enhanced Ollama detection
