# Context Compactor
Automatic context compaction for OpenClaw when using local models that don't properly report token limits or context overflow errors.
## The Problem
Cloud APIs (Anthropic, OpenAI) report context overflow errors, allowing OpenClaw's built-in compaction to trigger. Local models (MLX, llama.cpp, Ollama) often:
- Silently truncate context
- Return garbage when context is exceeded
- Don't report accurate token counts
This leaves you with broken conversations when context gets too long.
## The Solution
Context Compactor estimates tokens client-side and proactively summarizes older messages before hitting the model's limit.
## How It Works
```
┌────────────────────────────────────────────────┐
│ 1. Message arrives                             │
│ 2. before_agent_start hook fires               │
│ 3. Plugin estimates total context tokens       │
│ 4. If over maxTokens:                          │
│    a. Split into "old" and "recent" messages   │
│    b. Summarize old messages (LLM or fallback) │
│    c. Inject summary as compacted context      │
│ 5. Agent sees: summary + recent + new message  │
└────────────────────────────────────────────────┘
```
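Here is a minimal TypeScript sketch of that flow. The hook signature, the `Message` shape, and the `summarize` stand-in are illustrative assumptions, not OpenClaw's actual plugin API:

```typescript
// Sketch of the compaction flow; shapes and names are illustrative assumptions.
interface Message {
  role: "user" | "assistant" | "system";
  content: string;
}

const config = { maxTokens: 8000, keepRecentTokens: 2000, charsPerToken: 4 };

// Client-side estimate: total characters divided by charsPerToken.
const estimateTokens = (msgs: Message[]): number =>
  Math.ceil(msgs.reduce((n, m) => n + m.content.length, 0) / config.charsPerToken);

// Truncation stand-in for the LLM summarization step (see "How Summarization
// Works" below); cap roughly matches the summaryMaxTokens default of 1000.
const summarize = (msgs: Message[]): string =>
  msgs.map((m) => m.content).join("\n").slice(0, 1000 * config.charsPerToken);

function beforeAgentStart(messages: Message[]): Message[] {
  if (estimateTokens(messages) <= config.maxTokens) return messages; // under limit, no-op

  // Walk backwards, keeping recent messages until the keepRecentTokens budget is spent.
  let budget = config.keepRecentTokens * config.charsPerToken; // budget in characters
  let cut = messages.length;
  while (cut > 0 && budget >= messages[cut - 1].content.length) {
    budget -= messages[cut - 1].content.length;
    cut--;
  }

  const old = messages.slice(0, cut);
  const recent = messages.slice(cut);
  return [
    { role: "system", content: `Summary of earlier conversation:\n${summarize(old)}` },
    ...recent,
  ];
}
```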
## Installation
```bash
# One command setup (recommended)
npx jasper-context-compactor setup
# Restart gateway
openclaw gateway restart
```
The setup command automatically:
- Copies plugin files to `~/.openclaw/extensions/context-compactor/`
- Adds plugin config to `openclaw.json` with sensible defaults
## Configuration
Add to `openclaw.json`:
```json
{
  "plugins": {
    "entries": {
      "context-compactor": {
        "enabled": true,
        "config": {
          "maxTokens": 8000,
          "keepRecentTokens": 2000,
          "summaryMaxTokens": 1000,
          "charsPerToken": 4
        }
      }
    }
  }
}
```
### Options
| Option | Default | Description |
|--------|---------|-------------|
| `enabled` | `true` | Enable/disable the plugin |
| `maxTokens` | `8000` | Max context tokens before compaction |
| `keepRecentTokens` | `2000` | Tokens to preserve from recent messages |
| `summaryMaxTokens` | `1000` | Max tokens for the summary |
| `charsPerToken` | `4` | Token estimation ratio |
| `summaryModel` | (session model) | Model to use for summarization |
### Tuning for Your Model
Set `maxTokens` below the model's actual context window so there is headroom left for the system prompt and the model's reply.
**MLX (8K context models):**
```json
{
  "maxTokens": 6000,
  "keepRecentTokens": 1500,
  "charsPerToken": 4
}
```
**Larger context (32K models):**
```json
{
  "maxTokens": 28000,
  "keepRecentTokens": 4000,
  "charsPerToken": 4
}
```
**Small context (4K models):**
```json
{
  "maxTokens": 3000,
  "keepRecentTokens": 800,
  "charsPerToken": 4
}
```
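If your model's context window isn't listed above, a helper like the following derives comparable numbers. The ratios are our assumption extrapolated from the presets above, not a documented rule:

```typescript
// Rough starting point for deriving settings from a model's context window.
// The 0.75 / 0.25 / 0.125 ratios are assumptions, not the plugin's own logic.
function suggestConfig(contextWindow: number) {
  const maxTokens = Math.floor(contextWindow * 0.75);     // leave headroom for the reply
  const keepRecentTokens = Math.floor(maxTokens * 0.25);  // preserve recent turns verbatim
  const summaryMaxTokens = Math.floor(maxTokens * 0.125); // cap the injected summary
  return { maxTokens, keepRecentTokens, summaryMaxTokens, charsPerToken: 4 };
}

console.log(suggestConfig(8192));
// → { maxTokens: 6144, keepRecentTokens: 1536, summaryMaxTokens: 768, charsPerToken: 4 }
```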
## Commands
### `/compact-now`
Clears the summary cache and triggers fresh compaction on the next message.
```
/compact-now
```
### `/context-stats`
Shows current context token usage and whether compaction would trigger.
```
/context-stats
```
Example output (`Usage` is the token estimate divided by `maxTokens`):
```
📊 Context Stats
Messages: 47 total
  - User: 23
  - Assistant: 24
  - System: 0
Estimated Tokens: ~6,234
Limit: 8,000
Usage: 77.9%
✅ Within limits
```
## How Summarization Works
When compaction triggers:
1. **Split messages** into "old" (to summarize) and "recent" (to keep)
2. **Generate summary** using the session model (or configured `summaryModel`)
3. **Cache the summary** to avoid regenerating for the same content
4. **Inject context** with the summary prepended
If the LLM runtime isn't available (e.g., during startup), a fallback truncation-based summary is used.
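A hedged sketch of steps 2–4 in TypeScript. The `callModel` callback and the cache-key scheme are illustrative assumptions; the plugin's actual internals may differ:

```typescript
// Summarize-with-cache sketch: hash the old messages so identical content
// reuses a cached summary, and fall back to truncation when no LLM is available.
import { createHash } from "node:crypto";

const summaryCache = new Map<string, string>();

async function summarizeOld(
  oldMessages: { role: string; content: string }[],
  callModel?: (prompt: string) => Promise<string>, // undefined during startup
): Promise<string> {
  const text = oldMessages.map((m) => `${m.role}: ${m.content}`).join("\n");
  const key = createHash("sha256").update(text).digest("hex");

  const cached = summaryCache.get(key);
  if (cached) return cached; // same content, no regeneration

  // Fallback: truncation-based summary when the LLM runtime isn't available.
  const summary = callModel
    ? await callModel(`Summarize this conversation concisely:\n${text}`)
    : text.slice(0, 4000); // ~1000 tokens at 4 chars/token

  summaryCache.set(key, summary);
  return summary;
}
```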
## Differences from Built-in Compaction
| Feature | Built-in | Context Compactor |
|---------|----------|-------------------|
| Trigger | Model reports overflow | Token estimate threshold |
| Works with local models | ❌ (needs overflow error) | ✅ |
| Persists to transcript | ✅ | ❌ (session-only) |
| Summarization | Pi runtime | Plugin LLM call |
Context Compactor is **complementary** — it catches cases before they hit the model's hard limit.
## Troubleshooting
**Summary quality is poor:**
- Try a better `summaryModel`
- Increase `summaryMaxTokens`
- The fallback truncation is used if LLM runtime isn't available
**Compaction triggers too often:**
- Increase `maxTokens`
- Decrease `keepRecentTokens` (keeps fewer recent messages verbatim, so each compaction frees more room)
**Not compacting when expected:**
- Check `/context-stats` to see current usage
- Verify `enabled: true` in config
- Check logs for `[context-compactor]` messages
**Token estimates seem off (`charsPerToken`):**
- Default of 4 works for English
- Try 3 for CJK languages
- Try 5 for highly technical content
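If your local runtime reports real token counts (many servers include a usage field in responses), you can back the ratio out directly. The sample values below are made up for illustration:

```typescript
// Calibrate charsPerToken against a real token count from your runtime.
// sampleText and reportedTokens are placeholder values; substitute a
// representative chunk of your own transcript and its reported usage.
const sampleText = "paste a representative chunk of your conversation here";
const reportedTokens = 14; // e.g. from your local server's usage field

const ratio = sampleText.length / reportedTokens;
console.log(`charsPerToken ≈ ${ratio.toFixed(1)}; round to an integer for the config`);
```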
## Logs
Enable debug logging:
```json
{
  "plugins": {
    "entries": {
      "context-compactor": {
        "config": {
          "logLevel": "debug"
        }
      }
    }
  }
}
```
Look for:
- `[context-compactor] Current context: ~XXXX tokens`
- `[context-compactor] Compacted X messages → summary`
## Links
- **GitHub**: https://github.com/E-x-O-Entertainment-Studios-Inc/openclaw-context-compactor
- **OpenClaw Docs**: https://docs.openclaw.ai/concepts/compaction