Agent Resilience
Patterns for surviving context loss, capturing corrections, and continuously improving.
WAL Protocol (Write-Ahead Logging)
The Law: Chat history is a buffer, not storage. Files survive; context doesn't.
Trigger — scan every message for:
- ✏️ Corrections: "It's X, not Y" / "Actually..." / "No, I meant..."
- 📍 Proper nouns: names, places, companies, products
- 🎨 Preferences: styles, approaches, "I like/don't like"
- 📋 Decisions: "Let's do X" / "Go with Y"
- 🔢 Specific values: numbers, dates, IDs, URLs
If any appear:
1. WRITE FIRST → update memory/SESSION-STATE.md
2. THEN respond
The urge to respond is the enemy. Write before replying.
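The write-then-respond order above can be sketched in Python. This is a minimal illustration, not a definitive implementation: the regex patterns, `find_triggers`, and `handle_message` are hypothetical names invented here, and proper-noun detection is left out because a simple regex cannot do it reliably.

```python
import re
from pathlib import Path

# Hypothetical patterns (assumption): the source names the trigger
# categories but does not say how to detect them.
TRIGGER_PATTERNS = {
    "correction": re.compile(r"\bactually\b|\bno, i meant\b|\bit's .+, not\b", re.IGNORECASE),
    "preference": re.compile(r"\bi (?:like|don't like|prefer)\b", re.IGNORECASE),
    "decision": re.compile(r"\blet's\b|\bgo with\b", re.IGNORECASE),
    "value": re.compile(r"https?://\S+|\b\d{4}-\d{2}-\d{2}\b|\b\d+\b"),
}


def find_triggers(message: str) -> list[str]:
    """Return the names of all trigger categories found in the message."""
    return [name for name, pattern in TRIGGER_PATTERNS.items() if pattern.search(message)]


def handle_message(message: str, state_path: Path, respond):
    """Append triggered messages to the state file BEFORE calling respond()."""
    hits = find_triggers(message)
    if hits:
        state_path.parent.mkdir(parents=True, exist_ok=True)
        with state_path.open("a", encoding="utf-8") as f:
            f.write(f"- [{', '.join(hits)}] {message}\n")
    # Only now is the reply produced.
    return respond(message)
```

The point of the design is ordering: the file write completes before `respond` is ever called, so a lost context cannot take the captured detail with it.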
SESSION-STATE.md
Active working memory for the current task. Create at memory/SESSION-STATE.md:
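    Session State
    Task: [what we're doing]
    Key decisions: [decisions made so far]
    Details: [corrections, names, and values captured via WAL]
    Next step: [what to do next]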
Reset when starting a new unrelated task.
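One way to keep the session state structured in code is a small dataclass that renders to the file's text form. This is a sketch under assumptions: the four fields (task, key decisions, details, next step) follow the session-state template, while `SessionState`, `render`, and `start_new_task` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class SessionState:
    """In-memory mirror of memory/SESSION-STATE.md (field names are assumptions)."""
    task: str = ""
    decisions: list[str] = field(default_factory=list)
    details: list[str] = field(default_factory=list)
    next_step: str = ""

    def render(self) -> str:
        """Produce the text to write to the state file."""
        lines = [
            "Session State",
            f"Task: {self.task}",
            f"Key decisions: {'; '.join(self.decisions)}",
            f"Details: {'; '.join(self.details)}",
            f"Next step: {self.next_step}",
        ]
        return "\n".join(lines) + "\n"


def start_new_task(task: str) -> SessionState:
    """Reset: a new unrelated task begins with empty decisions and details."""
    return SessionState(task=task)
```

Calling `start_new_task` rather than mutating the old object makes the reset explicit, so nothing from the previous task leaks into the new state.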
Working Buffer (Danger Zone)
When context reaches ~60%, start logging every exchange to memory/working-buffer.md:
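    Working Buffer
    Status: ACTIVE, started [timestamp]
    [time] Human
    [their message]
    [time] Agent
    [1-2 sentence summary + key details]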
Clear the buffer at the START of the next 60% threshold (not continuously).
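The threshold behavior can be sketched as a small class. This is illustrative only: how context usage is measured is not specified in the source, so `context_used` (a fraction between 0 and 1) is an assumed input, and `WorkingBuffer` and `record` are hypothetical names.

```python
from datetime import datetime
from pathlib import Path

THRESHOLD = 0.60  # the ~60% mark from the text


class WorkingBuffer:
    """Logs every exchange while context usage is at or above THRESHOLD."""

    def __init__(self, path: Path):
        self.path = path
        self.active = False

    def record(self, context_used: float, human: str, agent_summary: str) -> bool:
        """Log one exchange if in the danger zone; return True if logged."""
        if context_used < THRESHOLD:
            # Below the threshold (e.g. after compaction): stop logging,
            # but keep the file so recovery can still read it.
            self.active = False
            return False
        now = datetime.now().isoformat(timespec="seconds")
        if not self.active:
            # Crossing the threshold again: clear the old buffer once, here.
            self.path.parent.mkdir(parents=True, exist_ok=True)
            self.path.write_text(
                f"Working Buffer\nStatus: ACTIVE, started {now}\n", encoding="utf-8"
            )
            self.active = True
        with self.path.open("a", encoding="utf-8") as f:
            f.write(f"[{now}] Human\n{human}\n[{now}] Agent\n{agent_summary}\n")
        return True
```

The buffer is cleared only on the transition into the danger zone, never while it is active, which matches the "not continuously" rule and leaves the previous buffer readable until the next threshold crossing.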
Compaction Recovery
Trigger automatically when a session starts with a summary tag, or when the human asks "where were we?":
1. Read memory/working-buffer.md: raw danger-zone exchanges
2. Read memory/SESSION-STATE.md: active task state
3. Read today's and yesterday's daily notes
4. Extract key context back into SESSION-STATE.md
5. Respond: "Recovered from buffer. Last task was X. Continue?"
Never ask "what were we discussing?" — read the buffer first.
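The recovery read order can be expressed as a short Python sketch. Several details are assumptions, since the source does not give them: the summary tag is taken to look like `<summary`, daily notes are taken to live at `memory/YYYY-MM-DD.md`, and all function names are hypothetical.

```python
from datetime import date, timedelta
from pathlib import Path


def needs_recovery(first_message: str) -> bool:
    """True if the session opens with a summary tag or the human asks where we were."""
    text = first_message.strip().lower()
    # "<summary" is an assumed tag format.
    return text.startswith("<summary") or "where were we" in text


def recovery_sources(memory_dir: Path, today: date) -> list[Path]:
    """Files to read, in the order given by the recovery steps."""
    yesterday = today - timedelta(days=1)
    return [
        memory_dir / "working-buffer.md",
        memory_dir / "SESSION-STATE.md",
        memory_dir / f"{today.isoformat()}.md",      # assumed daily-note name
        memory_dir / f"{yesterday.isoformat()}.md",  # assumed daily-note name
    ]


def load_recovery_context(memory_dir: Path, today: date) -> dict[str, str]:
    """Read every source that exists; missing files are skipped, not errors."""
    context = {}
    for path in recovery_sources(memory_dir, today):
        if path.exists():
            context[path.name] = path.read_text(encoding="utf-8")
    return context
```

Extracting key context back into SESSION-STATE.md and composing the reply are left to the agent; the sketch covers only detection and the read order.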
Verify Before Reporting
Before saying "done", "complete", "finished":
1. STOP
2. Actually test from the user's perspective
3. Verify the outcome, not just that code exists
4. Only THEN report complete
Text changes ≠ behavior changes. When changing how something works, identify the component responsible for the behavior and change the actual mechanism, not just the text that describes it.
Relentless Resourcefulness
Try 10 approaches before asking for help or saying "can't":
- Different CLI flags, tools, or API endpoints
- Check memory: "Have I done this before?"
- Spawn a research sub-agent
- Grep logs for past successes
"Can't" = exhausted all options. Not "first try failed."
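The "exhaust options first" rule can be sketched as a loop over candidate approaches. This is an illustration under assumptions: `try_approaches` and `may_say_cant` are hypothetical names, and each approach is modeled as a zero-argument callable that returns a result or `None`.

```python
from typing import Callable, Optional

MIN_ATTEMPTS = 10  # "Try 10 approaches" from the text


def try_approaches(
    approaches: list[tuple[str, Callable[[], Optional[str]]]],
) -> tuple[Optional[str], list[str]]:
    """Run approaches in order; return the first result plus a log of failures."""
    failures: list[str] = []
    for name, attempt in approaches:
        try:
            result = attempt()
        except Exception as exc:  # a failed approach is logged, not fatal
            failures.append(f"{name}: {exc}")
            continue
        if result is not None:
            return result, failures
        failures.append(f"{name}: no result")
    return None, failures


def may_say_cant(failures: list[str]) -> bool:
    """Giving up is allowed only after at least MIN_ATTEMPTS failed approaches."""
    return len(failures) >= MIN_ATTEMPTS
```

Keeping the failure log also gives the agent something concrete to report if it does eventually ask for help.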
Self-Improvement Guardrails
When updating behavior/config based on a lesson:
Score the change first (skip if < 50 weighted points):
- High frequency (daily use?) → 3×
- Reduces failures → 3×
- Saves user effort → 2×
- Saves future-agent tokens/time → 2×
Ask: "Does this let future-me solve more problems with less cost?" If no, skip it.
Forbidden: complexity for its own sake, changes you can't verify worked, vague justifications.
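The weighted score can be computed as follows. The weights (3, 3, 2, 2) and the 50-point cutoff come from the list above; the 0 to 10 rating scale per criterion is an assumption (it makes the maximum 100), since the source does not state the scale. The key names and function names are hypothetical.

```python
# Weights from the scoring list; the dictionary keys are hypothetical labels.
WEIGHTS = {
    "frequency": 3,
    "reduces_failures": 3,
    "saves_user_effort": 2,
    "saves_agent_cost": 2,
}
THRESHOLD = 50  # skip the change if the weighted score is below this


def weighted_score(ratings: dict[str, int]) -> int:
    """Sum weight * rating; each rating is assumed to be on a 0-10 scale."""
    for name, value in ratings.items():
        if name not in WEIGHTS:
            raise KeyError(f"unknown criterion: {name}")
        if not 0 <= value <= 10:
            raise ValueError(f"rating for {name} must be between 0 and 10")
    # Criteria missing from ratings count as 0.
    return sum(weight * ratings.get(name, 0) for name, weight in WEIGHTS.items())


def should_apply(ratings: dict[str, int]) -> bool:
    """Apply the change only if it reaches the threshold."""
    return weighted_score(ratings) >= THRESHOLD
```

For example, ratings of 8, 6, 3, and 1 give 24 + 18 + 6 + 2 = 50, which just meets the cutoff.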
Quick Start Checklist
For long/complex tasks:
- [ ] Create memory/SESSION-STATE.md with task + context
- [ ] Apply WAL: write corrections/decisions before responding
- [ ] At ~60% context: start working buffer
- [ ] After any compaction: read buffer before asking questions
- [ ] Before reporting done: verify actual outcome