midos-self-improver
An agent learning system that captures what goes wrong, what gets corrected, and what works, then promotes the best learnings into permanent project memory. Quality gates prevent noise from polluting your knowledge base.
Most self-improving agents dump everything into a flat file. Over time, that file becomes a graveyard of one-off notes that never get cleaned up. midos-self-improver solves this with a capture → quality gate → staging → scoring → promotion pipeline where every learning must prove its value through recurrence before it becomes permanent.
Architecture
CODEBLOCK0
The 5 Detection Triggers
| Trigger | What It Captures | Example |
|---|---|---|
| Correction | User corrects agent behavior | "Don't use git add ., use specific files" |
| Error | Tool call fails or returns unexpected result | ImportError, test failure, API timeout |
| Knowledge Gap | Agent didn't know something it should have | "The config file moved to /new/path" |
| Best Practice | Successful pattern worth repeating | "Running preflight before publish prevented 3 issues" |
| Pattern | Recurring code structure or workflow | "Every MCP tool needs tier guard + handler separation" |
Detection hooks
CODEBLOCK1
Quality Gate (Deterministic)
Before any learning enters the staging area, it passes through a quality gate:
Deduplication
CODEBLOCK2
Decision Check
CODEBLOCK3
Only entries that pass both checks advance to the staging area.
4-Axis Scoring
Every staged learning gets scored on 4 axes:
| Axis | Weight | What It Measures |
|---|---|---|
| Recurrence | 0.35 | How many times this same issue/pattern appeared |
| Freshness | 0.25 | How recent (exponential decay, half-life 14 days) |
| Specificity | 0.20 | Concrete file paths/functions vs vague advice |
| Impact | 0.20 | Breadth of effect (multi-domain > single file) |
Scoring formulas
CODEBLOCK4
Promotion thresholds
CODEBLOCK5
Quick Start
Standalone Mode (zero dependencies)
Add to your CLAUDE.md or agent instructions:
CODEBLOCK6
With the capture hooks
CODEBLOCK7
Triggering promotion
CODEBLOCK8
Usage Patterns
Pattern 1: Correction Loop
CODEBLOCK9
Pattern 2: Error Prevention
CODEBLOCK10
Pattern 3: Noise Rejection
CODEBLOCK11
How It Compares
| Feature | midos-self-improver | self-improving-agent (101K) | proactive-agent (54K) |
|---|---|---|---|
| Promotion tiers | 4 (entry → staging → chunks → rules) | 2 (.learnings → CLAUDE.md) | 1 (WAL → manual) |
| Quality gate | Deterministic (dedup + decision check) | None | None |
| Deduplication | SHA-256 + trigram similarity | None | None |
| Scoring | 4-axis composite (recurrence, freshness, specificity, impact) | Manual review | VFM scoring (manual) |
| Promotion trigger | Automatic at threshold | Manual (activator.sh) | Manual |
| Noise rejection | Yes (quality gate rejects non-decisions) | No (logs everything) | No |
| Categories | 5 types with domain tagging | 3 files (LEARNINGS, ERRORS, FEATURES) | 1 file (WAL) |
| Maturation | Staging area with aging | None | None |
| Archival | Auto-prune at score < 0.3 | None | None |
| Hook integration | PostToolUse + UserPromptSubmit | PostToolUse + UserPromptSubmit | Manual |
| Works without LLM | Yes (all deterministic) | Yes | Yes |
Entry Format
CODEBLOCK12
MidOS-Connected Mode
When running inside the MidOS ecosystem, the self-improver gains:
- GEPA coherence scoring validates promoted chunks against the knowledge base
- L2R reranker helps find truly similar existing patterns (prevents subtle duplicates)
- Vector dedup via LanceDB cosine similarity (catches semantic duplicates, not just textual)
- Auto-promotion pipeline with MC-2 deliverable gates (frontmatter, length, coherence)
- Pattern harvester hook wired to every Write|Edit operation
- Scheduled assessment via your cron/scheduler system (runs every 2 hours)
- MCP tools: learning_log, learning_search, learning_stats exposed via MCP server
The standalone mode handles 80% of learning scenarios. The ecosystem adds deeper dedup, quality scoring, and integration with the 6-layer knowledge pipeline.
Built with MidOS: MCP Community Library.
This is 1 of 200+ skills in the MidOS ecosystem.
Free MCP access: midos.dev/dev (500 queries/mo)
Full ecosystem: midos.dev/pro ($20/mo)
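Skill name: midos-self-improver
Description:
midos-self-improver
An agent learning system that captures what goes wrong, what gets corrected, and what works, then promotes the best learnings into permanent project memory. Quality gates prevent noise from polluting the knowledge base.
Most self-improving agents dump everything into a flat file. Over time, that file becomes a graveyard of one-off notes that never get cleaned up. midos-self-improver solves this with a capture → quality gate → staging → scoring → promotion pipeline in which every learning must prove its value through recurrence before it becomes permanent.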
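Architecture
Agent session
↓
[Detectors] - 5 trigger types
↓
.learnings/entries/{category}/{timestamp}.json
↓
[Quality gate] - dedup + decision check
↓
.patterns/{domain}_pattern.md (staging)
↓
[4-axis scorer] - recurrence, freshness, specificity, impact
↓
.knowledge/ (permanent) ← only if score >= 0.7
↓
CLAUDE.md / AGENTS.md (promoted rules)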
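The 5 Detection Triggers
| Trigger | What It Captures | Example |
|---|---|---|
| Correction | User corrects agent behavior | "Don't use git add ., use specific files" |
| Error | Tool call fails or returns unexpected result | ImportError, test failure, API timeout |
| Knowledge Gap | Agent didn't know something it should have | "The config file moved to /new/path" |
| Best Practice | Successful pattern worth repeating | "Running preflight before publish prevented 3 issues" |
| Pattern | Recurring code structure or workflow | "Every MCP tool needs tier guard + handler separation" |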
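Detection hooks
```bash
# Correction detector: fires on UserPromptSubmit when correction language is detected
# Patterns: "no, do X instead", "that's wrong", "actually", "I said", "don't do that"

# Error detector: fires on PostToolUse when a tool returns an error
# Captures: exit code != 0, exception traces, "Error:" in output

# Gap detector: fires when the agent says "I don't know" or searches for the same thing more than 3 times

# Pattern detector: fires on PostToolUse Write|Edit
# Analyzes: which decisions were made, which trade-offs were considered
```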
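As a rough illustration of the correction and error detectors described above, the sketch below matches the listed phrases with a regular expression and checks the three error signals. The function names and the exact regex list are assumptions made for this example; they are not the hooks shipped with midos-self-improver.

```python
import re

# Hypothetical phrase list derived from the correction patterns listed above.
CORRECTION_PATTERNS = [
    r"\bno,",
    r"\binstead\b",
    r"\bthat'?s wrong\b",
    r"\bactually\b",
    r"\bi said\b",
    r"\bdon'?t\b",
]
_CORRECTION_RE = re.compile("|".join(CORRECTION_PATTERNS), re.IGNORECASE)


def looks_like_correction(user_message: str) -> bool:
    """Return True if the message contains correction-style language."""
    return _CORRECTION_RE.search(user_message) is not None


def looks_like_tool_error(exit_code: int, output: str) -> bool:
    """Return True for a non-zero exit code, a Python traceback, or 'Error:' in the output."""
    return (
        exit_code != 0
        or "Traceback (most recent call last)" in output
        or "Error:" in output
    )
```

A phrase list like this is cheap and deterministic, but it will produce false positives ("actually" is common in ordinary requests), which is one reason a quality gate downstream is useful.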
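Quality Gate (Deterministic)
Before any learning enters the staging area, it passes through a quality gate:
Deduplication
1. SHA-256 hash of the normalized content (lowercased, whitespace stripped)
2. Compare against all entries from the past 30 days
3. If the hash exists → increment the recurrence counter, skip creation
4. If similar (>85% trigram overlap) → merge into the existing entry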
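The deduplication steps above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: "trigram overlap" is interpreted as Jaccard similarity over character trigrams, normalization collapses runs of whitespace, and the 30-day window is left to the caller, who passes in only recent entries. All function names are hypothetical.

```python
import hashlib


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace (one reading of 'lowercased, whitespace stripped')."""
    return " ".join(text.lower().split())


def content_hash(text: str) -> str:
    """SHA-256 hex digest of the normalized content."""
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def trigrams(text: str) -> set:
    """Set of character trigrams of the normalized content."""
    s = normalize(text)
    return {s[i:i + 3] for i in range(len(s) - 2)}


def trigram_overlap(a: str, b: str) -> float:
    """Jaccard similarity of the two trigram sets (assumed overlap measure)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def dedup_action(new_text: str, recent_texts: list, threshold: float = 0.85) -> str:
    """Return 'increment', 'merge', or 'create' for a new entry."""
    new_hash = content_hash(new_text)
    if any(content_hash(old) == new_hash for old in recent_texts):
        return "increment"  # exact duplicate: bump the recurrence counter
    if any(trigram_overlap(new_text, old) > threshold for old in recent_texts):
        return "merge"      # near duplicate: fold into the existing entry
    return "create"
```

Checking the exact hash first keeps the common case cheap; the trigram comparison only runs when no identical entry exists.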
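Decision Check
Rules (no LLM needed):
1. >= 2 decisions extracted from the pattern → PASS
2. >= 3 files across >= 2 domains → PASS (cross-domain)
3. Docstrings only, no decisions → FAIL (documentation, not a pattern)
4. All files belong to the same trivial edit → FAIL (maintenance, not a learning)
Only entries that pass both checks advance to the staging area.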
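To make the four rules concrete, here is one way they might be encoded. The source does not say in which order the rules are evaluated or what happens when none applies, so this sketch assumes the FAIL rules are checked first and that an entry matching no rule is rejected. The `Candidate` type and its fields are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """Hypothetical summary of a captured pattern, as seen by the gate."""
    decisions: list = field(default_factory=list)  # extracted decision statements
    files: list = field(default_factory=list)      # touched file paths
    domains: set = field(default_factory=set)      # domain tags
    docstring_only: bool = False                   # change only touched docstrings
    trivial_edit: bool = False                     # all files part of one trivial edit


def decision_check(c: Candidate) -> tuple:
    """Return (passed, reason) using the four deterministic rules."""
    if c.docstring_only and not c.decisions:
        return False, "documentation, not a pattern"
    if c.trivial_edit:
        return False, "maintenance, not a learning"
    if len(c.decisions) >= 2:
        return True, "has decisions"
    if len(c.files) >= 3 and len(c.domains) >= 2:
        return True, "cross-domain"
    return False, "no rule matched"  # assumed default
```

Returning the reason alongside the verdict makes it possible to log why an entry was discarded.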
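4-Axis Scoring
Every staged learning is scored on 4 axes:
| Axis | Weight | What It Measures |
|---|---|---|
| Recurrence | 0.35 | How many times this same issue/pattern appeared |
| Freshness | 0.25 | How recent (exponential decay, half-life 14 days) |
| Specificity | 0.20 | Concrete file paths/functions vs vague advice |
| Impact | 0.20 | Breadth of effect (multi-domain > single file) |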
Scoring formulas
recurrence_score = min(count / 5, 1.0) # saturates at 5 occurrences
freshness_score = exp(-0.693 * days_since / 14) # half-life of 14 days
specificity_score = (has_path * 0.4) + (has_function * 0.3) + (has_example * 0.3)
impact_score = min(n_domains / 3, 1.0) * 0.6 + min(n_files / 5, 1.0) * 0.4
composite = (recurrence * 0.35) + (freshness * 0.25) +
            (specificity * 0.20) + (impact * 0.20)
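The formulas translate almost directly into runnable Python. The sketch below follows them term by term; the only liberties taken are using `math.log(2)` in place of the rounded constant 0.693 and the function and constant names, which are chosen for this example.

```python
import math

# Weights from the 4-axis table.
WEIGHTS = {"recurrence": 0.35, "freshness": 0.25, "specificity": 0.20, "impact": 0.20}
HALF_LIFE_DAYS = 14


def recurrence_score(count: int) -> float:
    """Saturates at 5 occurrences."""
    return min(count / 5, 1.0)


def freshness_score(days_since: float) -> float:
    """Exponential decay with a 14-day half-life (ln 2 is roughly 0.693)."""
    return math.exp(-math.log(2) * days_since / HALF_LIFE_DAYS)


def specificity_score(has_path: bool, has_function: bool, has_example: bool) -> float:
    """Booleans count as 0 or 1, so each present feature adds its weight."""
    return has_path * 0.4 + has_function * 0.3 + has_example * 0.3


def impact_score(n_domains: int, n_files: int) -> float:
    """Domain breadth counts for 60%, file breadth for 40%."""
    return min(n_domains / 3, 1.0) * 0.6 + min(n_files / 5, 1.0) * 0.4


def composite_score(recurrence: float, freshness: float,
                    specificity: float, impact: float) -> float:
    """Weighted sum of the four axis scores."""
    return (recurrence * WEIGHTS["recurrence"]
            + freshness * WEIGHTS["freshness"]
            + specificity * WEIGHTS["specificity"]
            + impact * WEIGHTS["impact"])
```

For axis scores of 0.6, 0.95, 0.7 and 0.8, the composite is 0.21 + 0.2375 + 0.14 + 0.16 = 0.7475. Because recurrence carries the largest weight and saturates only at five occurrences, a single occurrence contributes at most 0.07 to the composite.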
Promotion thresholds
composite >= 0.7 → promote to the permanent knowledge base
composite < 0.3 → prune (archive and stop tracking)
0.3 <= c < 0.7 → keep in staging (let it mature with more data)
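A small sketch of how these thresholds could drive a triage pass over the staging area. The action label "PROMOTE" appears in the hook output shown elsewhere in this document; "HOLD" and "PRUNE" are labels assumed for this example, as are the function names.

```python
PROMOTE_AT = 0.7   # composite >= 0.7 is promoted
PRUNE_BELOW = 0.3  # composite < 0.3 is archived


def promotion_action(composite: float) -> str:
    """Map a composite score to PROMOTE, HOLD, or PRUNE."""
    if composite >= PROMOTE_AT:
        return "PROMOTE"
    if composite < PRUNE_BELOW:
        return "PRUNE"
    return "HOLD"


def triage(staged: dict) -> dict:
    """Group staged pattern names by the action their composite score implies."""
    buckets = {"PROMOTE": [], "HOLD": [], "PRUNE": []}
    for name, score in staged.items():
        buckets[promotion_action(score)].append(name)
    return buckets
```

The boundaries follow the inequalities as written: exactly 0.7 promotes and exactly 0.3 stays in staging.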
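Quick Start
Standalone Mode (zero dependencies)
Add to your CLAUDE.md or agent instructions:
```markdown
Self-Improvement Protocol

On Corrections
When the user corrects you:
1. Log the correction to .learnings/corrections/{date}.md
2. Include: what you did wrong, what the correct behavior is, which file/function
3. If the same correction appears 3 or more times → promote it to a CLAUDE.md rule

On Errors
When a tool call fails:
1. Log it to .learnings/errors/{date}.md
2. Include: the command, the error message, the root cause, the fix applied
3. If the same error type appears 3 or more times → create a prevention rule

On Patterns
When you notice a recurring approach that works:
1. Log it to .learnings/patterns/{domain}/{date}.md
2. Include: what decisions were made, why this option was chosen over the alternatives, evidence that it worked
3. A pattern must contain >= 2 concrete decisions to be logged (not just a description)

Promotion Rules
- Recurrence >= 3 AND composite score >= 0.6 → promote to permanent memory
- Never promote without evidence of recurring value
- Dedup: check the SHA-256 hash before writing a new entry
- Archive entries older than 30 days with a score < 0.3
```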
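In standalone mode the agent itself follows the protocol, but the same bookkeeping can be scripted. The sketch below appends a correction to `.learnings/corrections/{date}.md` and counts how often the same corrected behavior has been logged, so the "3 or more times" rule can be checked. The entry layout and both function names are assumptions for illustration; the protocol does not prescribe a file format beyond the path.

```python
from datetime import date
from pathlib import Path


def log_correction(root: str, wrong: str, right: str, where: str, today: str = "") -> Path:
    """Append one correction entry to .learnings/corrections/{date}.md under root."""
    day = today or date.today().isoformat()
    path = Path(root) / ".learnings" / "corrections" / f"{day}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(f"- wrong: {wrong}\n  right: {right}\n  where: {where}\n")
    return path


def count_occurrences(root: str, right: str) -> int:
    """Count logged corrections whose 'right' line matches exactly, across all dates."""
    folder = Path(root) / ".learnings" / "corrections"
    needle = f"  right: {right}\n"
    return sum(p.read_text(encoding="utf-8").count(needle) for p in folder.glob("*.md"))


def should_promote(root: str, right: str, min_recurrence: int = 3) -> bool:
    """True once the same correction has been logged min_recurrence times or more."""
    return count_occurrences(root, right) >= min_recurrence
```

Exact string matching is the simplest possible recurrence test; rephrased corrections would be missed, which is the gap the hash-plus-trigram deduplication is meant to close.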
With the capture hooks
```python
# Correction capture (wire to UserPromptSubmit)
from hooks.learning_capture import capture_correction

capture_correction(
    user_message="No, always use specific files in git add",
    agent_response="I'll use git add file1 file2 instead of git add .",
    context={"file": "CLAUDE.md", "function": "commit_protocol"},
)

# Error capture (wire to PostToolUse)
from hooks.learning_capture import capture_error

capture_error(
    tool="Bash",
    command="python -m pytest tests/",
    error="ImportError: cannot import name 'AuthMiddleware'",
    fix="Changed to absolute import: from modules.community_mcp.auth import AuthMiddleware",
)

# Assess all staged patterns
from hooks.pattern_harvester import assess_pattern_value

results = assess_pattern_value()
# Returns: [{"file": "...", "score": 0.82, "action": "PROMOTE"}, ...]
```
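Triggering promotion
```bash
# Run assessment on all staged patterns
python -c "from hooks.pattern_harvester import assess_pattern_value; assess_pattern_value()"

# Check what is in staging
ls docs/patterns/

# Check what was promoted
ls .knowledge/ | grep pattern

# Check what was discarded
cat knowledge/_discarded/LOG.md
```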
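Usage Patterns
Pattern 1: Correction Loop
User: "Don't read the whole file, use grep first"
↓
Detector: correction language detected ("don't", imperative)
↓
Entry: .learnings/corrections/2026-03-04T10:23:45.json
{
  "type": "correction",
  "wrong": "Reading the entire file with cat/Read",
  "right": "Search for the pattern with grep first, then Read with an offset",
  "context": {"domain": "efficiency"},
  "recurrence": 1
}
↓
(same correction appears 2 more times within 3 days)
↓
recurrence_score: 0.6 (3/5)
freshness_score: 0.95 (recent)
specificity_score: 0.7 (names specific tools)
impact_score: 0.8 (affects all file operations)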
com