MidOS Memory Cascade
A self-tuning, auto-escalating search engine that tries each memory tier from fastest to slowest, stopping as soon as it finds a high-confidence answer.
What It Does
Instead of the agent deciding which storage layer to query, the cascade tries each tier automatically:
| Tier | Storage | Latency | Strategy |
|---|---|---|---|
| T0 | In-memory session cache | <1ms | Exact + fuzzy key match |
| T1 | JSON state files | <5ms | Filename + key match |
| T2 | SQLite (pipeline_synergy.db) | <5ms | Structured SQL LIKE |
| T3 | SQLite FTS5 | <1ms | Full-text keyword on 22K rows |
| T4 | Grep over 46K chunks | ~3s | Brute-force ripgrep fallback |
| T5 | LanceDB keyword (BM25) | slow | 670K vector rows, no embeddings |
| T5b | LanceDB semantic | 3–30s | Embedding similarity, last resort |
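The escalation loop described above can be sketched roughly as follows. This is a minimal illustration and not the MidOS implementation: the `Tier` type, the `cascade` function, and the two toy tier functions are hypothetical names, and each tier is assumed to return an `(answer, confidence)` pair.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional

# Assumed tier signature: take a query, return (answer, confidence in 0..1).
SearchFn = Callable[[str], tuple[Optional[str], float]]


@dataclass
class Tier:
    name: str
    search: SearchFn


def cascade(query: str, tiers: list[Tier], min_confidence: float = 0.5) -> dict:
    """Try tiers in order; stop once the best answer reaches min_confidence."""
    start = time.perf_counter()
    escalation = []
    best_answer, best_conf, resolved_at = None, 0.0, None
    for tier in tiers:
        answer, conf = tier.search(query)
        escalation.append({"tier": tier.name, "confidence": conf})
        if answer is not None and conf > best_conf:
            best_answer, best_conf, resolved_at = answer, conf, tier.name
        if best_conf >= min_confidence:
            break  # confident enough: do not touch the slower tiers
    return {
        "answer": best_answer,
        "confidence": best_conf,
        "resolved_at": resolved_at,
        "tiers_tried": len(escalation),
        "latency_ms": (time.perf_counter() - start) * 1000,
        "escalation": escalation,
    }


# Toy stand-ins for a cache tier and a full-text tier (hypothetical data).
session_cache = {"deploy key": "rotate monthly"}


def t0_cache(q: str):
    hit = session_cache.get(q)
    return (hit, 1.0) if hit is not None else (None, 0.0)


def t3_fulltext(q: str):
    return ("doc mentioning " + q, 0.7)


tiers = [Tier("T0:cache", t0_cache), Tier("T3:fts5", t3_fulltext)]
result = cascade("reranking", tiers)
print(result["resolved_at"], result["tiers_tried"])
```

A cache hit resolves at the first tier and never reaches the slower one, which is the point of ordering tiers by cost.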
Question routing: Queries starting with how/what/why/etc. skip keyword tiers and route directly to semantic search.
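One way such a router could be written is shown below. The word list and the function names are assumptions (the text above names only how/what/why and "etc."), so treat this as a sketch of the idea, not the shipped rule.

```python
import re

# Assumed question words; only how/what/why are named in the description.
QUESTION_WORDS = ("how", "what", "why", "when", "where", "who", "which")
_QUESTION_RE = re.compile(
    r"^\s*(?:" + "|".join(QUESTION_WORDS) + r")\b", re.IGNORECASE
)


def is_question(query: str) -> bool:
    """True when the query starts with a whole question word."""
    return _QUESTION_RE.match(query) is not None


def starting_tier(query: str) -> str:
    """Questions jump to semantic search (T5b); everything else starts at T0."""
    return "T5b" if is_question(query) else "T0"


print(starting_tier("How does reranking work?"))
print(starting_tier("adaptive alpha reranking"))
```

The `\b` word boundary keeps a query such as "however it works" from being misread as a question.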
Self-learning: The cascade records which tier resolves each query. After enough history, evolve() learns shortcuts (skipping directly to the winning tier) and marks consistently empty tiers to be skipped.
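The learning step might look something like the following. The record layout, the thresholds, and the name `evolve_sketch` are all hypothetical; the real `evolve()` may use different data and rules.

```python
from collections import Counter, defaultdict


def evolve_sketch(history, min_samples=3, min_attempts=20):
    """Derive shortcuts and skippable tiers from resolution history.

    Assumed record shape:
    {"query": str, "resolved_at": str or None, "tried": [tier names]}
    """
    wins = defaultdict(Counter)  # normalized query -> winning-tier counts
    attempts = Counter()         # tier -> times it was tried
    hits = Counter()             # tier -> times it resolved a query
    for rec in history:
        for tier in rec["tried"]:
            attempts[tier] += 1
        if rec["resolved_at"] is not None:
            hits[rec["resolved_at"]] += 1
            wins[rec["query"].lower().strip()][rec["resolved_at"]] += 1

    # Shortcut: a query that the same tier has answered often enough.
    shortcuts = {}
    for query, counter in wins.items():
        tier, count = counter.most_common(1)[0]
        if count >= min_samples:
            shortcuts[query] = tier

    # Skip: a tier tried many times that never produced a resolving answer.
    skip_tiers = sorted(
        t for t, n in attempts.items() if n >= min_attempts and hits[t] == 0
    )
    return {"shortcuts": shortcuts, "skip_tiers": skip_tiers}


history = [
    {"query": "deploy key", "resolved_at": "T3", "tried": ["T0", "T1", "T2", "T3"]}
] * 3
learned = evolve_sketch(history, min_samples=3, min_attempts=3)
print(learned)
```

The low thresholds in the example are only there to keep the toy history short; with so few samples a real system would not yet skip anything.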
Usage
Python API
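```python
from tools.memory.memory_cascade import recall, store

# Search across all tiers
result = recall("adaptive alpha reranking")
# → {"answer": {...}, "tier": "T5:lancedb", "latency_ms": 340, "confidence": 0.87}

# Auto-route a write to the right store
store("pattern", content="...", tags=["ml", "reranking"])
```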
CLI
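```bash
# Search
python memory_cascade.py recall "your query here"

# View tier resolution stats
python memory_cascade.py stats

# Run self-evolution (learn shortcuts + tier skips)
python memory_cascade.py evolve
```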
recall() Options
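```python
recall(
    query: str,
    min_confidence: float = 0.5,  # stop escalating at this threshold
    max_tier: int = 6             # 0 = T0 only, 6 = all tiers
)
```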
Returns:
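```json
{
  "answer": { "source": "...", "text": "..." },
  "confidence": 0.87,
  "latency_ms": 340.2,
  "tiers_tried": 3,
  "resolved_at": "T5:lancedb",
  "shortcut": null,
  "question_routed": false,
  "escalation": [...]
}
```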
Requirements
- Python 3.10+ (stdlib only for core cascade logic)
- Optional: `hive_commons` for LanceDB tiers (T5/T5b)
- Optional: `tools.memory.memory_router` for store() routing
The cascade degrades gracefully — if LanceDB is unavailable, it stops at grep (T4). All stdlib tiers (T0–T4) work with zero dependencies.
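A common way to get this kind of degradation is to probe for the optional module before enabling the tiers that need it. The sketch below assumes that approach; the source does not say how availability is actually detected, and `available_tiers` is a hypothetical helper.

```python
import importlib.util


def available_tiers(optional_module: str = "hive_commons") -> list[str]:
    """Return the tier names usable in this environment.

    The stdlib tiers are always listed; the LanceDB tiers are added only
    when the optional module can be found (assumed detection strategy).
    """
    tiers = ["T0", "T1", "T2", "T3", "T4"]
    if importlib.util.find_spec(optional_module) is not None:
        tiers += ["T5", "T5b"]
    return tiers


print(available_tiers())
```

Because `find_spec` only locates the module without importing it, the probe stays cheap even when the optional dependency is heavy.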
Architecture Notes
- Thread-safe: Session cache uses `threading.Lock`; stats writes use separate locks
- Cross-process safe: JSONL writes use OS-level file locking (`msvcrt` on Windows, `fcntl` on Unix)
- Confidence scoring: Term overlap × score × content richness → normalized 0–1
- Stats persistence: `knowledge/SYSTEM/cascade_stats.json` accumulates hit rates per tier
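Two of the notes above, confidence scoring and cross-process locking, can be sketched together. Everything here is an assumption made for illustration: the definitions of the three confidence factors, the 200-character saturation point, and the helper names `confidence` and `append_jsonl` are not taken from MidOS.

```python
import json
import os
import sys

if sys.platform == "win32":
    import msvcrt

    def _lock(f):
        f.seek(0)  # msvcrt locks from the current position, so pin byte 0
        msvcrt.locking(f.fileno(), msvcrt.LK_LOCK, 1)

    def _unlock(f):
        f.seek(0)
        msvcrt.locking(f.fileno(), msvcrt.LK_UNLCK, 1)
else:
    import fcntl

    def _lock(f):
        fcntl.flock(f.fileno(), fcntl.LOCK_EX)  # blocks until the lock is free

    def _unlock(f):
        fcntl.flock(f.fileno(), fcntl.LOCK_UN)


def append_jsonl(path: str, record: dict) -> None:
    """Append one JSON line while holding an exclusive OS-level lock."""
    with open(path, "a", encoding="utf-8") as f:
        _lock(f)
        try:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())  # push the line to disk before releasing
        finally:
            _unlock(f)


def confidence(query: str, text: str, raw_score: float) -> float:
    """Term overlap x score x content richness, each factor kept in 0..1."""
    q_terms = set(query.lower().split())
    t_terms = set(text.lower().split())
    if not q_terms or not t_terms:
        return 0.0
    overlap = len(q_terms & t_terms) / len(q_terms)
    richness = min(len(text) / 200.0, 1.0)  # assumed: saturates at 200 chars
    score = max(0.0, min(raw_score, 1.0))
    return overlap * score * richness
```

Since every factor is clamped to the 0–1 range, the product is already normalized and needs no further rescaling.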
Built with MidOS. 1 of 200+ skills. Full ecosystem at midos.dev/pro
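MidOS Memory Cascade
A self-tuning, auto-escalating search engine that tries each memory tier from fastest to slowest, stopping as soon as it finds a high-confidence answer.
What It Does
Instead of the agent deciding which storage layer to query, the cascade tries each tier automatically: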
| Tier | Storage | Latency | Strategy |
|---|---|---|---|
| T0 | In-memory session cache | <1ms | Exact + fuzzy key match |
| T1 | JSON state files | <5ms | Filename + key match |
| T2 | SQLite (pipeline_synergy.db) | <5ms | Structured SQL LIKE |
| T3 | SQLite FTS5 | <1ms | Full-text keyword on 22K rows |
| T4 | Grep over 46K chunks | ~3s | Brute-force ripgrep fallback |
| T5 | LanceDB keyword (BM25) | slow | 670K vector rows, no embeddings |
| T5b | LanceDB semantic | 3–30s | Embedding similarity, last resort |
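Question routing: Queries starting with how/what/why/etc. skip keyword tiers and route directly to semantic search.
Self-learning: The cascade records which tier resolves each query. After enough history, evolve() learns shortcuts (skipping directly to the winning tier) and marks consistently empty tiers to be skipped.
Usage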
Python API
```python
from tools.memory.memory_cascade import recall, store

# Search across all tiers
result = recall("adaptive alpha reranking")
# → {"answer": {...}, "tier": "T5:lancedb", "latency_ms": 340, "confidence": 0.87}

# Auto-route a write to the right store
store("pattern", content="...", tags=["ml", "reranking"])
```
CLI
```bash
# Search
python memory_cascade.py recall "your query here"

# View tier resolution stats
python memory_cascade.py stats

# Run self-evolution (learn shortcuts + tier skips)
python memory_cascade.py evolve
```
recall() Options
```python
recall(
    query: str,
    min_confidence: float = 0.5,  # stop escalating at this threshold
    max_tier: int = 6             # 0 = T0 only, 6 = all tiers
)
```
Returns:
```json
{
  "answer": { "source": "...", "text": "..." },
  "confidence": 0.87,
  "latency_ms": 340.2,
  "tiers_tried": 3,
  "resolved_at": "T5:lancedb",
  "shortcut": null,
  "question_routed": false,
  "escalation": [...]
}
```
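Requirements
- Python 3.10+ (stdlib only for core cascade logic)
- Optional: `hive_commons` for LanceDB tiers (T5/T5b)
- Optional: `tools.memory.memory_router` for store() routing
The cascade degrades gracefully: if LanceDB is unavailable, it stops at grep (T4). All stdlib tiers (T0–T4) work with zero dependencies.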
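Architecture Notes
- Thread-safe: Session cache uses `threading.Lock`; stats writes use separate locks
- Cross-process safe: JSONL writes use OS-level file locking (`msvcrt` on Windows, `fcntl` on Unix)
- Confidence scoring: Term overlap × score × content richness → normalized 0–1
- Stats persistence: `knowledge/SYSTEM/cascade_stats.json` accumulates hit rates per tier
Built with MidOS. 1 of 200+ skills. Full ecosystem at midos.dev/pro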