Multi-Agent Protocol v2

OpenClaw-native multi-agent protocol. Keep the good parts from v1:

- spec-first
- review gates
- retry with circuit breaker

Replace the brittle parts from v1:

- no fixed sessionKey memory contract
- no shared/blackboard.json
- no undeclared beads or git dependency
- no prompt-only state machine
- no LangGraph

Architecture

Use the stack below and do not silently swap layers:

1. SKILL.md

Defines protocol, roles, dependency expectations, and non-negotiable rules.

2. OpenProse

Owns orchestration flow and agent dispatch.

3. Lobster

Owns approval, pause/resume, and side-effect recovery templates.

4. task-store plugin

Owns authoritative task state via typed tools + SQLite event log.

5. ACP

Connects external coding harnesses such as Codex.

Source Of Truth

The source of truth is the task-store plugin, not prompts and not reviewer output.

- Canonical task phase lives in SQLite.
- Every phase change is an event.
- Reviewers append findings and verdicts.
- Reviewers do not finalize phase transitions.
- The orchestrator is the only actor that decides phase movement.
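As a rough illustration of "every phase change is an event", the sketch below keeps an append-only event table in SQLite and derives the current phase from the latest phase event. The table name, column names, and helper functions are hypothetical and are not the task-store plugin's actual schema or tools.

```python
import sqlite3

# Hypothetical schema: a single append-only table. The table and column
# names are illustrative, not the task-store plugin's real schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE task_events (
        seq     INTEGER PRIMARY KEY AUTOINCREMENT,
        task_id TEXT NOT NULL,
        actor   TEXT NOT NULL,
        kind    TEXT NOT NULL,  -- 'phase_changed' or 'review_appended'
        payload TEXT NOT NULL
    )
    """
)


def append_event(task_id: str, actor: str, kind: str, payload: str) -> None:
    """Append one event; existing rows are never updated or deleted."""
    with conn:
        conn.execute(
            "INSERT INTO task_events (task_id, actor, kind, payload) "
            "VALUES (?, ?, ?, ?)",
            (task_id, actor, kind, payload),
        )


def current_phase(task_id: str):
    """Derive the canonical phase from the latest phase_changed event."""
    row = conn.execute(
        "SELECT payload FROM task_events "
        "WHERE task_id = ? AND kind = 'phase_changed' "
        "ORDER BY seq DESC LIMIT 1",
        (task_id,),
    ).fetchone()
    return row[0] if row else None


# A review is recorded as evidence but does not move the phase.
append_event("task-1", "orchestrator", "phase_changed", "spec_gate")
append_event("task-1", "spec_reviewer", "review_appended", "changes_requested")
```

Because reviews and phase changes are different event kinds, a reviewer verdict can sit in the log without affecting what `current_phase` returns.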

Required Dependencies

Declare these dependencies explicitly in the skill or the workflow setup:

- OpenClaw runtime with OpenProse
- Lobster runtime for approval/resume
- task-store plugin enabled
- local SQLite availability
- ACP bridge when external harnesses are involved

Optional, but explicit when used:

- git
- browser/runtime plugins
- language-specific build tools

Do not assume:

- beads
- bd
- shared/blackboard.json
- persistent role memory through fixed sessions
- git worktree

Core Rules

1. Spec-first

No execution phase starts before a spec record exists in task-store.

Minimum spec payload:

- goal
- scope_in
- scope_out
- inputs
- outputs
- acceptance_criteria[]
- risks[]

If acceptance criteria are weak or missing, the orchestrator keeps the task in spec_draft.
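The completeness check can be sketched as follows. The field names and types, the `spec_problems` helper, and the word-count heuristic for a "weak" criterion are all assumptions made for illustration; the protocol itself does not define what counts as weak.

```python
from dataclasses import dataclass, field


@dataclass
class Spec:
    # Assumed field names and types for the minimum spec payload.
    goal: str
    scope_in: list[str]
    scope_out: list[str]
    inputs: list[str]
    outputs: list[str]
    acceptance_criteria: list[str] = field(default_factory=list)
    risks: list[str] = field(default_factory=list)


def spec_problems(spec: Spec) -> list[str]:
    """Return reasons the spec is not ready to leave spec_draft."""
    problems = []
    if not spec.goal.strip():
        problems.append("goal is empty")
    if not spec.outputs:
        problems.append("outputs is empty")
    if not spec.acceptance_criteria:
        problems.append("acceptance_criteria is missing")
    for i, criterion in enumerate(spec.acceptance_criteria):
        # Hypothetical heuristic: very short criteria are rarely testable.
        if len(criterion.split()) < 4:
            problems.append(f"acceptance_criteria[{i}] is too vague")
    return problems


def phase_after_spec_check(spec: Spec) -> str:
    """Keep the task in spec_draft until the spec has no known problems."""
    return "spec_draft" if spec_problems(spec) else "spec_review"
```

A real check would likely be stricter, but the shape is the point: the decision comes from stored, typed fields rather than from free-text judgment.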

2. Phase transitions are explicit

Use a stored phase enum. Recommended baseline:

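    spec_draft
    spec_review
    execution_ready
    executing
    spec_gate
    quality_gate
    awaiting_approval
    ready_to_resume
    completed
    failed
    circuit_open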

All transitions must be written through task_transition.
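One way to picture a stored phase enum with guarded transitions is sketched below. The `ALLOWED` table is an assumption derived from the per-phase rules in this document, and this `task_transition` function is a local stand-in, not the plugin tool's real signature.

```python
from enum import Enum


class Phase(str, Enum):
    SPEC_DRAFT = "spec_draft"
    SPEC_REVIEW = "spec_review"
    EXECUTION_READY = "execution_ready"
    EXECUTING = "executing"
    SPEC_GATE = "spec_gate"
    QUALITY_GATE = "quality_gate"
    AWAITING_APPROVAL = "awaiting_approval"
    READY_TO_RESUME = "ready_to_resume"
    COMPLETED = "completed"
    FAILED = "failed"
    CIRCUIT_OPEN = "circuit_open"


# Assumed transition table; terminal phases have no outgoing edges.
ALLOWED = {
    Phase.SPEC_DRAFT: {Phase.SPEC_REVIEW},
    Phase.SPEC_REVIEW: {Phase.SPEC_DRAFT, Phase.EXECUTION_READY},
    Phase.EXECUTION_READY: {Phase.EXECUTING},
    Phase.EXECUTING: {Phase.SPEC_GATE, Phase.CIRCUIT_OPEN},
    Phase.SPEC_GATE: {Phase.QUALITY_GATE, Phase.EXECUTION_READY, Phase.CIRCUIT_OPEN},
    Phase.QUALITY_GATE: {Phase.COMPLETED, Phase.EXECUTION_READY, Phase.AWAITING_APPROVAL},
    Phase.AWAITING_APPROVAL: {Phase.READY_TO_RESUME, Phase.FAILED},
    Phase.READY_TO_RESUME: {Phase.COMPLETED},
}


def task_transition(current: Phase, target: Phase, actor: str) -> Phase:
    """Validate a phase move; only the orchestrator may request one."""
    if actor != "orchestrator":
        raise PermissionError(f"{actor} may not change the phase")
    if target not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target
```

Keeping the table in code (or in the store) means an illegal move fails loudly instead of being silently accepted from prompt text.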

3. Reviewers are evidence producers

Reviewer output is evidence, not authority.

- Spec reviewer answers: "Does the artifact satisfy the spec?"
- Quality reviewer answers: "Is the implementation acceptable for maintainability and risk?"
- Reviewers write findings via task_append_review.
- The orchestrator reads review state and decides the next phase.
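The split between evidence and authority might look like the sketch below. The `Review` record, the in-memory `reviews` list, and the verdict-to-phase mapping are illustrative assumptions; the `task_append_review` function here is a local stand-in for the plugin tool, whose real shape is not shown in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Review:
    reviewer: str
    verdict: str  # illustrative values: "approved", "changes_requested"
    findings: tuple[str, ...]


# Stand-in for stored review records (the real store is the plugin).
reviews: list[Review] = []


def task_append_review(review: Review) -> None:
    """Reviewer side: append evidence only, never touch the phase."""
    reviews.append(review)


def orchestrator_decide(current_phase: str, stored: list[Review]) -> str:
    """Orchestrator side: read stored reviews and pick the next phase."""
    if current_phase != "spec_gate" or not stored:
        return current_phase
    latest = stored[-1].verdict
    if latest == "approved":
        return "quality_gate"
    if latest == "changes_requested":
        return "execution_ready"
    # Any other verdict is left to orchestrator policy; stay put here.
    return current_phase
```

Note that `task_append_review` has no return value that could be mistaken for a phase decision; the only function that names a next phase is the orchestrator's.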

4. Retry and circuit breaker are stored state

Retries are not tracked in free text.

- attempt counters live in SQLite
- retry reasons are evented
- circuit state is explicit

Recommended policy:

- attempt 1-2: retry same phase with bounded backoff
- attempt 3: optional stronger model/runtime
- attempt >= 4: circuit_open
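A minimal sketch of such a policy, assuming escalation at attempt 3 and an open circuit from attempt 4 onward. The action labels and the backoff parameters (`base`, `cap`) are hypothetical values chosen for illustration.

```python
def backoff_seconds(attempt: int, base: float = 2.0, cap: float = 30.0) -> float:
    """Exponential backoff bounded by cap (base and cap are assumed values)."""
    return min(cap, base * 2 ** (attempt - 1))


def retry_decision(attempt: int):
    """Map a 1-based attempt number to (action, delay_seconds)."""
    if attempt < 1:
        raise ValueError("attempt numbers start at 1")
    if attempt <= 2:
        return ("retry_same_phase", backoff_seconds(attempt))
    if attempt == 3:
        return ("retry_stronger_runtime", backoff_seconds(attempt))
    # Attempt 4 and beyond: stop automatic retries.
    return ("circuit_open", None)
```

In the protocol, the attempt number passed in would come from the stored attempt counter, and the returned decision would itself be written as an event.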

5. Side effects require Lobster gates

Any real-world effect should pass through Lobster:

- writing to external systems
- approvals
- irreversible file mutations outside the declared sandbox
- deployments
- notifications
- merges

Lobster pauses, requests approval, and resumes from persisted state.

6. ACP is the bridge for external harnesses

When using Codex or another external coding harness:

- launch work through ACP, not prompt-only relays
- pass task_id, attempt_id, workspace, and allowed capabilities
- capture external session metadata as non-authoritative references

Practical note inferred from the local OpenClaw installation: parent streaming features such
as streamTo are tied to runtime=acp, not generic subagent runtime. Design the workflow
accordingly.
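The dispatch inputs listed above can be modelled as a small typed record. The `AcpDispatch` class, its `allowed_capabilities` field name, and the JSON encoding are assumptions for illustration only; this is not the ACP wire format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AcpDispatch:
    # Hypothetical envelope for work handed to an external harness.
    task_id: str
    attempt_id: str
    workspace: str
    allowed_capabilities: tuple[str, ...]


def to_request(dispatch: AcpDispatch) -> str:
    """Serialize the dispatch so nothing is passed as free-form prompt text."""
    return json.dumps(asdict(dispatch), sort_keys=True)


request = to_request(
    AcpDispatch(
        task_id="task-1",
        attempt_id="attempt-2",
        workspace="/tmp/task-1",
        allowed_capabilities=("read", "write_workspace"),
    )
)
```

Any session identifier the harness returns would be stored alongside the attempt as a reference, never as the source of task state.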

Role Model

Orchestrator

The orchestrator:

- creates the task record
- validates spec completeness
- dispatches agents
- reads stored findings
- decides phase transitions
- triggers Lobster when approval or recovery is needed
- opens the circuit when retries are exhausted

The orchestrator does not become a passive message relay or free-form blackboard parser.

Executor

The executor may be:

- a local OpenClaw worker
- an ACP-backed external harness such as Codex
- a read-only research agent

Executor responsibilities:

- produce artifacts
- record attempt heartbeat/checkpoints through typed tools
- report structured outputs and evidence

Executor cannot finalize completed, failed, or gate transitions on its own.

Spec Reviewer

Reads the actual artifact and records one of:

- approved
- changes_requested
- blocked

Each verdict is accompanied by findings with file references or artifact references.

Quality Reviewer

Reads the actual artifact after spec gate passes and records:

- maintainability concerns
- test gaps
- safety or regression risk
- approval/rework recommendation

Lobster Approver / Recovery Actor

Lobster manages:

- approval prompts
- pause/resume after interruption
- resuming idempotent or compensating side-effect steps

Lobster does not own the business workflow phase. It only writes approval state and recovery
evidence back to task-store.

Minimal Lifecycle

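    task_create
      -> spec_review
      -> execution_ready
      -> executing
      -> spec_gate
      -> quality_gate
      -> awaiting_approval (only when side effects exist)
      -> ready_to_resume
      -> completed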

Failure branches:

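    executing -> retrying -> executing
    executing -> circuit_open
    spec_gate -> execution_ready
    quality_gate -> execution_ready
    awaiting_approval -> failed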

Protocol By Phase

`spec_draft`

- Create task in task-store.
- Persist full spec content or spec reference.
- Do not spawn builders yet.

`spec_review`

- Reviewer checks the spec itself for ambiguity and testability.
- Orchestrator either:
  - fixes the spec and stays in spec_draft, or
  - transitions to execution_ready

`execution_ready`

- Orchestrator chooses runtime:
  - local worker for low-side-effect or local tasks
  - ACP for Codex/external harness
- Orchestrator creates a new attempt record.

`executing`

- Executor works only against declared inputs/outputs.
- Checkpoints go through typed tools.
- Side effects are declared ahead of time as planned actions.

`spec_gate`

- Spec reviewer inspects produced artifact.
- Reviewer writes findings only.
- Orchestrator decides:
  - pass to quality_gate
  - rework back to execution_ready
  - open circuit if repeated mismatch indicates spec or implementation collapse

`quality_gate`

- Quality reviewer records findings only.
- Orchestrator decides:
  - completed
  - execution_ready
  - awaiting_approval

`awaiting_approval`

- Lobster requests human approval with structured context.
- Approved result becomes evidence in store.
- Orchestrator transitions to ready_to_resume.

`ready_to_resume`

- Lobster or orchestrator resumes the exact side-effect step using persisted idempotency data.
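Resuming "the exact side-effect step" can be sketched with a persisted idempotency key. The `side_effects` table, the key format, and the `run_once` helper are hypothetical and do not describe Lobster's actual mechanism.

```python
import sqlite3

# Hypothetical ledger of completed side effects, keyed by idempotency key.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE side_effects ("
    "idempotency_key TEXT PRIMARY KEY, status TEXT NOT NULL, result TEXT)"
)


def run_once(idempotency_key: str, action):
    """Run action unless a completed record already exists for this key."""
    row = conn.execute(
        "SELECT status, result FROM side_effects WHERE idempotency_key = ?",
        (idempotency_key,),
    ).fetchone()
    if row is not None and row[0] == "done":
        return row[1]  # already applied: replay the stored result
    result = action()
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO side_effects "
            "(idempotency_key, status, result) VALUES (?, 'done', ?)",
            (idempotency_key, result),
        )
    return result
```

This only guards against re-running a step that was recorded as done. A crash between `action()` and the insert would still repeat the effect, so the external system should ideally accept the same key as well.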

`circuit_open`

- Stop automatic retries.
- Surface:
  - failure summary
  - attempts
  - last known good artifact
  - unblock options

What Goes In Storage

The task-store plugin should persist at least:

- task header
- current phase
- spec payload or reference
- review records
- attempt records
- artifact records
- approval records
- event log
- optional external session references

The plugin storage is authoritative. Prompt text is not.

OpenProse Guidance

The .prose workflow should be minimal and boring:

- read state
- branch on typed state
- dispatch one actor
- store result
- decide next phase

Do not encode business state only in the prose graph. The graph coordinates. The plugin stores.

Read workflows/openclaw-native-v2.prose when wiring the
workflow.

Lobster Guidance

Use Lobster only where it adds hard guarantees:

- approval request with resumable context
- idempotent recovery after interruption
- controlled side-effect replay

Read lobster/approval-recovery.template.yaml when a
task contains side effects or human approval.

Plugin Guidance

Use the task-store plugin as the only write path for protocol state.

Read references/task-store-plugin.md when:

- implementing the plugin
- validating tool shapes
- deciding schema changes

Permissions

Use least privilege. The matrix lives in
references/agent-permissions.md.

Key rule:

- executors can write artifacts and attempts
- reviewers can write findings
- only the orchestrator can move the phase
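The key rule can be expressed as a small role-to-tool table. This is not the matrix from references/agent-permissions.md; apart from task_transition and task_append_review, the tool names below are hypothetical.

```python
# Assumed role-to-tool grants, following the key rule only.
PERMISSIONS = {
    "orchestrator": {"task_create", "task_transition"},
    "executor": {"artifact_write", "attempt_checkpoint"},
    "reviewer": {"task_append_review"},
}


def authorize(role: str, tool: str) -> bool:
    """Raise unless the role has been granted the tool."""
    if tool not in PERMISSIONS.get(role, set()):
        raise PermissionError(f"{role} is not allowed to call {tool}")
    return True
```

Checking this at the tool boundary means a misbehaving prompt cannot move the phase simply by asking to.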

Migration Rules From v1

Read references/migration.md before replacing an existing v1 setup.

Summary:

- replace fixed session identity with run-scoped attempt_id and optional INLINECODE66
- replace blackboard with typed storage
- replace beads/git buses with plugin tools
- replace reviewer-led state changes with orchestrator-led transitions

Quick Start

1. Enable task-store.
2. Create a task with a full spec.
3. Run the OpenProse workflow.
4. Route external coding work through ACP.
5. Use Lobster only for approval/recovery steps.
6. Let the orchestrator decide every phase transition from stored evidence.

Anti-Patterns

Do not do any of the following:

- use fixed role sessionKey as the memory backbone
- store canonical state in shared/blackboard.json
- let reviewer verdict directly close the task
- let executor mutate final phase
- assume git or beads exists without declaring it
- recover from interruption by guessing from prompt history
- add LangGraph just to simulate a state machine already held in SQLite
