Restart Task Recovery

Use this workflow to maximize the chance of a successful recovery after an OpenClaw restart.

1) Pre-restart checkpoint (required)

Before any gateway.config.patch, gateway.config.apply, gateway.update.run, or gateway.restart:

1. List active sessions that may be impacted (sessions_list).
2. For each active work session, capture the latest context (sessions_history, limit 20-50).
3. Write a compact checkpoint file at:

- memory/restart-checkpoints/<YYYY-MM-DD>/<HHmmss>.md

4. Include per session:

- sessionKey / label / agent
- goal
- last completed step
- next exact step
- blocked dependencies (if any)
- a ready-to-send resume message (1-2 lines)

Keep the checkpoint concise and executable.
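As a rough illustration of the checkpoint layout described above, the following minimal sketch builds the timestamped path and renders one session entry. The function names and the dictionary keys (`label`, `agent`, `resumeMessage`, and so on) are assumptions made for this example; this is not the code of any script shipped with the skill.

```python
from datetime import datetime
from pathlib import PurePosixPath


def checkpoint_path(now: datetime, root: str = "memory/restart-checkpoints") -> str:
    """Build <root>/<YYYY-MM-DD>/<HHmmss>.md for the given timestamp."""
    day = now.strftime("%Y-%m-%d")
    stamp = now.strftime("%H%M%S")
    return str(PurePosixPath(root) / day / f"{stamp}.md")


def render_session(session: dict) -> str:
    """Render the per-session fields as a short Markdown block (hypothetical layout)."""
    lines = [
        f"## {session['sessionKey']} / {session.get('label', '-')} / {session.get('agent', '-')}",
        f"- goal: {session['goal']}",
        f"- last completed step: {session['lastDone']}",
        f"- next exact step: {session['nextStep']}",
        f"- blocked dependencies: {session.get('blockers') or 'none'}",
        f"- resume message: {session['resumeMessage']}",
    ]
    return "\n".join(lines)


example = {
    "sessionKey": "agent:engineer:main",
    "label": "regression",
    "agent": "engineer",
    "goal": "Finish regression verification",
    "lastDone": "Retention cases passed",
    "nextStep": "Publish summary",
    "blockers": "",
    "resumeMessage": "Continue with the final summary.",
}
print(checkpoint_path(datetime(2026, 3, 9, 16, 22, 0)))
print(render_session(example))
```

Keeping each entry to a handful of fixed fields is one way to keep the file short enough to act on directly after the restart.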

2) Restart with explicit recovery intent

When calling a gateway restart or config change, set note to include the recovery intent, e.g.:

- “配置已更新并重启；将按 checkpoint 恢复中断任务。” (“Config updated and restarted; interrupted tasks will be resumed from the checkpoint.”)

3) Post-restart recovery sweep

After restart:

1. Re-list sessions (sessions_list) and compare against checkpoint.
2. For each interrupted/idle target session, send a resume message via sessions_send:

- “Continue where you left off. Last completed: . Next: . If previous tool call failed, retry from .”

3. Do not poll in tight loops. Check on-demand only.
4. Summarize the recovery status to the user:

- recovered sessions
- still blocked sessions
- manual follow-up needed
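The sweep above can be sketched as a single pass over the checkpoint entries, assuming a fresh sessions_list result has already been reduced to a mapping from session key to status. The status strings, the `retryFrom` key, and the template placeholder names are assumptions for illustration; the source does not name them.

```python
# Resume message modeled on the template above; the placeholder names are assumed.
RESUME_TEMPLATE = (
    "Continue where you left off. Last completed: {last_done}. "
    "Next: {next_step}. If previous tool call failed, retry from {retry_from}."
)


def plan_sweep(checkpoint: list, current_status: dict) -> dict:
    """Classify checkpointed sessions once, without polling.

    checkpoint: entries with sessionKey, lastDone, nextStep, blockers.
    current_status: sessionKey -> status string (hypothetical values).
    """
    plan = {"resume": [], "blocked": [], "manual": []}
    for entry in checkpoint:
        key = entry["sessionKey"]
        status = current_status.get(key)
        if status is None:
            # Session no longer listed: needs manual follow-up.
            plan["manual"].append(key)
        elif entry.get("blockers"):
            # Still waiting on a dependency: report, do not resume.
            plan["blocked"].append(key)
        elif status in ("interrupted", "idle"):
            message = RESUME_TEMPLATE.format(
                last_done=entry["lastDone"],
                next_step=entry["nextStep"],
                retry_from=entry.get("retryFrom", entry["nextStep"]),
            )
            plan["resume"].append({"sessionKey": key, "message": message})
        # Sessions that are active and progressing are left alone.
    return plan
```

The three lists map directly onto the summary categories, so the final report to the user can be produced from the same structure.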

4) Idempotent task design rules

When resuming tasks, enforce:

1. Re-run-safe steps (idempotency key / upsert / duplicate-safe writes).
2. Small step boundaries with explicit “done markers”.
3. External writes batched, not one-by-one loops.
4. On uncertainty, verify state first, then continue.
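A minimal sketch of these rules, assuming done markers are kept in a set that would be persisted between runs and that the external system offers a batched upsert. `Store` is an in-memory stand-in invented for this example, not a real API.

```python
class Store:
    """In-memory stand-in for an external system with a batched upsert."""

    def __init__(self):
        self.rows = {}
        self.write_calls = 0

    def upsert_many(self, rows: dict) -> None:
        # Keyed writes: repeating the same batch leaves the same final state.
        self.write_calls += 1
        self.rows.update(rows)


def run_steps(steps: list, done: set) -> list:
    """Run (step_id, action) pairs, skipping steps that already have a done marker."""
    executed = []
    for step_id, action in steps:
        if step_id in done:
            # Verify state first: the marker says this step already completed.
            continue
        action()
        done.add(step_id)  # Record the marker only after the step succeeds.
        executed.append(step_id)
    return executed
```

With this shape, a resumed session can replay its whole step list after a restart: completed steps are skipped, and a step that failed midway is retried from its own boundary.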

5) V2 automation helper

Use script: scripts/build_checkpoint.py to generate checkpoint markdown from structured JSON.

Example:

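```bash
cat session-snapshot.json | python3 scripts/build_checkpoint.py memory/restart-checkpoints/$(date +%F)/$(date +%H%M%S).md
```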

Expected stdin JSON shape:

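```json
{
  "sessions": [
    {
      "sessionKey": "agent:engineer:main",
      "agentId": "engineer",
      "goal": "Finish regression verification",
      "lastDone": "401 / idempotency / timezone / retention cases passed",
      "nextStep": "Publish final acceptance summary",
      "blockers": "None"
    }
  ]
}
```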

6) V3 resume-plan automation

Use script: scripts/generate_resume_plan.py to parse the latest checkpoint and produce a structured resume plan.

Example:

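```bash
python3 scripts/generate_resume_plan.py memory/restart-checkpoints/2026-03-09/162200.md /tmp/resume-plan.json
```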

Then send each items[].resumeMessage to items[].sessionKey via sessions_send.

Rules:

- Send once per session (no loop polling).
- If a session is already active and progressing, skip resend.
- After sends, post one concise recovery summary to user.
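The send-once rule can be enforced with a small filter over the resume plan. This is a sketch under the assumption that the caller tracks which session keys were already messaged and which are currently progressing; both sets and the function name are hypothetical.

```python
def select_sends(items: list, already_sent: set, progressing: set) -> list:
    """Return the plan items that should receive a resume message now.

    items: entries with sessionKey and resumeMessage, as in the resume plan.
    already_sent: session keys messaged earlier (updated in place).
    progressing: session keys that are active and making progress.
    """
    sends = []
    for item in items:
        key = item["sessionKey"]
        if key in already_sent or key in progressing:
            continue  # Never resend, and never interrupt a working session.
        already_sent.add(key)
        sends.append(item)
    return sends
```

Because `already_sent` is updated as items are selected, a duplicate entry for the same session within one plan is also dropped.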

7) V4 one-click recovery payload generator

Use script: scripts/recover_from_latest_checkpoint.py.

It auto-selects the latest checkpoint file and emits a ready JSON payload list for sessions_send calls.

Examples:

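```bash
# Use the latest checkpoint automatically
python3 scripts/recover_from_latest_checkpoint.py > /tmp/recover-actions.json

# Use a specific checkpoint
python3 scripts/recover_from_latest_checkpoint.py memory/restart-checkpoints/2026-03-09/162200.md > /tmp/recover-actions.json
```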

Execution guidance:

- Read /tmp/recover-actions.json
- Execute each actions[] item with sessions_send
- Post one concise summary to user

8) V5 pre-resume verifier + manual confirmation gate

Use script: scripts/pre_resume_verify.py to score resume actions before sending.

Examples:

```bash
python3 scripts/pre_resume_verify.py /tmp/recover-actions.json /tmp/recover-verified.json
```

Behavior:

- Marks each action as risk=normal|high
- high risk actions are set to decision=hold and requiresManualConfirm=true
- Only send decision=send automatically
- Ask user confirmation before executing held actions

Recommended execution flow:

1. Generate actions with V4
2. Verify with V5
3. Send all decision=send
4. Present decision=hold list to user for explicit confirmation
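To make the gate concrete, here is a minimal sketch of how an action could be annotated with risk, decision, and requiresManualConfirm. How the real verifier scores risk is not described here, so the keyword heuristic, the `message` field name, and the function name are all assumptions for illustration only.

```python
# Hypothetical hints that a resume message touches something hard to undo.
HIGH_RISK_HINTS = ("delete", "drop", "deploy", "payment")


def verify_action(action: dict) -> dict:
    """Return a copy of the action annotated with risk and decision fields."""
    text = action.get("message", "").lower()
    is_high = any(hint in text for hint in HIGH_RISK_HINTS)
    return {
        **action,
        "risk": "high" if is_high else "normal",
        "decision": "hold" if is_high else "send",
        "requiresManualConfirm": is_high,
    }
```

Whatever scoring is used, the useful property is that the output carries an explicit decision per action, so the sending step never has to re-derive risk.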

9) V6 execution-plan generator (auto-send safe items)

Use script: scripts/execute_verified_recovery.py with V5 output.

Example:

```bash
python3 scripts/execute_verified_recovery.py /tmp/recover-verified.json > /tmp/recover-exec.json
```

Behavior:

- Emits sendActions[] for auto-safe resumes (decision=send)
- Emits holdForManualConfirm[] for risky resumes (decision=hold)

Execution:

1. Execute all sendActions[] with sessions_send
2. Ask user to confirm holdForManualConfirm[]
3. Execute confirmed held items
4. Post concise summary
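The split into the two output lists can be sketched as a simple partition over verified actions. Treating any action without an explicit decision=send as held is a conservative choice made for this example, not documented behavior of the script.

```python
def build_execution_plan(verified: list) -> dict:
    """Partition verified actions into auto-send and manual-confirm lists."""
    plan = {"sendActions": [], "holdForManualConfirm": []}
    for action in verified:
        if action.get("decision") == "send":
            plan["sendActions"].append(action)
        else:
            # Missing or unknown decisions are held rather than sent.
            plan["holdForManualConfirm"].append(action)
    return plan
```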

10) Message templates

Read and use: references/templates.md

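Restart Task Recovery

Use this workflow to maximize the chance of a successful recovery after an OpenClaw restart.

1) Pre-restart checkpoint (required)

Before any gateway.config.patch, gateway.config.apply, gateway.update.run, or gateway.restart:

1. List active sessions that may be impacted (sessions_list).
2. For each active work session, capture the latest context (sessions_history, limit 20-50).
3. Write a compact checkpoint file at:

- memory/restart-checkpoints/<YYYY-MM-DD>/<HHmmss>.md

4. Include per session:

- sessionKey / label / agent
- goal
- last completed step
- next exact step
- blocked dependencies (if any)
- a ready-to-send resume message (1-2 lines)

Keep the checkpoint concise and executable.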

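2) Restart with explicit recovery intent

When calling a gateway restart or config change, set note to include the recovery intent, e.g.:

- “Config updated and restarted; interrupted tasks will be resumed from the checkpoint.”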

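3) Post-restart recovery sweep

After restart:

1. Re-list sessions (sessions_list) and compare against the checkpoint.
2. For each interrupted/idle target session, send a resume message via sessions_send:

- “Continue where you left off. Last completed: . Next: . If previous tool call failed, retry from .”

3. Do not poll in tight loops. Check on-demand only.
4. Summarize the recovery status to the user:

- recovered sessions
- still blocked sessions
- manual follow-up needed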

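4) Idempotent task design rules

When resuming tasks, enforce:

1. Re-run-safe steps (idempotency key / upsert / duplicate-safe writes).
2. Small step boundaries with explicit “done markers”.
3. External writes batched, not one-by-one loops.
4. On uncertainty, verify state first, then continue.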

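5) V2 automation helper

Use script: scripts/build_checkpoint.py to generate checkpoint markdown from structured JSON.

Example:

```bash
cat session-snapshot.json | python3 scripts/build_checkpoint.py memory/restart-checkpoints/$(date +%F)/$(date +%H%M%S).md
```

Expected stdin JSON shape:

```json
{
  "sessions": [
    {
      "sessionKey": "agent:engineer:main",
      "agentId": "engineer",
      "goal": "Finish regression verification",
      "lastDone": "401 / idempotency / timezone / retention cases passed",
      "nextStep": "Publish final acceptance summary",
      "blockers": "None"
    }
  ]
}
```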

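6) V3 resume-plan automation

Use script: scripts/generate_resume_plan.py to parse the latest checkpoint and produce a structured resume plan.

Example:

```bash
python3 scripts/generate_resume_plan.py memory/restart-checkpoints/2026-03-09/162200.md /tmp/resume-plan.json
```

Then send each items[].resumeMessage to items[].sessionKey via sessions_send.

Rules:

- Send once per session (no loop polling).
- If a session is already active and progressing, skip resend.
- After sends, post one concise recovery summary to user.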

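7) V4 one-click recovery payload generator

Use script: scripts/recover_from_latest_checkpoint.py.

It auto-selects the latest checkpoint file and emits a ready JSON payload list for sessions_send calls.

Examples:

```bash
# Use the latest checkpoint automatically
python3 scripts/recover_from_latest_checkpoint.py > /tmp/recover-actions.json

# Use a specific checkpoint
python3 scripts/recover_from_latest_checkpoint.py memory/restart-checkpoints/2026-03-09/162200.md > /tmp/recover-actions.json
```

Execution guidance:

- Read /tmp/recover-actions.json
- Execute each actions[] item with sessions_send
- Post one concise summary to user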

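8) V5 pre-resume verifier + manual confirmation gate

Use script: scripts/pre_resume_verify.py to score resume actions before sending.

Examples:

```bash
python3 scripts/pre_resume_verify.py /tmp/recover-actions.json /tmp/recover-verified.json
```

Behavior:

- Marks each action as risk=normal|high
- high risk actions are set to decision=hold and requiresManualConfirm=true
- Only send decision=send automatically
- Ask user confirmation before executing held actions

Recommended execution flow:

1. Generate actions with V4
2. Verify with V5
3. Send all decision=send
4. Present decision=hold list to user for explicit confirmation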

9) V6 execution-plan generator (auto-send safe items)

Use script: scripts/execute_verified_recovery.py with V5 output.

Example:

```bash
python3 scripts/execute_verified_recovery.py /tmp/recover-verified.json > /tmp/recover-exec.json
```

Behavior:

- Emits sendActions[] for auto-safe resumes (decision=send)
- Emits holdForManualConfirm[] for risky resumes (decision=hold)

Execution:

1. Execute all sendActions[] with sessions_send
2. Ask user to confirm holdForManualConfirm[]
3. Execute confirmed held items
4. Post concise summary

10) Message templates

Read and use: references/templates.md
