OpenClaw Safe Upgrade
A single atomic command. It rolls back automatically on any critical failure and survives the gateway restart it triggers.
Script
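    # From an agent session: always set the escape flag (ensures the script survives the gateway restart)
    _UPGRADE_FORCE_ESCAPE=1 bash skills/upgrade/scripts/safe-upgrade.sh

    # Force the upgrade even when already on the latest version
    _UPGRADE_FORCE_ESCAPE=1 bash skills/upgrade/scripts/safe-upgrade.sh --force

    # Safe read-only check (no escape needed)
    bash skills/upgrade/scripts/safe-upgrade.sh --check      # pre-flight only
    bash skills/upgrade/scripts/safe-upgrade.sh --rollback   # manual rollback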
What Happens (one command, 10 steps)
1. Cgroup escape: re-execs via systemd-run --user --scope so that stopping the gateway can't kill the script
2. Pre-flight: version check, disk space, breaking changes
3. Backup: installation tarball, config, cron jobs, acpx customizations
4. Upgrade: npm i -g openclaw@latest
5. Restore acpx config (ACP agent customizations survive upgrades)
6. Gateway restart (process-isolated: stop + start, survives the script's own lifecycle)
7. Wait for gateway health (configurable timeout) → auto-rollback if it fails
8. Wait for WhatsApp reconnect (non-fatal timeout)
9. Verify: correct version + cron count preserved → auto-rollback if it fails
10. Record result → optional golden snapshot → clean up backup
If any critical check fails, the script rolls back automatically, restoring the install, config, crons, and acpx. A trap handler catches unexpected exits during critical phases.
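The health wait and the trap handler described above can be pictured with a minimal sketch. It assumes a probe command that exits 0 once the gateway is healthy. The names wait_for_health, rollback, on_exit, and critical_health_phase are hypothetical and are not taken from safe-upgrade.sh.

```bash
# Sketch only: every name here is hypothetical, not taken from safe-upgrade.sh.

CRITICAL_PHASE=0
ROLLED_BACK=0

# Poll a probe command until it succeeds or TIMEOUT seconds have passed.
# Usage: wait_for_health TIMEOUT PROBE [ARGS...]
wait_for_health() {
  timeout_s=$1
  shift
  deadline=$(( $(date +%s) + timeout_s ))
  until "$@"; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep 1
  done
  return 0
}

# Stand-in for the real rollback (restore install, config, crons, acpx).
rollback() {
  echo "rolling back: $1" >&2
  ROLLED_BACK=1
}

# EXIT trap body: roll back only if the script dies inside a critical phase.
on_exit() {
  if [ "$CRITICAL_PHASE" -eq 1 ]; then
    rollback "unexpected exit during critical phase"
  fi
}

# Run a health wait as a critical phase; roll back if it times out.
critical_health_phase() {
  CRITICAL_PHASE=1
  if wait_for_health "$@"; then
    CRITICAL_PHASE=0
    return 0
  fi
  rollback "gateway did not become healthy in time"
  CRITICAL_PHASE=0
  return 1
}
```

A script built this way would presumably register the handler with trap on_exit EXIT before entering a critical phase, then call critical_health_phase with its configured timeout and its own probe command.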
Agent Workflow
1. Run --check first and review the output with the user.
2. Inform the user: "Launching upgrade. I'll go offline during the gateway restart."
3. Run the upgrade:
   _UPGRADE_FORCE_ESCAPE=1 bash skills/upgrade/scripts/safe-upgrade.sh
   Do NOT pipe the output (no | tee, no 2>&1 | cat). The script writes to /tmp/upgrade-live.log.
4. The current session will end when the gateway restarts. This is expected.
5. After the restart, the new session checks:
   - ~/.openclaw/upgrade-result.json for status
   - /tmp/upgrade-live.log for live output
   - success: report to the user and update any version references
   - rolled_back: tell the user what went wrong (the reason is in the result file)
   - No result file, but a backup exists at ~/.openclaw/upgrade-backups/current/: the script was killed, so run --rollback
6. The full forensic log is at ~/.openclaw/upgrade-last.log.
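The post-restart checks above amount to a small decision table. The sketch below is one way to express it, assuming the result file is JSON with a string "status" field. The function name triage_upgrade and the action labels it prints are hypothetical; the paths default to those listed above.

```bash
# Sketch only: triage_upgrade and its output labels are hypothetical.

# Print the next action for the new session.
# Usage: triage_upgrade [RESULT_FILE] [BACKUP_DIR]
triage_upgrade() {
  result_file=${1:-"$HOME/.openclaw/upgrade-result.json"}
  backup_dir=${2:-"$HOME/.openclaw/upgrade-backups/current"}

  if [ -f "$result_file" ]; then
    # Naive extraction of the "status" string; jq would be more robust if available.
    status=$(sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$result_file" | head -n 1)
    case "$status" in
      success)     echo "report-success" ;;
      rolled_back) echo "explain-rollback" ;;
      *)           echo "inspect-result:${status:-unknown}" ;;
    esac
  elif [ -d "$backup_dir" ]; then
    # No result file but a leftover backup: the script was killed mid-run.
    echo "run-rollback"
  else
    echo "nothing-to-do"
  fi
}
```

Any status other than success or rolled_back is passed through for manual inspection rather than guessed at.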
What Gets Backed Up
- OpenClaw installation tarball
- Config (openclaw.json)
- Cron jobs (jobs.json)
- acpx user config (~/.acpx/config.json) if present
- Metadata (from/to version, timestamp, cron count)
Backup location: ~/.openclaw/upgrade-backups/current/
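As an illustration of the backup contents listed above, here is a minimal sketch. The function name backup_state, the file names used inside the backup directory, and the key=value metadata format are all assumptions; the real script may lay things out differently.

```bash
# Sketch only: function name, backup file names, and metadata format are hypothetical.

# Usage: backup_state INSTALL_DIR CONFIG_FILE CRON_FILE ACPX_FILE DEST FROM TO CRON_COUNT
backup_state() {
  install_dir=$1 config_file=$2 cron_file=$3 acpx_file=$4
  dest=$5 from_ver=$6 to_ver=$7 cron_count=$8

  mkdir -p "$dest" || return 1
  # Installation tarball, archived relative to the install directory.
  tar -czf "$dest/install.tar.gz" -C "$install_dir" . || return 1
  cp "$config_file" "$dest/openclaw.json" || return 1
  cp "$cron_file" "$dest/jobs.json" || return 1
  # The acpx config is optional: copy it only when it exists.
  if [ -f "$acpx_file" ]; then
    cp "$acpx_file" "$dest/acpx-config.json" || return 1
  fi
  # Metadata lets a later rollback or verify step confirm what was saved.
  {
    echo "from_version=$from_ver"
    echo "to_version=$to_ver"
    echo "timestamp=$(date -u +%Y-%m-%dT%H:%M:%SZ)"
    echo "cron_count=$cron_count"
  } > "$dest/metadata" || return 1
}
```

Recording the cron count at backup time is what makes a later "cron count preserved" comparison possible.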
Result File
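~/.openclaw/upgrade-result.json:
    {
      "status": "success|rolled_back|rollback_failed|no_change|blocked",
      "from_version": "2026.3.2",
      "to_version": "2026.3.7",
      "message": "...",
      "timestamp": "...",
      "log": "~/.openclaw/upgrade-last.log"
    }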
Why Cgroup Escape?
OpenClaw runs as a systemd service. When an agent runs this script, the script is a child process inside the service's cgroup. systemctl stop signals ALL processes in that cgroup, including the upgrade script: with systemd's default kill settings that is SIGTERM first, then SIGKILL once the stop timeout expires. SIGKILL cannot be caught, so no trap handler fires.
The script detects this and re-execs itself via systemd-run --user --scope into its own transient systemd scope. The parent process exits immediately, leaving no pipes, no tee, and no connections back to the gateway cgroup. This is why piping output is forbidden.
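The re-exec can be sketched roughly as follows. This is an assumption about the shape of the mechanism, not the script's actual code: _UPGRADE_ESCAPED is a hypothetical marker variable, should_escape and escape_cgroup are hypothetical helpers, and only _UPGRADE_FORCE_ESCAPE and the systemd-run --user --scope invocation come from the text above.

```bash
# Sketch only: _UPGRADE_ESCAPED and both function names are hypothetical;
# _UPGRADE_FORCE_ESCAPE is the flag documented in this guide.

# Succeed when the script should re-exec itself into its own scope.
should_escape() {
  [ "${_UPGRADE_FORCE_ESCAPE:-0}" = "1" ] && [ "${_UPGRADE_ESCAPED:-0}" != "1" ]
}

# Re-launch SCRIPT in a transient user scope, detached from the caller.
# Usage: escape_cgroup LOG_FILE SCRIPT [ARGS...]
escape_cgroup() {
  log_file=$1
  shift
  # Mark the child so it does not try to escape a second time.
  export _UPGRADE_ESCAPED=1
  # No pipes back to the caller: output goes to a file, stdin is closed,
  # and the launch is backgrounded so the parent can exit immediately.
  nohup systemd-run --user --scope --quiet bash "$@" >"$log_file" 2>&1 </dev/null &
}

# Intended use near the top of a script (left as comments so that sourcing
# this sketch has no side effects):
#   if should_escape; then
#     escape_cgroup /tmp/upgrade-live.log "$0" "$@"
#     exit 0
#   fi
```

Because the child's output is redirected to a file rather than a pipe, nothing ties it to a process that dies with the gateway, which is consistent with the rule against piping output.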
Important Notes
- Never run gateway update.run directly; always use this script
- Always set _UPGRADE_FORCE_ESCAPE=1 when running from an agent session
- acpx customizations are preserved automatically across upgrades
- Rollback restores the EXACT previous state: install + config + crons + acpx
- --check is safe to run anytime and changes nothing
- The script auto-detects the gateway port from config (no hardcoded defaults)
- Optional hooks: if golden-snapshot.sh or service-quick-check.py exist in your workspace, they're used; otherwise they are silently skipped
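The "use it if present, otherwise skip silently" behavior of the optional hooks could look like the sketch below. run_optional_hook is a hypothetical helper, and the choice of interpreter by file extension is an assumption; the guide does not say where in the workspace the hooks live, so the directory is passed in.

```bash
# Sketch only: run_optional_hook is a hypothetical helper.

# Run a hook if the file exists in the workspace; otherwise skip without output.
# Usage: run_optional_hook WORKSPACE_DIR HOOK_FILE [ARGS...]
run_optional_hook() {
  workspace=$1 hook=$2
  shift 2
  path="$workspace/$hook"
  # Absent hook: return success and print nothing.
  [ -f "$path" ] || return 0
  case "$hook" in
    *.sh) bash "$path" "$@" ;;
    *.py) python3 "$path" "$@" ;;
    *)    "$path" "$@" ;;
  esac
}
```

Returning 0 for a missing hook keeps optional tooling from ever turning into an upgrade failure.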
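OpenClaw Safe Upgrade
A single atomic command. It rolls back automatically on any critical failure and survives the gateway restart it triggers.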
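Script
    # From an agent session: always set the escape flag (ensures the script survives the gateway restart)
    _UPGRADE_FORCE_ESCAPE=1 bash skills/upgrade/scripts/safe-upgrade.sh

    # Force the upgrade even when already on the latest version
    _UPGRADE_FORCE_ESCAPE=1 bash skills/upgrade/scripts/safe-upgrade.sh --force

    # Safe read-only check (no escape needed)
    bash skills/upgrade/scripts/safe-upgrade.sh --check      # pre-flight only
    bash skills/upgrade/scripts/safe-upgrade.sh --rollback   # manual rollback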
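What Happens (one command, 10 steps)
1. Cgroup escape: re-execs via systemd-run --user --scope so that stopping the gateway can't kill the script
2. Pre-flight: version check, disk space, breaking changes
3. Backup: installation tarball, config, cron jobs, acpx customizations
4. Upgrade: npm i -g openclaw@latest
5. Restore acpx config (ACP agent customizations survive upgrades)
6. Gateway restart (process-isolated: stop + start, unaffected by the script's own lifecycle)
7. Wait for gateway health (configurable timeout) → auto-rollback if it fails
8. Wait for WhatsApp reconnect (non-fatal timeout)
9. Verify: correct version + cron count preserved → auto-rollback if it fails
10. Record result → optional golden snapshot → clean up backup
If any critical check fails, the script rolls back automatically, restoring the install, config, crons, and acpx. A trap handler catches unexpected exits during critical phases.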
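Agent Workflow
1. Run --check first and review the output with the user.
2. Inform the user: "Launching upgrade. I'll go offline during the gateway restart."
3. Run the upgrade:
   _UPGRADE_FORCE_ESCAPE=1 bash skills/upgrade/scripts/safe-upgrade.sh
   Do NOT pipe the output (no | tee, no 2>&1 | cat). The script writes to /tmp/upgrade-live.log.
4. The current session will end when the gateway restarts. This is expected.
5. After the restart, the new session checks:
   - ~/.openclaw/upgrade-result.json for status
   - /tmp/upgrade-live.log for live output
   - success: report to the user and update any version references
   - rolled_back: tell the user what went wrong (the reason is in the result file)
   - No result file, but a backup exists at ~/.openclaw/upgrade-backups/current/: the script was killed, so run --rollback
6. The full forensic log is at ~/.openclaw/upgrade-last.log.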
What Gets Backed Up
- OpenClaw installation tarball
- Config (openclaw.json)
- Cron jobs (jobs.json)
- acpx user config (~/.acpx/config.json) if present
- Metadata (from/to version, timestamp, cron count)
Backup location: ~/.openclaw/upgrade-backups/current/
Result File
~/.openclaw/upgrade-result.json:
    {
      "status": "success|rolled_back|rollback_failed|no_change|blocked",
      "from_version": "2026.3.2",
      "to_version": "2026.3.7",
      "message": "...",
      "timestamp": "...",
      "log": "~/.openclaw/upgrade-last.log"
    }
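Why Cgroup Escape?
OpenClaw runs as a systemd service. When an agent runs this script, the script is a child process inside the service's cgroup. systemctl stop signals ALL processes in that cgroup, including the upgrade script: with systemd's default kill settings that is SIGTERM first, then SIGKILL once the stop timeout expires. SIGKILL cannot be caught, so no trap handler fires.
The script detects this and re-execs itself via systemd-run --user --scope into its own transient systemd scope. The parent process exits immediately, leaving no pipes, no tee, and no connections back to the gateway cgroup. This is why piping output is forbidden.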
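Important Notes
- Never run gateway update.run directly; always use this script
- Always set _UPGRADE_FORCE_ESCAPE=1 when running from an agent session
- acpx customizations are preserved automatically across upgrades
- Rollback restores the EXACT previous state: install + config + crons + acpx
- --check is safe to run anytime and changes nothing
- The script auto-detects the gateway port from config (no hardcoded defaults)
- Optional hooks: if golden-snapshot.sh or service-quick-check.py exist in your workspace, they're used; otherwise they are silently skipped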