OpenClaw Local Embedding Setup
Use this skill to enable local embedding for OpenClaw memory search on machines where outbound internet access is restricted to an HTTP CONNECT proxy. Do not use this skill for remote embedding providers (OpenAI, Gemini, Voyage, etc.) or for machines with direct internet access.
Target environment
- Any Linux machine (cloud VM, on-premises server, or development workstation).
- CPU-only (no GPU required; the embedding model is small enough for CPU inference).
- No direct internet access; outbound HTTPS requires an HTTP CONNECT proxy.
- OS: Ubuntu 20.04+ (GLIBC 2.31+).
- Node.js 22+ (required for NODE_USE_ENV_PROXY support).
- OpenClaw installed via npm install -g openclaw (no source build required).
Workflow
Follow these steps in order. Do not skip steps.
Step 1: Check if model is already cached
The default model is embeddinggemma-300m-qat-Q8_0.gguf (~313 MB). Check the standard cache directories:
CODEBLOCK0
If the model exists and is larger than 100 MB, skip to Step 4 (configuration).
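The size threshold in this step can be sketched as a small helper. This is a minimal sketch, not the skill's own check: model_is_cached is a hypothetical name, and the cache file name in the usage comment (hf_ggml-org_embeddinggemma-300m-qat-Q8_0.gguf) is an assumption about how node-llama-cpp names the downloaded file.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if FILE exists and is larger than
# MIN_MB megabytes (default 100, the cutoff used in this step).
model_is_cached() {
  local file="$1" min_mb="${2:-100}" size_bytes
  [ -f "$file" ] || return 1
  size_bytes=$(( $(wc -c < "$file") ))
  [ "$size_bytes" -gt $(( min_mb * 1024 * 1024 )) ]
}

# Usage (file name is an assumption, see above):
# for dir in "$HOME/.node-llama-cpp/models" "$HOME/.cache/node-llama-cpp/models"; do
#   model_is_cached "$dir/hf_ggml-org_embeddinggemma-300m-qat-Q8_0.gguf" && echo "cached in $dir"
# done
```

A truncated or interrupted download fails the size test, so the workflow continues to Step 2 instead of skipping ahead with a broken file.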
Step 2: Resolve network and proxy
The model must be downloaded from HuggingFace. Machines without direct internet access need an HTTP CONNECT proxy.
Known proxy defaults
| Platform | Proxy address |
|---|---|
| Kuaishou cloud VMs (快手云主机) | http://10.74.176.8:11080 |
If the machine is a Kuaishou cloud VM and no proxy is configured, try the default above first. If a different proxy was used previously and recorded in ~/.openclaw/workspace/skills/openclaw-local-embedding/.proxy, load it automatically:
CODEBLOCK1
Strategy: progressive fallback.
1. Test the current environment first; the user may already have HTTPS_PROXY or a recorded proxy configured:
CODEBLOCK2
2. If direct access fails, try proxies in this order:
   a. Recorded proxy from previous run (.proxy file, see above).
   b. Kuaishou cloud default: http://10.74.176.8:11080
CODEBLOCK3
   c. If neither works, ask the user for their proxy address.
3. Once a working proxy is confirmed, record it for future runs:
CODEBLOCK4
Then set it only for the download process (not permanently):
CODEBLOCK5
These environment variables are process-scoped. They do not affect other processes or the gateway.
Important: NODE_TLS_REJECT_UNAUTHORIZED=0 disables TLS certificate verification. Only set it in the download script/process. Never persist it to shell profiles.
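The progressive fallback and the process-scoped export can be sketched together. This is a minimal sketch under assumptions: probe_proxy and first_working_proxy are hypothetical helper names, and the probe treats any successful curl exchange as "reachable" rather than inspecting the HTTP status code.

```bash
#!/usr/bin/env bash
# Hypothetical probe: succeeds if huggingface.co answers through the proxy.
probe_proxy() {
  curl -sI --connect-timeout 5 --proxy "$1" https://huggingface.co -o /dev/null
}

# first_working_proxy PROBE CANDIDATE...
# Prints the first candidate that PROBE accepts; fails if none works.
first_working_proxy() {
  local probe="$1" candidate
  shift
  for candidate in "$@"; do
    [ -n "$candidate" ] || continue   # skip empty entries, e.g. no .proxy file
    if "$probe" "$candidate"; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Usage: the subshell keeps the exports away from the parent shell.
# recorded=$(cat ~/.openclaw/workspace/skills/openclaw-local-embedding/.proxy 2>/dev/null)
# if proxy=$(first_working_proxy probe_proxy "$recorded" http://10.74.176.8:11080); then
#   ( export HTTPS_PROXY="$proxy" HTTP_PROXY="$proxy" NODE_USE_ENV_PROXY=1
#     node ~/.openclaw/workspace/skills/openclaw-local-embedding/scripts/init-model.mjs )
# fi
```

Passing the probe as an argument keeps the ordering logic testable without network access. If the function fails, ask the user for a proxy address.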
Step 3: Download and verify model
The skill includes a helper script (scripts/init-model.mjs) that handles proxy detection, model download, and verification automatically. Run it with the proxy env vars from Step 2 active (or let the script auto-detect):
CODEBLOCK6
The script will:
1. Auto-detect the OpenClaw installation directory.
2. Check if the model is already cached (idempotent; safe to re-run).
3. Probe network connectivity (recorded proxy → env proxy → direct → Kuaishou cloud default).
4. Record a working proxy to .proxy for future runs.
5. Download the model via node-llama-cpp's resolveModelFile.
Expected download size: ~313 MB. Speed through proxy: ~5–10 MB/s.
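The size and speed figures translate into a rough wait time, which helps when deciding whether a download has stalled. A minimal sketch; estimate_seconds is a hypothetical helper that rounds up to whole seconds.

```bash
#!/usr/bin/env bash
# estimate_seconds SIZE_MB SPEED_MB_PER_S: whole seconds, rounded up.
estimate_seconds() {
  local size_mb="$1" speed="$2"
  echo $(( (size_mb + speed - 1) / speed ))
}

estimate_seconds 313 5    # slow end of the quoted range, prints 63
estimate_seconds 313 10   # fast end of the quoted range, prints 32
```

A download that runs well past a minute at these speeds is likely stalled rather than slow.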
cmake troubleshooting
If node-llama-cpp cannot find a prebuilt binary (common on Ubuntu 20.04 with GLIBC < 2.32), it falls back to compiling llama.cpp from source. This requires cmake >= 3.19.
Check cmake version:
CODEBLOCK7
If cmake is < 3.19, install a newer version:
CODEBLOCK8
After cmake is available, re-run the model download. The compilation is automatic and one-time.
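The version comparison can be scripted so the check does not rely on reading the output by eye. A minimal sketch, assuming GNU sort with -V support (present on Ubuntu); version_ge is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# version_ge A B: succeeds when dotted version A is >= B.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

if command -v cmake >/dev/null 2>&1; then
  # First line of the output looks like "cmake version 3.16.3".
  current=$(cmake --version | awk 'NR==1 {print $3}')
  if version_ge "$current" 3.19; then
    echo "cmake $current is new enough"
  else
    echo "cmake $current is too old; upgrade before compiling"
  fi
else
  echo "cmake not found"
fi
```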
Step 4: Configure openclaw.json
openclaw config set (dot-notation path assignment) is a long-standing core feature available in all OpenClaw versions. Use it to apply settings and then verify:
CODEBLOCK9
Verify the result — the output must show all four fields set correctly:
CODEBLOCK10
Expected output:
CODEBLOCK11
Also run openclaw config validate to confirm the full config is well-formed. If it reports errors, fix them before proceeding.
If config set fails (e.g., exits with a non-zero code or config get shows wrong values), fall back to direct JSON editing. Open ~/.openclaw/openclaw.json and manually add the block under agents.defaults:
CODEBLOCK12
Do not set memorySearch at the top level. It must be nested under agents.defaults.
After manual editing, validate JSON syntax: openclaw config validate
Step 5: Restart gateway
The configuration change requires a gateway restart. Use the standard OpenClaw command:
CODEBLOCK13
This works regardless of how OpenClaw was installed (systemd, launchd, or Windows service). If the gateway is not registered as a supervised service, run it manually in foreground instead:
CODEBLOCK14
Step 6: Verify
After restart, the first memory_search tool call triggers model loading (~1.6 seconds, one-time). Subsequent calls use the in-memory model with no network access.
Check gateway status:
CODEBLOCK15
For deeper log inspection, use your system's service log viewer:
CODEBLOCK16
Resource expectations
| Metric | Value |
|---|---|
| Model file on disk | ~313 MB |
| Cold start (model load) | ~1.6 seconds (one-time per gateway start) |
| RSS after model load | ~880 MB |
| Per-chunk embedding latency | ~500 ms (400-token chunk, CPU) |
| Minimum available RAM | 2 GB (recommended: 4+ GB) |
| GPU required | No |
| Network required after setup | No (fully offline inference) |
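The RAM rows of the table can be turned into a pre-flight check. A minimal sketch, assuming a Linux /proc/meminfo with a MemAvailable field; available_mb and ram_verdict are hypothetical helper names.

```bash
#!/usr/bin/env bash
# available_mb [FILE]: MemAvailable converted from kB to MB.
available_mb() {
  awk '/^MemAvailable:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# ram_verdict MB: classify against the 2 GB minimum and 4 GB recommendation.
ram_verdict() {
  if [ "$1" -lt 2048 ]; then
    echo insufficient
  elif [ "$1" -lt 4096 ]; then
    echo minimum
  else
    echo recommended
  fi
}

# Usage:
# ram_verdict "$(available_mb)"
```

An "insufficient" verdict means local embedding should stay disabled on that machine.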
Common issues
Model download hangs or times out
- Verify proxy reachability: INLINECODE25 (If HTTPS_PROXY is not set, try the Kuaishou cloud default: curl --proxy http://10.74.176.8:11080 https://huggingface.co)
- Check if a proxy was recorded from a previous run: INLINECODE28
- Ensure NODE_USE_ENV_PROXY=1 is set in the download process.
- If the download still fails with TLS errors, set NODE_TLS_REJECT_UNAUTHORIZED=0; this is needed when the proxy performs TLS inspection (common in corporate/cloud environments).
llama.cpp compilation fails
- Check cmake version: must be >= 3.19. Fix: pip3 install cmake.
- Check GCC: must be >= 9. Ubuntu 20.04 ships GCC 9.4, which is sufficient.
- Compilation is automatic and happens only once. The built binary is cached at <openclaw-node-modules>/node-llama-cpp/llama/localBuilds/.
GLIBC version mismatch
The prebuilt node-llama-cpp binary requires GLIBC >= 2.32. Ubuntu 20.04 has GLIBC 2.31. When this happens, node-llama-cpp automatically falls back to source compilation (requires cmake >= 3.19).
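Whether a machine will hit the source-compilation path can be predicted before starting the download. A minimal sketch, assuming a glibc-based system where getconf GNU_LIBC_VERSION prints a line such as "glibc 2.31"; both function names are hypothetical.

```bash
#!/usr/bin/env bash
# glibc_version: prints e.g. "2.31"; prints nothing on non-glibc systems.
glibc_version() {
  getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}'
}

# needs_source_build VERSION: succeeds when VERSION (major.minor) is below 2.32.
needs_source_build() {
  local major="${1%%.*}" minor="${1#*.}"
  minor="${minor%%.*}"
  [ "$major" -lt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -lt 32 ]; }
}

# Usage:
# v=$(glibc_version)
# if [ -n "$v" ] && needs_source_build "$v"; then
#   echo "GLIBC $v: expect a source build, check cmake first"
# fi
```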
Gateway crash or high memory after enabling
- RSS of ~880 MB is expected and stable. The model weights are memory-mapped.
- If the machine has less than 2 GB available RAM, do not enable local embedding. Use a remote provider instead.
- Memory does not grow over time — the model is loaded once and reused.
Security notes
- Never persist NODE_TLS_REJECT_UNAUTHORIZED=0 in shell profiles or system-wide configuration. It disables TLS verification for all Node.js processes.
- The proxy environment variables (HTTPS_PROXY, NODE_USE_ENV_PROXY) should only be set in the download script, not in the gateway runtime.
- After model download completes, the gateway runs fully offline. No proxy or network configuration is needed.
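A quick scan can confirm that the TLS override was never persisted. A minimal sketch; tls_override_persisted_in is a hypothetical helper, and the profile files in the usage comment are common locations, not an exhaustive list.

```bash
#!/usr/bin/env bash
# tls_override_persisted_in FILE...
# Prints every file that assigns NODE_TLS_REJECT_UNAUTHORIZED on an
# uncommented line; exits 0 if at least one such file is found.
tls_override_persisted_in() {
  local f hit=1
  for f in "$@"; do
    [ -f "$f" ] || continue
    if grep -Eq '^[[:space:]]*(export[[:space:]]+)?NODE_TLS_REJECT_UNAUTHORIZED=' "$f"; then
      printf '%s\n' "$f"
      hit=0
    fi
  done
  return "$hit"
}

# Usage:
# if tls_override_persisted_in ~/.bashrc ~/.profile ~/.zshrc /etc/environment; then
#   echo "Remove the override from the files listed above"
# fi
```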
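OpenClaw Local Embedding Setup
Use this skill to enable local embedding for OpenClaw memory search on machines where outbound internet access is restricted to an HTTP CONNECT proxy. Do not use this skill for remote embedding providers (OpenAI, Gemini, Voyage, etc.) or for machines with direct internet access.
Target environment
- Any Linux machine (cloud VM, on-premises server, or development workstation).
- CPU-only (no GPU required; the embedding model is small enough for CPU inference).
- No direct internet access; outbound HTTPS requires an HTTP CONNECT proxy.
- OS: Ubuntu 20.04+ (GLIBC 2.31+).
- Node.js 22+ (required for NODE_USE_ENV_PROXY support).
- OpenClaw installed via npm install -g openclaw (no source build required).
Workflow
Follow these steps in order. Do not skip steps.
Step 1: Check if model is already cached
The default model is embeddinggemma-300m-qat-Q8_0.gguf (~313 MB). Check the standard cache directories: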
```bash
MODEL_FILE="hf_ggml-org_embeddinggemma-300m-qat-Q8_0.gguf"
for dir in "$HOME/.node-llama-cpp/models" "$HOME/.cache/node-llama-cpp/models"; do
  if [ -f "$dir/$MODEL_FILE" ]; then
    SIZE_MB=$(du -m "$dir/$MODEL_FILE" | cut -f1)
    echo "Model cached: $dir/$MODEL_FILE ($SIZE_MB MB)"
  fi
done
```
If the model exists and is larger than 100 MB, skip to Step 4 (configuration).
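Step 2: Resolve network and proxy
The model must be downloaded from HuggingFace. Machines without direct internet access need an HTTP CONNECT proxy.
Known proxy defaults
| Platform | Proxy address |
|---|---|
| Kuaishou cloud VMs | http://10.74.176.8:11080 |
If the machine is a Kuaishou cloud VM and no proxy is configured, try the default above first. If a different proxy was used previously and recorded in ~/.openclaw/workspace/skills/openclaw-local-embedding/.proxy, load it automatically:
```bash
RECORDED_PROXY=$(cat ~/.openclaw/workspace/skills/openclaw-local-embedding/.proxy 2>/dev/null)
```
Strategy: progressive fallback.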
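1. Test the current environment first; the user may already have HTTPS_PROXY or a recorded proxy configured:
```bash
# Quick connectivity test (5-second timeout)
curl -sI --connect-timeout 5 https://huggingface.co -o /dev/null -w "%{http_code}"
```
2. If direct access fails, try proxies in this order:
   a. Recorded proxy from previous run (.proxy file, see above).
   b. Kuaishou cloud default: http://10.74.176.8:11080
```bash
curl -sI --connect-timeout 5 --proxy http://10.74.176.8:11080 https://huggingface.co -o /dev/null -w "%{http_code}"
```
   c. If neither works, ask the user for their proxy address.
3. Once a working proxy is confirmed, record it for future runs:
```bash
mkdir -p ~/.openclaw/workspace/skills/openclaw-local-embedding
echo "http://the-working-proxy:port" > ~/.openclaw/workspace/skills/openclaw-local-embedding/.proxy
```
Then set it only for the download process (not permanently):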
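```bash
export HTTPS_PROXY="http://the-working-proxy:port"  # use the confirmed proxy address
export HTTP_PROXY="$HTTPS_PROXY"
export NODE_USE_ENV_PROXY=1
export NODE_TLS_REJECT_UNAUTHORIZED=0  # only if the proxy performs TLS inspection (MITM)
```
These environment variables are process-scoped. They do not affect other processes or the gateway.
Important: NODE_TLS_REJECT_UNAUTHORIZED=0 disables TLS certificate verification. Only set it in the download script/process. Never persist it to shell profiles.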
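Step 3: Download and verify model
The skill includes a helper script (scripts/init-model.mjs) that handles proxy detection, model download, and verification automatically. Run it with the proxy env vars from Step 2 active (or let the script auto-detect):
```bash
# The script is in the skill folder (default clawhub install location):
node ~/.openclaw/workspace/skills/openclaw-local-embedding/scripts/init-model.mjs
# To override the proxy explicitly:
node ~/.openclaw/workspace/skills/openclaw-local-embedding/scripts/init-model.mjs --proxy http://your-proxy:port
```
The script will:
1. Auto-detect the OpenClaw installation directory.
2. Check if the model is already cached (idempotent; safe to re-run).
3. Probe network connectivity (recorded proxy → env proxy → direct → Kuaishou cloud default).
4. Record a working proxy to .proxy for future runs.
5. Download the model via node-llama-cpp's resolveModelFile.
Expected download size: ~313 MB. Speed through proxy: ~5–10 MB/s.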
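cmake troubleshooting
If node-llama-cpp cannot find a prebuilt binary (common on Ubuntu 20.04 with GLIBC < 2.32), it falls back to compiling llama.cpp from source. This requires cmake >= 3.19.
Check cmake version:
```bash
cmake --version
```
If cmake is older than 3.19, install a newer version:
```bash
pip3 install cmake
# Verify: cmake --version should show >= 3.19
```
After cmake is available, re-run the model download. The compilation is automatic and one-time.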
Step 4: Configure openclaw.json
openclaw config set (dot-notation path assignment) is a long-standing core feature available in all OpenClaw versions. Use it to apply settings and then verify:
```bash
openclaw config set agents.defaults.memorySearch.enabled true
openclaw config set agents.defaults.memorySearch.provider local
openclaw config set agents.defaults.memorySearch.fallback none
openclaw config set agents.defaults.memorySearch.query.hybrid.enabled true
```
Verify the result; the output must show all four fields set correctly:
```bash
openclaw config get agents.defaults.memorySearch
```
Expected output:
```json
{
  "enabled": true,
  "provider": "local",
  "fallback": "none",
  "query": { "hybrid": { "enabled": true } }
}
```
Also run openclaw config validate to confirm the full config is well-formed. If it reports errors, fix them before proceeding.
If config set fails (e.g., exits with a non-zero code or config get shows wrong values), fall back to direct JSON editing. Open ~/.openclaw/openclaw.json and manually add the block under agents.defaults:
```json5
// Inside the existing agents → defaults object:
"memorySearch": {
  "enabled": true,
  "provider": "local",
  "fallback": "none",
  "query": {
    "hybrid": {
      "enabled": true
    }
  }
}
```
Do not set memorySearch at the top level. It must be nested under agents.defaults.
After manual editing, validate JSON syntax: openclaw config validate
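Step 5: Restart gateway
The configuration change requires a gateway restart. Use the standard OpenClaw command:
```bash
openclaw gateway restart
```
This works regardless of how OpenClaw was installed (systemd, launchd, or Windows service). If the gateway is not registered as a supervised service, run it manually in the foreground instead:
```bash
openclaw gateway run
```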
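Step 6: Verify
After restart, the first memory_search tool call triggers model loading (~1.6 seconds, one-time). Subsequent calls use the in-memory model with no network access.
Check gateway status:
```bash
openclaw gateway status
```
For deeper log inspection, use your system's service log viewer:
```bash
# systemd (Linux):
journalctl --user -u openclaw-gateway -n 50 | grep -i 'embed\|memory\|llama'
# macOS launchd:
log show --predicate subsystem ==
```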