agent-sentinel

Purpose

This skill is the mandatory evaluation layer between the agent's intent
and any high-stakes tool execution. You are not permitted to use the tools
listed under Interception Triggers without first calling eval_engine.py
and receiving "decision": "ALLOW" or "decision": "ADVISE" in the result.

Think of this as a circuit breaker — if the Sentinel trips, the circuit opens
and the action stops.

Interception Triggers

You are FORBIDDEN from invoking any of the following tools without first
running eval_engine.py and parsing its response.

Tool	Trigger condition
`web_search`	Every search, without exception
`booking_tool`

No exceptions apply. Even if you are certain the action is safe, the
Sentinel must still be called. This is a policy requirement, not a
suggestion.

How to Call the Sentinel

Run the following command before invoking any trigger tool:

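```bash
python3 ~/.openclaw/skills/agent-sentinel/eval_engine.py \
  --intent "<what the user is trying to do>" \
  --action "<trigger tool name>" \
  --data "<details of the proposed action>"
```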

Example — flight booking:
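```bash
python3 ~/.openclaw/skills/agent-sentinel/eval_engine.py \
  --intent "Book a family trip to Orlando for spring break" \
  --action booking_tool \
  --data 'Delta, departs 08:30, arrives 11:45, total $389, nonstop, economy'
```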

Example — web search:
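```bash
python3 ~/.openclaw/skills/agent-sentinel/eval_engine.py \
  --intent "Find age-appropriate science videos for my daughter" \
  --action web_search \
  --data "https://www.youtube.com/results?search_query=kids+science+experiments"
```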

Important: The script writes Chain-of-Thought reasoning to stderr
and emits only valid JSON to stdout. Parse stdout with
json.loads(...). Do not parse stderr.
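
The call can be sketched in Python as follows. This is a minimal sketch, not part of the skill: it assumes the script path and the `--intent`, `--action`, and `--data` flags used in this document's examples. The helper name `call_sentinel`, the `python` parameter, and the fail-closed fallback (treating unparseable output as a BLOCK) are illustrative assumptions.

```python
import json
import subprocess
from pathlib import Path

# Script location as used in this document's examples.
SENTINEL_SCRIPT = Path.home() / ".openclaw" / "skills" / "agent-sentinel" / "eval_engine.py"


def call_sentinel(intent, action, data, script=SENTINEL_SCRIPT, python="python3"):
    """Run the Sentinel script and return its stdout parsed into a dict."""
    proc = subprocess.run(
        [python, str(script),
         "--intent", intent, "--action", action, "--data", data],
        capture_output=True,  # keeps stdout and stderr separate
        text=True,
    )
    try:
        # Only stdout is parsed; stderr holds the reasoning trace.
        return json.loads(proc.stdout)
    except json.JSONDecodeError:
        # Assumption: fail closed with conservative values if stdout is not JSON.
        return {
            "decision": "BLOCK",
            "severity": "HIGH",
            "reason": "Sentinel output could not be parsed.",
            "alternatives": "",
        }
```

Passing the arguments as a list avoids shell quoting problems, such as `$389` being expanded inside double quotes.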

Response Schema

The script always returns a single JSON object:

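```json
{
  "decision": "ALLOW | BLOCK | ADVISE",
  "severity": "LOW | MEDIUM | HIGH",
  "reason": "<clear explanation>",
  "alternatives": "<suggestion for resolving the violation>"
}
```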
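
Before acting on a response, it may help to check that it has the expected shape. The sketch below assumes the four fields `decision`, `severity`, `reason`, and `alternatives` are always present and that the value sets are exactly those named in this document. The function name `parse_result` is hypothetical.

```python
import json

DECISIONS = {"ALLOW", "ADVISE", "BLOCK"}
SEVERITIES = {"LOW", "MEDIUM", "HIGH"}
REQUIRED_KEYS = ("decision", "severity", "reason", "alternatives")


def parse_result(stdout_text):
    """Parse and validate one Sentinel response; raise ValueError if malformed."""
    result = json.loads(stdout_text)  # JSONDecodeError is a ValueError subclass
    if not isinstance(result, dict):
        raise ValueError("expected a single JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in result]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    if result["decision"] not in DECISIONS:
        raise ValueError(f"unknown decision: {result['decision']!r}")
    if result["severity"] not in SEVERITIES:
        raise ValueError(f"unknown severity: {result['severity']!r}")
    return result
```

A caller would treat any `ValueError` the same way as a BLOCK, since only ALLOW or ADVISE permits the tool call.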

Decision Handling Rules

`"ALLOW"` — Proceed

The action passed all checks. Continue with the intended tool call.
If the result contains "severity": "LOW" alongside ALLOW, surface any
informational notes to the user as a soft advisory but do not block.

`"ADVISE"` — Pause and Confirm

The action is not blocked but a preference mismatch or soft-limit warning
was detected. You must:

1. Stop before invoking the tool.
2. Show the reason and alternatives fields to the user verbatim.
3. Ask the user explicitly: "Would you like to proceed anyway?"
4. Only continue if the user confirms. If they do not confirm within the
   turn, treat it as a BLOCK.

Example ADVISE response to user:

I noticed an advisory before completing your request:
Advisory: Price $480 is within 15% of your $500 budget cap.
Suggestion: Confirm this cost is acceptable or I can search for
cheaper alternatives.
Would you like me to proceed with this booking, or should I look for
less expensive options?
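
The ADVISE steps can be sketched as two small helpers, one that builds the message and one that checks the reply. This is a sketch under stated assumptions: the function names are hypothetical, and the set of replies accepted as confirmation is an illustrative choice, not something the skill defines.

```python
def format_advise_message(result):
    """Build the user-facing ADVISE prompt from a Sentinel result dict."""
    lines = [
        "I noticed an advisory before completing your request:",
        f"Advisory: {result['reason']}",  # reason shown verbatim
    ]
    if result.get("alternatives"):
        lines.append(f"Suggestion: {result['alternatives']}")  # shown verbatim
    lines.append("Would you like to proceed anyway?")
    return "\n".join(lines)


def advise_allows_proceeding(user_reply):
    """True only for an explicit confirmation; anything else is treated as BLOCK."""
    # Assumption: these replies count as confirmation.
    return user_reply.strip().lower() in {"yes", "y", "proceed", "confirm"}
```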

`"BLOCK"` — Stop Immediately

You are strictly forbidden from proceeding. Do not attempt to:

- Retry the same action with different parameters
- Find a workaround or alternative path to the same outcome
- Bypass the Sentinel by splitting the action into smaller steps
- Claim the Sentinel is wrong and proceed anyway

You must:

1. Refrain from calling the trigger tool.
2. Apologize to the user and clearly explain the violation.
3. Quote the reason field exactly.
4. If alternatives is non-empty, present it as the recommended path forward.
5. Ask for an explicit user override if they wish to continue, subject to
   the Override Protocol (override is not offered for severity: HIGH blocks).

Example BLOCK response to user (budget violation):

I'm sorry — I can't complete this booking.
Blocked: Price $650.00 exceeds your maximum budget of $500.00.
What you can do: Look for options priced at or below $500. Consider
flexible dates or alternate airports.
If you'd like to override this limit for this booking only, please say
"override" and I'll ask you to confirm the amount before proceeding.

Example BLOCK response to user (child-safety violation):

I'm sorry — I can't perform this search.
Blocked: This content is restricted under the household child-safety
policy (severity: HIGH).
Reason: [reason from the Sentinel]
Please modify your request. If you believe this is an error, an adult
in the household can review and override the policy in SENTINEL_CONFIG.md.
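
Taken together, the three decisions map to three outcomes. The sketch below is illustrative only: the step names `proceed`, `ask_user`, and `stop` and both function names are hypothetical, and treating an unrecognized decision as a stop is an assumption that follows from the rule that only ALLOW or ADVISE permits a tool call.

```python
def next_step(result):
    """Map a Sentinel result dict to what the agent may do next."""
    decision = result.get("decision")
    if decision == "ALLOW":
        return "proceed"   # call the trigger tool
    if decision == "ADVISE":
        return "ask_user"  # show reason and alternatives, wait for confirmation
    return "stop"          # BLOCK or unrecognized: do not call the tool


def format_block_message(result):
    """Build the user-facing BLOCK message, quoting the reason exactly."""
    lines = [
        "I'm sorry, I can't complete this request.",
        f"Blocked: {result['reason']}",
    ]
    if result.get("alternatives"):
        lines.append(f"What you can do: {result['alternatives']}")
    return "\n".join(lines)
```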

Override Protocol

If a user explicitly says "override" for a BLOCK decision, you must:

1. Repeat the blocking reason and severity back to the user.
2. Ask for explicit written confirmation: *"Please type 'I confirm' to
   proceed despite this policy violation."*
3. Log the override in your response (e.g., "Proceeding with user override.").
4. Never offer override for a severity: HIGH (Tier-1 child-safety) BLOCK
   unless an adult user has explicitly established that permission in
   writing within the same conversation turn.
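
The override rules can be expressed as two checks. This is a minimal sketch with hypothetical function names. How the agent establishes that an adult has granted permission in writing is outside its scope, so it is passed in as a plain boolean, and case-insensitive matching of the confirmation phrases is an assumption.

```python
def may_offer_override(result, adult_authorized_this_turn=False):
    """Decide whether an override may be offered for a Sentinel result."""
    if result.get("decision") != "BLOCK":
        return False  # overrides only apply to BLOCK decisions
    if result.get("severity") == "HIGH":
        # HIGH requires written adult permission given in the same turn.
        return adult_authorized_this_turn
    return True


def override_confirmed(first_reply, second_reply):
    """Two-step check: the user says "override", then types "I confirm"."""
    return (first_reply.strip().lower() == "override"
            and second_reply.strip().lower() == "i confirm")
```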

Installing Dependencies

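```bash
cd ~/.openclaw/skills/agent-sentinel
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```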

Configuration

Edit SENTINEL_CONFIG.md (in the skill directory or ~/.openclaw/) to
update your preferences and safety policy. See that file for full
documentation of all supported keys.

Key	Type	Effect
`ChildAgeLimit`	integer	Activates child-safety tier
`Max_Budget`

agent-sentinel

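Purpose

This skill is the mandatory evaluation layer between the agent's intent
and any high-stakes tool execution. You are not permitted to use the tools
listed under Interception Triggers without first calling eval_engine.py
and receiving "decision": "ALLOW" or "decision": "ADVISE" in the result.

Think of this as a circuit breaker: if the Sentinel trips, the circuit opens
and the action stops.

Interception Triggers

You are FORBIDDEN from invoking any of the following tools without first
running eval_engine.py and parsing its response.

Tool	Trigger condition
`web_search`	Every search, without exception
`booking_tool`

No exceptions apply. Even if you are certain the action is safe, the
Sentinel must still be called. This is a policy requirement, not a
suggestion.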

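How to Call the Sentinel

Run the following command before invoking any trigger tool:

Example (flight booking):
```bash
python3 ~/.openclaw/skills/agent-sentinel/eval_engine.py \
  --intent "Book a family trip to Orlando for spring break" \
  --action booking_tool \
  --data 'Delta, departs 08:30, arrives 11:45, total $389, nonstop, economy'
```

Example (web search):
```bash
python3 ~/.openclaw/skills/agent-sentinel/eval_engine.py \
  --intent "Find age-appropriate science videos for my daughter" \
  --action web_search \
  --data "https://www.youtube.com/results?search_query=kids+science+experiments"
```

Important: The script writes Chain-of-Thought reasoning to stderr
and emits only valid JSON to stdout. Parse stdout with
json.loads(...). Do not parse stderr.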

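Response Schema

The script always returns a single JSON object:

```json
{
  "decision": "ALLOW | BLOCK | ADVISE",
  "severity": "LOW | MEDIUM | HIGH",
  "reason": "<clear explanation>",
  "alternatives": "<suggestion for resolving the violation>"
}
```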

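Decision Handling Rules

`"ALLOW"`: Proceed

The action passed all checks. Continue with the intended tool call.
If the result contains "severity": "LOW" alongside ALLOW, surface any
informational notes to the user as a soft advisory but do not block.

`"ADVISE"`: Pause and Confirm

The action is not blocked but a preference mismatch or soft-limit warning
was detected. You must:

1. Stop before invoking the tool.
2. Show the reason and alternatives fields to the user verbatim.
3. Ask the user explicitly: "Would you like to proceed anyway?"
4. Only continue if the user confirms. If they do not confirm within the
   turn, treat it as a BLOCK.

Example ADVISE response to user:

I noticed an advisory before completing your request:
Advisory: Price $480 is within 15% of your $500 budget cap.
Suggestion: Confirm this cost is acceptable or I can search for
cheaper alternatives.
Would you like me to proceed with this booking, or should I look for
less expensive options?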

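`"BLOCK"`: Stop Immediately

You are strictly forbidden from proceeding. Do not attempt to:

- Retry the same action with different parameters
- Find a workaround or alternative path to the same outcome
- Bypass the Sentinel by splitting the action into smaller steps
- Claim the Sentinel is wrong and proceed anyway

You must:

1. Refrain from calling the trigger tool.
2. Apologize to the user and clearly explain the violation.
3. Quote the reason field exactly.
4. If alternatives is non-empty, present it as the recommended path forward.
5. Ask for an explicit user override if they wish to continue.

Example BLOCK response to user (budget violation):

I'm sorry, I can't complete this booking.
Blocked: Price $650.00 exceeds your maximum budget of $500.00.
What you can do: Look for options priced at or below $500. Consider
flexible dates or alternate airports.
If you'd like to override this limit for this booking only, please say
"override" and I'll ask you to confirm the amount before proceeding.

Example BLOCK response to user (child-safety violation):

I'm sorry, I can't perform this search.
Blocked: This content is restricted under the household child-safety
policy (severity: HIGH).
Reason: [reason from the Sentinel]
Please modify your request. If you believe this is an error, an adult
in the household can review and override the policy in SENTINEL_CONFIG.md.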

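Override Protocol

If a user explicitly says "override" for a BLOCK decision, you must:

1. Repeat the blocking reason and severity back to the user.
2. Ask for explicit written confirmation: *"Please type 'I confirm' to
   proceed despite this policy violation."*
3. Log the override in your response (e.g., "Proceeding with user override.").
4. Never offer override for a severity: HIGH (Tier-1 child-safety) BLOCK
   unless an adult user has explicitly established that permission in
   writing within the same conversation turn.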

Installing Dependencies

```bash
cd ~/.openclaw/skills/agent-sentinel
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

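Configuration

Edit SENTINEL_CONFIG.md (in the skill directory or ~/.openclaw/) to
update your preferences and safety policy. See that file for full
documentation of all supported keys.

Key	Type	Effect
`ChildAgeLimit`	integer	Activates child-safety tier
`Max_Budget`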
