Level	Controls	Use Case
Open	Audit only, no restrictions	Internal dev/testing
Standard

Practice	Rationale
Policy as configuration	Store policies in YAML/JSON, not hardcoded — enables change without deploys
Most-restrictive-wins

代理治理模式

为AI代理系统添加安全性、信任和策略执行的模式。

概述

治理模式确保AI代理在定义好的边界内运行——控制它们可以调用哪些工具、可以处理哪些内容、可以执行多少操作，并通过审计追踪维护问责制。

用户请求 → 意图分类 → 策略检查 → 工具执行 → 审计日志
↓ ↓ ↓
威胁检测允许/拒绝信任更新

何时使用

- 具有工具访问权限的代理：任何调用外部工具（API、数据库、Shell命令）的代理
多代理系统：代理委托给其他代理时需要信任边界
生产部署：合规、审计和安全要求
敏感操作：金融交易、数据访问、基础设施管理

模式1：治理策略

将代理允许执行的操作定义为一个可组合、可序列化的策略对象。

python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional
import re

class PolicyAction(Enum):
ALLOW = allow
DENY = deny
REVIEW = review # 标记为需要人工审核

@dataclass
class GovernancePolicy:
控制代理行为的声明式策略。
name: str
allowedtools: list[str] = field(defaultfactory=list) # 白名单
blockedtools: list[str] = field(defaultfactory=list) # 黑名单
blockedpatterns: list[str] = field(defaultfactory=list) # 内容过滤器
maxcallsper_request: int = 100 # 速率限制
requirehumanapproval: list[str] = field(default_factory=list) # 需要审批的工具

def checktool(self, toolname: str) -> PolicyAction:
检查工具是否被此策略允许。
if toolname in self.blockedtools:
return PolicyAction.DENY
if toolname in self.requirehuman_approval:
return PolicyAction.REVIEW
if self.allowedtools and toolname not in self.allowed_tools:
return PolicyAction.DENY
return PolicyAction.ALLOW

def check_content(self, content: str) -> Optional[str]:
根据屏蔽模式检查内容。返回匹配的模式或None。
for pattern in self.blocked_patterns:
if re.search(pattern, content, re.IGNORECASE):
return pattern
return None

策略组合

组合多个策略（例如，组织级 + 团队级 + 代理特定）：

python
def compose_policies(*policies: GovernancePolicy) -> GovernancePolicy:
合并策略，采用最严格规则优先的语义。
combined = GovernancePolicy(name=composed)

for policy in policies:
combined.blockedtools.extend(policy.blockedtools)
combined.blockedpatterns.extend(policy.blockedpatterns)
combined.requirehumanapproval.extend(policy.requirehumanapproval)
combined.maxcallsper_request = min(
combined.maxcallsper_request,
policy.maxcallsper_request
)
if policy.allowed_tools:
if combined.allowed_tools:
combined.allowed_tools = [
t for t in combined.allowedtools if t in policy.allowedtools
]
else:
combined.allowedtools = list(policy.allowedtools)

return combined

用法：从宽泛到具体分层策略

org_policy = GovernancePolicy( name=org-wide, blockedtools=[shellexec, delete_database], blockedpatterns=[r(?i)(api[-]?key|secret|password)\s*[:=]], maxcallsper_request=50 ) team_policy = GovernancePolicy( name=data-team, allowedtools=[querydb, readfile, writereport], requirehumanapproval=[write_report] ) agentpolicy = composepolicies(orgpolicy, teampolicy)

策略作为YAML

将策略存储为配置而非代码：

yaml

governance-policy.yaml

name: production-agent
allowed_tools:
- search_documents
- query_database
- send_email
blocked_tools:
- shell_exec
- delete_record
blocked_patterns:
- (?i)(api[_-]?key|secret|password)\\s*[:=]
- (?i)(drop|truncate|delete from)\\s+\\w+
maxcallsper_request: 25
requirehumanapproval:
- send_email

python
import yaml

def load_policy(path: str) -> GovernancePolicy:
with open(path) as f:
data = yaml.safe_load(f)
return GovernancePolicy(data)

模式2：语义意图分类

在提示词到达代理之前检测其中的危险意图，使用基于模式的信号。

python
from dataclasses import dataclass

@dataclass
class IntentSignal:
category: str # 例如 dataexfiltration, privilegeescalation
confidence: float # 0.0 到 1.0
evidence: str # 触发检测的内容

用于威胁检测的加权信号模式

THREAT_SIGNALS = [ # 数据泄露 (r(?i)send\s+(all|every|entire)\s+\w+\s+to\s+, data_exfiltration, 0.8), (r(?i)export\s+.*\s+to\s+(external|outside|third.?party), data_exfiltration, 0.9), (r(?i)curl\s+.*\s+-d\s+, data_exfiltration, 0.7),

# 权限提升
(r(?i)(sudo|as\s+root|admin\s+access), privilege_escalation, 0.8),
(r(?i)chmod\s+777, privilege_escalation, 0.9),

# 系统修改
(r(?i)(rm\s+-rf|del\s+/[sq]|format\s+c:), system_destruction, 0.95),
(r(?i)(drop\s+database|truncate\s+table), system_destruction, 0.9),

# 提示注入
(r(?i)ignore\s+(previous|above|all)\s+(instructions?|rules?), prompt_injection, 0.9),
(r(?i)you\s+are\s+now\s+(a|an)\s+, prompt_injection, 0.7),
]

def classify_intent(content: str) -> list[IntentSignal]:
对内容进行威胁信号分类。
signals = []
for pattern, category, weight in THREAT_SIGNALS:
match = re.search(pattern, content)
if match:
signals.append(IntentSignal(
category=category,
confidence=weight,
evidence=match.group()
))
return signals

def is_safe(content: str, threshold: float = 0.7) -> bool:
快速检查：内容是否在给定阈值以上是安全的？
signals = classify_intent(content)
return not any(s.confidence >= threshold for s in signals)

关键洞察：意图分类发生在工具执行之前，作为预飞安全检查。这与仅在生成之后检查的输出护栏有本质区别。

模式3：工具级治理装饰器

使用治理检查包装单个工具函数：

python
import functools
import time
from collections import defaultdict

callcounters: dict[str, int] = defaultdict(int)

def govern(policy: GovernancePolicy, audit_trail=None):
在工具函数上强制执行治理策略的装饰器。
def decorator(func):
@functools.wraps(func)
async def wrapper(args, *kwargs):
tool_name = func.name

# 1. 检查工具白名单/黑名单
action = policy.checktool(toolname)
if action == PolicyAction.DENY:
raise PermissionError(f策略 {policy.name} 阻止了工具 {tool_name})
if action == PolicyAction.REVIEW:
raise PermissionError(f工具 {tool_name} 需要人工审批)

# 2. 检查速率限制
callcounters[policy.name] += 1
if callcounters[policy.name] > policy.maxcallsper_request:
raise PermissionError(f超出速率限制：{policy.maxcallsper_request} 次调用)

# 3. 检查参数中的内容
for arg in list(args) + list(kwargs.values()):
if isinstance(arg, str):
matched = policy

agent-governance代理治理

agent-governance

Agent Governance Patterns

Overview

When to Use

Pattern 1: Governance Policy

Policy Composition

Policy as YAML

Pattern 2: Semantic Intent Classification

Pattern 3: Tool-Level Governance Decorator

Pattern 4: Trust Scoring

Pattern 5: Audit Trail

Pattern 6: Framework Integration

PydanticAI

CrewAI

OpenAI Agents SDK

Governance Levels

Best Practices

Quick Start Checklist

Related Resources

代理治理模式

概述

何时使用

模式1：治理策略

策略组合

用法：从宽泛到具体分层策略

策略作为YAML

governance-policy.yaml

模式2：语义意图分类

用于威胁检测的加权信号模式

模式3：工具级治理装饰器

标签

通过对话安装

方式一：安装 SkillHub 和技能

方式二：设置 SkillHub 为优先技能安装源

通过命令行安装

下载

agent-governance代理治理

agent-governance

Agent Governance Patterns

Overview

When to Use

Pattern 1: Governance Policy

Policy Composition

Policy as YAML

Pattern 2: Semantic Intent Classification

Pattern 3: Tool-Level Governance Decorator

Pattern 4: Trust Scoring

Pattern 5: Audit Trail

Pattern 6: Framework Integration

PydanticAI

CrewAI

OpenAI Agents SDK

Governance Levels

Best Practices

Quick Start Checklist

Related Resources

代理治理模式

概述

何时使用

模式1：治理策略

策略组合

用法：从宽泛到具体分层策略

策略作为YAML

governance-policy.yaml

模式2：语义意图分类

用于威胁检测的加权信号模式

模式3：工具级治理装饰器

标签

通过对话安装

方式一：安装 SkillHub 和技能

方式二：设置 SkillHub 为优先技能安装源

通过命令行安装

下载

相关推荐

self-improvement

self-improvement

self-improvement

self-improvement