skill-defender
# Skill Defender — Malicious Pattern Scanner
## When to Run
### Automatic Triggers
1. **New skill installed** — Immediately run `scan_skill.py` against it before allowing use
2. **Skill updated** — Re-scan after any file changes in a skill directory
3. **Periodic audit** — Run batch scan on all installed skills when requested
### Manual Triggers
- User says "scan skill X" → scan that specific skill
- User says "scan all skills" → batch scan all skills
- User says "security check" or "audit skills" → same as above
## Scripts
### `scripts/scan_skill.py` — Single Skill Scanner
Scans one skill directory for malicious patterns. Produces JSON or human-readable output.
### `scripts/aggregate_scan.py` — Batch Scanner
Scans ALL installed skills and produces a single JSON report. Includes a built-in allowlist to reduce false positives from security-related skills, API skills, and other known-safe patterns.
## How to Run
```bash
# Scan a single skill (human-readable)
python3 scripts/scan_skill.py /path/to/skill-dir
# Scan a single skill (JSON output)
python3 scripts/scan_skill.py /path/to/skill-dir --json
# Scan ALL installed skills (JSON aggregate report)
python3 scripts/aggregate_scan.py
# With custom skills directory
python3 scripts/aggregate_scan.py --skills-dir /path/to/skills
# With verbose warnings
python3 scripts/scan_skill.py /path/to/skill-dir --verbose
# Exclude false positives
python3 scripts/scan_skill.py /path/to/skill-dir --exclude "pattern1" "pattern2"
```
### Exit Codes (scan_skill.py)
- `0` = clean or informational only
- `1` = suspicious (medium/high findings)
- `2` = dangerous (critical findings)
- `3` = error
### Output Format (aggregate_scan.py)
```json
{
"skills": [
{
"name": "skill-name",
"verdict": "clean|suspicious|dangerous|error",
"findingsCount": 0,
"findings": []
}
],
"summary": "All 37 skills passed with no significant issues.",
"totalSkills": 37,
"cleanCount": 37,
"suspiciousCount": 0,
"dangerousCount": 0,
"errorCount": 0,
"timestamp": "2026-02-02T06:00:00+00:00"
}
```
## Auto-Detection
Both scripts auto-detect paths:
- **Skills directory**: Detected from script location (walks up to find `skills/` parent), falls back to `~/clawd/skills`, `~/skills`, `~/.openclaw/skills`
- **Scanner script**: `aggregate_scan.py` finds `scan_skill.py` co-located in the same directory
## Handling Results
### ✅ Clean (`verdict: "clean"`)
- No action needed — skill is safe
### ⚠️ Suspicious (`verdict: "suspicious"`)
- Warn the user with a summary of findings
- Show the category and severity of each finding
### 🚨 Dangerous (`verdict: "dangerous"`)
- Block the skill — do not proceed with installation or use
- Show the full detailed findings to the user
- Require explicit user override to proceed
## Built-in Allowlist
The aggregate scanner includes an allowlist for known false positives:
- **Security scanners** (skill-defender, clawdbot-security-check) — their docs/scripts contain the very patterns they detect
- **Auth-dependent skills** (tailscale, reddit, n8n, event-planner) — legitimately reference credential paths and API keys
- **Config-aware skills** (memory-setup, eightctl, summarize) — reference config paths in documentation
- **Agent-writing skills** (self-improving-agent) — designed to modify agent files
## Pattern Reference
See `references/threat-patterns.md` for full documentation of all detected patterns, organized by category with explanations of why each is dangerous.
## Important Notes
- **No external dependencies** — standard library only (Python 3.9+)
- **Fast** — under 1 second per skill, ~30 seconds for a full batch of 30+ skills
- This is **deterministic pattern matching** (Layer 2 defense). Not LLM-based.
- False positives are possible — the allowlist and `--exclude` flag help
- The scanner **will flag itself** if scanned without the allowlist — this is expected
标签
skill
ai