Run anonymization and restore flows
Use this skill when you need to anonymize text/files, restore placeholders with a saved map, or tune the local detector.
Maintainer-only validation assets are excluded from ClawHub uploads.
Scope
- anonymize (scripts/anonymize.py)
- deanonymize (scripts/deanonymize.py)
- local detector diagnostics (scripts/detect_local.py)
- file/map workflow helpers behind those entrypoints
- request/response gateway routing (modeio-middleware)
- command safety analysis (security)
- staged-diff or git pre-commit scanning
Working directory
Run these commands from inside the privacy-protector folder.
Requirements
- Hard requirement: python3
- Optional package: requests for API-backed dynamic, strict, and crossborder
- Optional package: python-docx for .docx
- Optional package: PyMuPDF for .pdf
- Optional override: ANONYMIZE_API_URL for non-lite levels
- Optional override: MODEIO_REDACT_MAP_DIR for local map storage
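The map-storage override is an ordinary environment variable, so the lookup can be pictured roughly as below. This is a minimal sketch, not the skill's own code: resolve_map_dir is a hypothetical helper, and treating an empty value as "unset" is an assumption. Only the variable name MODEIO_REDACT_MAP_DIR and the default ~/.modeio/redact/maps come from this document.

```python
import os
from pathlib import Path

# Default location named in this document's runtime notes.
DEFAULT_MAP_DIR = Path("~/.modeio/redact/maps")


def resolve_map_dir(env=None):
    """Hypothetical helper: choose the directory where saved maps live."""
    env = os.environ if env is None else env
    override = env.get("MODEIO_REDACT_MAP_DIR", "").strip()
    # Assumption: an unset or empty override falls back to the default.
    chosen = Path(override) if override else DEFAULT_MAP_DIR
    # Expand a leading "~" so the result is usable as a real path.
    return chosen.expanduser()
```

Passing a dictionary instead of relying on os.environ keeps the helper easy to exercise, for example resolve_map_dir({"MODEIO_REDACT_MAP_DIR": "/tmp/maps"}).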
Core commands
Anonymize text
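bash
python3 scripts/anonymize.py \
  --input "Email: alice@example.com, Phone: 415-555-1234" \
  --level lite \
  --json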
Anonymize a file
bash
python3 scripts/anonymize.py \
  --input ./incident.docx \
  --level lite \
  --json
Restore from a saved map
bash
python3 scripts/deanonymize.py \
  --input "Email: [EMAIL_1]" \
  --map ~/.modeio/redact/maps/.json \
  --json
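Conceptually, restoring means substituting each placeholder with the original value recorded in the saved map. The sketch below illustrates that idea only. It assumes a flat placeholder-to-value dictionary, which is not the documented map format, and restore_placeholders is a hypothetical name. For real maps, use scripts/deanonymize.py.

```python
import re

# Matches placeholders shaped like [EMAIL_1], the form that appears in this
# document's restore command. Other shapes are an assumption.
PLACEHOLDER = re.compile(r"\[[A-Z]+(?:_[A-Z]+)*_\d+\]")


def restore_placeholders(text, mapping):
    """Swap each known placeholder for its original value.

    Placeholders missing from the mapping are left untouched so that a
    partial map never silently deletes text.
    """
    def substitute(match):
        token = match.group(0)
        return mapping.get(token, token)

    return PLACEHOLDER.sub(substitute, text)
```

Leaving unknown placeholders in place is a deliberate choice in this sketch: it makes an incomplete map visible in the output instead of hiding it.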
Tune the local detector
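bash
python3 scripts/detect_local.py \
  --input "Project codename Phoenix has been approved. Please contact support@example.com." \
  --allowlist-file examples/detect-local/allowlist.json \
  --blocklist-file examples/detect-local/blocklist.json \
  --thresholds-file examples/detect-local/thresholds.json \
  --json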
Runtime notes
- lite runs fully local. dynamic, strict, and crossborder call the backend API
- For crossborder, pass both --sender-code and --recipient-code
- Supported file inputs: .txt, .md, .markdown, .csv, .tsv, .json, .jsonl, .yaml, .yml, .xml, .html, .htm, .rst, .log, .docx, .pdf
- Saved maps default to ~/.modeio/redact/maps; use MODEIO_REDACT_MAP_DIR to override that location
- Text-like outputs get embedded map markers or sidecar .map.json references when needed
- .pdf supports anonymization only; de-anonymization is not supported
- Rich-file outputs keep assurance metadata in the JSON response so callers can decide how strict they want to be
- Use --json when you want the stable machine-readable envelope and file workflow metadata
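The level rules in these notes can be summarized as a small pre-flight check. The sketch below is a hedged illustration rather than the scripts' actual validation: check_request and its return labels are hypothetical, and the sample codes in the usage note are made up. Only the level names and the two crossborder flags come from this document.

```python
# Levels named in this document; "lite" is the only one that stays local.
API_LEVELS = {"dynamic", "strict", "crossborder"}
ALL_LEVELS = {"lite"} | API_LEVELS


def check_request(level, sender_code=None, recipient_code=None):
    """Hypothetical pre-flight check: returns "local" or "api"."""
    if level not in ALL_LEVELS:
        raise ValueError(f"unknown level: {level}")
    # crossborder requires both codes, per the runtime notes.
    if level == "crossborder" and not (sender_code and recipient_code):
        raise ValueError(
            "crossborder needs both --sender-code and --recipient-code"
        )
    return "api" if level in API_LEVELS else "local"
```

For example, check_request("lite") reports a local run, while check_request("crossborder", sender_code="US") raises because the recipient code is missing.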
Resources
- ARCHITECTURE.md for package boundaries
- references/cli-contracts.md for flags and output contracts
- references/file-workflows.md for map linkage and assurance behavior
- references/local-detector.md for profiles and shipped config examples
- examples/detect-local/ for ready-to-edit tuning files
When not to use
- Middleware interception or policy routing
- Safety approval/block decisions
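Skill name: privacy-protector
Run anonymization and restore flows
Use this skill when you need to anonymize text or files, restore placeholders with a saved map, or tune the local detector.
Maintainer-only validation assets are not uploaded to ClawHub.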
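Scope
- anonymize (scripts/anonymize.py)
- deanonymize (scripts/deanonymize.py)
- local detector diagnostics (scripts/detect_local.py)
- file/map workflow helpers behind those entrypoints
- request/response gateway routing (modeio-middleware)
- command safety analysis (security)
- staged-diff or git pre-commit scanning
Working directory
Run these commands from inside the privacy-protector folder.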
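Requirements
- Hard requirement: python3
- Optional package: requests (for the API-backed dynamic, strict, and crossborder levels)
- Optional package: python-docx (for .docx files)
- Optional package: PyMuPDF (for .pdf files)
- Optional override: ANONYMIZE_API_URL (for non-lite levels)
- Optional override: MODEIO_REDACT_MAP_DIR (for local map storage)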
Core commands
Anonymize text
bash
python3 scripts/anonymize.py \
  --input "Email: alice@example.com, Phone: 415-555-1234" \
  --level lite \
  --json
Anonymize a file
bash
python3 scripts/anonymize.py \
  --input ./incident.docx \
  --level lite \
  --json
Restore from a saved map
bash
python3 scripts/deanonymize.py \
  --input "Email: [EMAIL_1]" \
  --map ~/.modeio/redact/maps/.json \
  --json
Tune the local detector
bash
python3 scripts/detect_local.py \
  --input "Project codename Phoenix has been approved. Please contact support@example.com." \
  --allowlist-file examples/detect-local/allowlist.json \
  --blocklist-file examples/detect-local/blocklist.json \
  --thresholds-file examples/detect-local/thresholds.json \
  --json
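Runtime notes
- lite runs fully local. dynamic, strict, and crossborder call the backend API
- For crossborder, pass both --sender-code and --recipient-code
- Supported file inputs: .txt, .md, .markdown, .csv, .tsv, .json, .jsonl, .yaml, .yml, .xml, .html, .htm, .rst, .log, .docx, .pdf
- Saved maps default to ~/.modeio/redact/maps; use MODEIO_REDACT_MAP_DIR to override that location
- Text-like outputs get embedded map markers or sidecar .map.json references when needed
- .pdf supports anonymization only; de-anonymization is not supported
- Rich-file outputs keep assurance metadata in the JSON response so callers can decide how strict they want to be
- Use --json when you want the stable machine-readable envelope and file workflow metadata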
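A caller that batches files may want to filter paths before invoking the scripts. The following sketch encodes only the extension list and the .pdf restriction stated in these notes. can_process and its operation labels are hypothetical, and matching extensions case-insensitively is an assumption, not documented behavior.

```python
from pathlib import Path

# Extensions listed as supported file inputs in this document.
SUPPORTED_INPUTS = {
    ".txt", ".md", ".markdown", ".csv", ".tsv", ".json", ".jsonl", ".yaml",
    ".yml", ".xml", ".html", ".htm", ".rst", ".log", ".docx", ".pdf",
}
# Formats documented as anonymization-only.
ANONYMIZE_ONLY = {".pdf"}


def can_process(path, operation):
    """Hypothetical filter; operation is "anonymize" or "deanonymize"."""
    # Assumption: extension matching ignores case.
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_INPUTS:
        return False
    if operation == "deanonymize" and suffix in ANONYMIZE_ONLY:
        return False
    return True
```

With this filter, can_process("report.pdf", "deanonymize") is rejected up front, while the same file passes for "anonymize".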
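Resources
- ARCHITECTURE.md: package boundaries
- references/cli-contracts.md: flags and output contracts
- references/file-workflows.md: map linkage and assurance behavior
- references/local-detector.md: profiles and shipped config examples
- examples/detect-local/: ready-to-edit tuning files
When not to use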