HaS Privacy

HaS exposes a single umbrella CLI:

- has text ... for text anonymization, restoration, and scanning
- has image ... for image scanning, masking, and category discovery

Use it when you need to remove private data locally before sending content elsewhere, inspect a directory for privacy risks, or mask visual privacy targets in photos and screenshots.

Agent Decision Guidelines

- Prefer has text for plaintext and has image for raster images. For mixed directories, run both and combine the results into one report.
- For PDFs, Word documents, or scanned pages, extract text first and then use has text. For screenshots/photos where the goal is simply to hide visible carriers such as faces, screens, paper, labels, or QR codes, use has image. If the goal is to reason about the text content inside an image, run OCR first and then use has text.
- Do not overwrite or delete the original files. Text anonymization can be restored later; image masking is irreversible.
- Proactively mention configurable knobs when the user intent is clear: has text uses repeated --type; has image uses repeated --type, plus --method and --strength.
- If the user intent is ambiguous, start with scan before hide.
- After batch scans, summarize text file count, image file count, findings by type/category, high-risk items, and the suggested next step.
- If timing matters to the user, add --timing and report the elapsed result in plain language afterward.
- For qr_code and barcode, the default mosaic strength is automatically raised based on the detection size to ensure the encoding is destroyed. The agent does not need to manually increase --strength for these categories. If a detection output includes effective_strength, report it to the user.

Shared CLI Contract

The current CLI contract is designed for agents first:

- Success returns compact JSON.
- Failure also returns compact JSON with error.code and error.message.
- Returned path fields are absolute.
  - This includes file, output, mapping_output, and skipped[].file.
- Invalid combinations fail fast instead of silently falling back.
- Directory mode is non-recursive. Only immediate children are processed.
- Batch results can include skipped and skipped_count.
  - Treat skipped entries as unprocessed files, not as clean files.
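
The contract above can be sketched as a small result interpreter. This is a minimal sketch, assuming the CLI's JSON output has already been captured as a string; the helper name interpret_result and the shape of the dictionary it returns are illustrative and not part of HaS. Only error.code, error.message, skipped, skipped[].file, and skipped_count come from the contract.

```python
import json


def interpret_result(raw: str) -> dict:
    """Classify one compact JSON payload printed by the CLI.

    Hypothetical helper: only the error.* and skipped* fields are taken
    from the documented contract; everything else is passed through as-is.
    """
    payload = json.loads(raw)
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")

    # Failure also returns JSON, carrying error.code and error.message.
    if "error" in payload:
        err = payload["error"]
        return {"ok": False, "code": err.get("code"), "message": err.get("message")}

    # Skipped entries are unprocessed files, so they must never count as clean.
    skipped = payload.get("skipped", [])
    return {
        "ok": True,
        "payload": payload,
        "unprocessed": [entry.get("file") for entry in skipped],
        "skipped_count": payload.get("skipped_count", len(skipped)),
    }
```

An agent could report the unprocessed list separately from the findings, so that a skipped file is never presented as a clean one.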

Shared command layout:

```bash
{baseDir}/scripts/has.sh [options]
```

Shared options can be placed before or after the subcommand.

Part 1: `has text`

has text is the plaintext namespace. It supports:

- scan
- hide
- restore

It runs entirely on-device and uses a local llama-server plus the HaS text model when model inference is required.

Core Text Concepts

Semantic tags

Anonymized text uses semantic tags in the format <EntityType[ID].Category.Attribute>, where the same type and ID always refer to the same entity.

This preserves structure better than a flat [REDACTED] token and is the reason restored downstream LLM output can remain usable.

Open-set types

Repeated --type flags are open-set. They are not limited to a fixed catalog. Natural language type names such as "person name", "address", "phone number", or "numeric values (transaction amounts)" are valid.

Public/private distinction

Type wording matters. For example, "personal location" is usually safer than "location" if you want to preserve public places but hide private addresses. Public/private person-name distinctions remain less stable and should not be trusted without verification.

Multilingual support

The text model supports Chinese, English, French, German, Spanish, Portuguese, Japanese, and Korean, including mixed-language text.

Type name language

Match the --type language to the source text language:

- Chinese text → use Chinese type names: --type "人名" --type "电话号码" --type "地址"
- Non-Chinese text (English, French, German, etc.) → use English type names: --type "person name" --type "phone number" --type "address"

Text Runtime Prerequisites

has text auto-starts a local llama-server when needed.

- Default model path: ~/.openclaw/tools/has-anonymizer/models/has_text_model.gguf
- Override model path: HAS_TEXT_MODEL_PATH=/abs/path/to/has_text_model.gguf
- Override parallel cap: HAS_TEXT_MAX_PARALLEL_REQUESTS
- If HuggingFace downloads fail, see Model Download Mirrors below.

Text Usage

```bash
{baseDir}/scripts/has.sh text [--timing] [--verbose] [options]
```

Namespace options:

Option	Description
--timing	Include `elapsed_ms` in the JSON output
--verbose	Emit runtime status and progress messages to stderr

Input methods:

Method	Description
--text <text>	Pass text directly
--file <path>	Read text from a file

Rules:

- --text, --file, and --dir are mutually exclusive.
- Empty --type values are rejected.
- Directory mode only accepts batch output flags.
- Single-file hide requires --mapping-output.
- Single-file restore requires --mapping.
- In text directory mode, skipped can include unprocessed files (binary, encoding, or read errors).
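
The rules above can be mirrored in a client-side pre-check before the CLI is invoked. This is a minimal sketch: build_text_args is a hypothetical helper, not part of HaS, and the CLI itself remains the authority on what is valid. It assumes that --text counts as single-file mode, that exactly one input must be given, and that directory mode rejects the single-file mapping flags.

```python
from typing import List, Optional


def build_text_args(
    command: str,
    types: List[str],
    text: Optional[str] = None,
    file: Optional[str] = None,
    directory: Optional[str] = None,
    mapping_output: Optional[str] = None,
    mapping: Optional[str] = None,
) -> List[str]:
    """Return the arguments that follow has.sh, or raise ValueError early."""
    given = [value is not None for value in (text, file, directory)]
    if sum(given) != 1:
        raise ValueError("pass exactly one of --text, --file, or --dir")
    if command in ("scan", "hide") and not types:
        raise ValueError("--type is required for scan and hide")
    if any(not name.strip() for name in types):
        raise ValueError("empty --type values are rejected")

    batch = directory is not None
    if batch and (mapping or mapping_output):
        # Assumed reading of "directory mode only accepts batch output flags".
        raise ValueError("directory mode does not take single-file mapping flags")
    if command == "hide" and not batch and not mapping_output:
        raise ValueError("single-file hide requires --mapping-output")
    if command == "restore" and not batch and not mapping:
        raise ValueError("single-file restore requires --mapping")

    args = ["text", command]
    for name in types:
        args += ["--type", name]
    if text is not None:
        args += ["--text", text]
    elif file is not None:
        args += ["--file", file]
    else:
        args += ["--dir", directory]
    if mapping_output:
        args += ["--mapping-output", mapping_output]
    if mapping:
        args += ["--mapping", mapping]
    return args
```

Building an argument list rather than a shell string means multi-word type names such as person name need no extra quoting when the list is handed to a process launcher.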

`has text scan`

Finds sensitive entities without replacing them.

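```bash
{baseDir}/scripts/has.sh text scan --type "person name" --type "phone number" --file report.txt
{baseDir}/scripts/has.sh text scan --type "person name" --type "phone number" --dir ./reports/
```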

Parameters:

Parameter	Required	Description
--type	yes	Entity type to scan for; repeat to add more
--text / --file / --dir	one input	Inline text, single file, or batch directory

Output:

- Single-text mode returns {"entities": ...}
- Directory mode returns {"results":[...],"count":N,"summary":{...}}
- Batch output may include skipped and skipped_count

`has text hide`

Replaces sensitive entities with semantic tags.

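```bash
{baseDir}/scripts/has.sh text hide --type "person name" --type "address" --text "John lives in Brooklyn" --mapping-output ./mapping.json
{baseDir}/scripts/has.sh text hide --type "person name" --file note.txt --output ./note.anonymized.txt --mapping-output ./note.mapping.json
{baseDir}/scripts/has.sh text hide --type "person name" --dir ./docs/
```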

Parameters:

Parameter	Required	Description
--type	yes	Entity type to anonymize; repeat to add more
--text / --file / --dir	one input	Inline text, single file, or batch directory

Behavior:

- Single-file mode never emits the mapping table inline.
- Single-file mode returns either:
  - {"text":"...","mapping_output":"/abs/path/to/map.json"}
  - {"output":"/abs/path/to/out.txt","mapping_output":"/abs/path/to/map.json"}
- Batch mode does not accept shared --mapping.
- Mapping files are sensitive assets. Protect them.

`has text restore`

Restores anonymized text using mapping JSON.

CODEBLOCK5

Parameters:

Parameter	Required	Description
--mapping	single-file: yes	Mapping JSON file path
--text / --file / --dir	one input	Inline text, single file, or batch directory

Behavior:

- Single-file mode returns inline text unless --output is provided.
- Directory mode uses per-file mapping JSON files. It does not accept a shared --mapping.
- Directory mode expects mapping files at <mapping-dir>/<filename>.mapping.json (matching the naming convention produced by hide --dir).

Typical Text Workflow

Anonymize text before sending it to a cloud LLM, then restore the answer:

1. hide to produce anonymized text plus mapping
2. send anonymized text to the cloud model with a tag-format explanation (see below)
3. restore the model response with the mapping

For multi-line text, prefer file-based intermediates over shell variables.

Prompting the cloud LLM with anonymized text

When forwarding anonymized text to a cloud LLM, the agent must prepend a brief explanation of the tag format so the model understands and preserves the tags. Include wording equivalent to the following (adjust language to match the conversation):

The text below has been anonymized. Sensitive entities are replaced by tags in the format <EntityType[ID].Category.Attribute>:

- EntityType: the kind of entity (matches the --type value, e.g. person name, address, phone number).
- [ID]: a numeric identifier. The same type + same ID always refers to the same real-world entity (e.g. every <person name[1]> is the same person; <person name[2]> is a different person).
- .Category.Attribute: additional semantic classification of the entity.

Rules:

1. Preserve every tag exactly as-is in your response: do not modify, translate, paraphrase, omit, or expand any tag.
2. When referring to an anonymized entity, reuse the original tag with the correct ID.
3. Do not attempt to guess the real values behind the tags.

Omitting this explanation may cause the cloud model to strip, rewrite, or misinterpret the tags, which will break the restore step.
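
Because altered tags break the restore step, an agent may want to check the cloud model's response before restoring. The following is a minimal sketch under stated assumptions: the regular expression is inferred from the <EntityType[ID].Category.Attribute> description and treats the part after [ID] as optional, and extract_tags and unknown_tags are hypothetical helpers rather than HaS functions.

```python
import re

# Assumed pattern: "<", an entity type, a numeric "[ID]", an optional suffix, ">".
TAG_PATTERN = re.compile(r"<[^<>\[\]]+\[\d+\][^<>]*>")


def extract_tags(text: str) -> set:
    """Collect every tag-shaped substring in the text."""
    return set(TAG_PATTERN.findall(text))


def unknown_tags(anonymized: str, response: str) -> list:
    """Tags in the model response that never appeared in the anonymized input.

    A non-empty result suggests the model rewrote, translated, or invented a
    tag, so the mapping may no longer match it.
    """
    return sorted(extract_tags(response) - extract_tags(anonymized))
```

A response that simply omits some tags is not flagged, since a model answer does not have to mention every entity; only tags the mapping cannot know about are reported.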

Model Download Mirrors

If HuggingFace downloads fail, use these ModelScope mirrors:

- text model: INLINECODE135
- image model: INLINECODE136

Part 2: `has image`

has image is the image namespace. It supports:

- scan
- hide
- categories

It loads the YOLO segmentation model directly and does not require llama-server.

Image Usage

CODEBLOCK6

Namespace options:

Option	Applies to	Description
--timing	all image commands	Include `elapsed_ms` in the JSON output
INLINECODE145	scan, hide	Override the image model path

Image Privacy Categories

Common categories include biometric_face, id_card, passport, license_plate, qr_code, mobile_screen, and paper.

Use has image categories when you need the full catalog of 21 supported classes.

--type accepts:

- English names
- Chinese names
- numeric IDs
- unique partial matches such as INLINECODE157

Rules:

- Empty --type values are rejected.
- Ambiguous partial matches fail fast.
- Omit --type to scan or mask all supported categories.
- In image directory mode, skipped can include unprocessed files.

`has image scan`

Finds privacy regions without modifying the image.

CODEBLOCK7

Parameters:

Parameter	Required	Description
INLINECODE162 / INLINECODE163	one input	Single image or batch directory
INLINECODE164

Output:

- Single-image mode returns detections and INLINECODE169
- Directory mode returns results, count, summary, and optional skipped

`has image hide`

Detects and masks privacy regions in images.

CODEBLOCK8

Parameters:

Parameter	Required	Description
INLINECODE175 / INLINECODE176	one input	Single image or batch directory
INLINECODE177

Behavior:

- Refuses to overwrite the source image.
- Directory mode accepts --output-dir, not --output.
- For qr_code and barcode detections with --method mosaic, the block size is automatically raised to max(strength, bbox_short_side // 10, 20) to prevent the encoding from surviving pixelation. After masking, a lightweight verification confirms the code is no longer machine-readable; if it is, the strength is escalated further (up to a fill fallback). Each affected detection includes an effective_strength field in the output.
- A cv2-based fallback supplements YOLO detection for QR codes and barcodes. When YOLO misses a code (e.g. large codes on plain backgrounds), cv2.QRCodeDetector and cv2.barcode.BarcodeDetector provide additional coverage. When YOLO misclassifies a code region as a different category (e.g. monitor_screen), cv2 corrects the category before --type filtering, so --type qr_code catches all QR codes regardless of YOLO's label. Corrected detections include a "corrected_from" field; new detections include "cv2_fallback": true.
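
The starting block size for qr_code and barcode mosaics can be worked through directly from the formula max(strength, bbox_short_side // 10, 20). This is a minimal sketch: the function and parameter names are illustrative, and the later verification and escalation steps are not modeled because their details are not specified here.

```python
def initial_mosaic_block_size(strength: int, bbox_width: int, bbox_height: int) -> int:
    """Starting block size for qr_code / barcode mosaics, per the documented formula."""
    bbox_short_side = min(bbox_width, bbox_height)
    return max(strength, bbox_short_side // 10, 20)


# A 400x300 detection with strength 15: 300 // 10 = 30 is the largest term.
print(initial_mosaic_block_size(15, 400, 300))  # 30
# A small 120x90 detection with strength 8: the floor of 20 applies.
print(initial_mosaic_block_size(8, 120, 90))    # 20
```

The value actually applied may be higher after escalation, which is why effective_strength in the output is the figure to report.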

`has image categories`

Lists all supported image privacy categories.

CODEBLOCK9

Behavior:

- Returns INLINECODE208
- Supports INLINECODE209

Suggested Combined Scan

For a mixed workspace:

1. run has text scan ... --dir <dir> for plaintext
2. run has image scan --dir <dir> for images
3. merge the two JSON results into one privacy report

If the user wants masking after that, use hide on the specific files or directories you already identified.
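
The merge step can be sketched as follows. This is a minimal sketch, assuming both directory scans have already been parsed into dictionaries with the documented results, count, summary, and optional skipped fields. The merged report layout and the name merge_scan_reports are hypothetical, count is assumed to be the per-namespace file count, and summary is treated as opaque because its internal shape is not described here.

```python
def merge_scan_reports(text_report: dict, image_report: dict) -> dict:
    """Combine one text directory scan and one image directory scan into one report."""
    sections = {"text": text_report, "image": image_report}
    merged = {"file_counts": {}, "summaries": {}, "unprocessed": []}
    for kind, report in sections.items():
        merged["file_counts"][kind] = report.get("count", 0)
        merged["summaries"][kind] = report.get("summary", {})
        # Skipped files were not scanned, so they are listed separately
        # instead of being treated as clean.
        for entry in report.get("skipped", []):
            merged["unprocessed"].append({"kind": kind, "file": entry.get("file")})
    return merged
```

Keeping the two summaries side by side, rather than flattening them, avoids mixing text entity types with image categories in the final report.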
