Sensitive Info Protection

Overview

A general-purpose skill that scans conversation content for sensitive data in real time, reports what it detects, and lets the user decide how to handle it. Configurable detection rules help prevent accidental exposure of personal information, authentication credentials, and commercial secrets.

Core Capabilities

1. Sensitive Information Detection

- Real-time scanning of user input and output content
- Detection of multiple sensitive information types by default
- Support for user-defined detection rules based on regex patterns
- Priority-based rule matching
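
The following is a minimal sketch of how priority-based regex matching might work, not the skill's actual engine. The `Rule` class, the `scan` function, and the convention that a larger `priority` wins on overlapping matches are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    # Hypothetical rule shape, not the skill's DetectionRule model.
    name: str
    pattern: str
    priority: int  # assumption: a larger number wins when matches overlap


def scan(text, rules):
    """Return (name, start, end) tuples, resolving overlaps by priority."""
    accepted = []
    # Visit rules from highest to lowest priority so earlier hits take precedence.
    for rule in sorted(rules, key=lambda r: r.priority, reverse=True):
        for m in re.finditer(rule.pattern, text):
            start, end = m.start(), m.end()
            overlaps = any(start < e and s < end for _, s, e in accepted)
            if not overlaps:
                accepted.append((rule.name, start, end))
    # Report hits in the order they appear in the text.
    return sorted(accepted, key=lambda hit: hit[1])


rules = [
    Rule("generic_token", r"[A-Za-z0-9_]{16,}", priority=1),
    Rule("api_key", r"sk_[A-Za-z0-9]{16}", priority=10),
]
hits = scan("key=sk_abcdefgh12345678 sent", rules)
print(hits)
```

Both patterns match the same span here, and only the higher-priority `api_key` hit is kept.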

2. Configuration Management

- Built-in default sensitive type library with common patterns
- Support for adding, removing, enabling, and disabling custom rules
- Rule priority adjustment for conflict resolution
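
As an illustration of these management operations, the sketch below keeps rules in a small in-memory registry. `RuleRegistry` and its method names are hypothetical and are not taken from the skill's code.

```python
class RuleRegistry:
    """Hypothetical in-memory rule store; names and methods are illustrative."""

    def __init__(self):
        # name -> {"pattern": str, "priority": int, "enabled": bool}
        self._rules = {}

    def add(self, name, pattern, priority=0):
        self._rules[name] = {"pattern": pattern, "priority": priority, "enabled": True}

    def remove(self, name):
        self._rules.pop(name, None)

    def set_enabled(self, name, enabled):
        self._rules[name]["enabled"] = enabled

    def set_priority(self, name, priority):
        self._rules[name]["priority"] = priority

    def active(self):
        """Names of enabled rules, highest priority first."""
        enabled = [(n, r) for n, r in self._rules.items() if r["enabled"]]
        return [n for n, r in sorted(enabled, key=lambda item: -item[1]["priority"])]


registry = RuleRegistry()
registry.add("email", r"[\w.+-]+@[\w-]+\.[\w.]+", priority=5)
registry.add("password", r"password\s*=\s*\S+", priority=1)
registry.add("custom_secret", r"MY_SECRET=\w+", priority=3)
registry.set_enabled("custom_secret", False)  # disabled rules are skipped
registry.set_priority("password", 9)          # now outranks "email"
order = registry.active()
print(order)
```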

3. External Data Integration

- Import sensitive keyword/pattern lists from external data sources
- Support for dynamic rule updates
- JSON/YAML configuration format
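
A rule file in JSON could be loaded roughly as follows. The schema shown here (a top-level `rules` array with `name`, `pattern`, `sensitivity`, and `enabled` keys) is an assumption for illustration; the actual format is described in references/configuration.md.

```python
import json
import re

# Hypothetical configuration content; the real schema may differ.
CONFIG_TEXT = r"""
{
  "rules": [
    {"name": "custom_secret", "pattern": "MY_SECRET=\\w+",
     "sensitivity": "high", "enabled": true},
    {"name": "internal_host", "pattern": "[a-z]+\\.corp\\.example",
     "sensitivity": "low", "enabled": false}
  ]
}
"""


def load_rules(config_text):
    """Parse the JSON text and compile the pattern of every enabled rule."""
    config = json.loads(config_text)
    rules = []
    for entry in config.get("rules", []):
        if not entry.get("enabled", True):
            continue  # skip rules that are switched off
        compiled = re.compile(entry["pattern"])
        rules.append((entry["name"], compiled, entry.get("sensitivity", "medium")))
    return rules


loaded = load_rules(CONFIG_TEXT)
print([name for name, _, _ in loaded])
```

Calling `load_rules` again with fresh text is one simple way to pick up rule updates without restarting.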

4. Interactive Processing

- Standardized output format for detection results
- Clear operation options for user decision-making
- Support for one-click desensitization or manual editing
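
One-click desensitization can be pictured as masking each detected span. The sketch below is illustrative only: the `mask` function, its `keep` and `fill` parameters, and the choice to preserve two characters at each end are assumptions, and the spans are assumed not to overlap.

```python
def mask(text, spans, keep=2, fill="*"):
    """Mask each (start, end) span, keeping `keep` characters at both ends."""
    out = []
    cursor = 0
    for start, end in sorted(spans):
        out.append(text[cursor:start])  # untouched text before the span
        secret = text[start:end]
        if len(secret) <= keep * 2:
            # Too short to keep any characters: mask everything.
            out.append(fill * len(secret))
        else:
            hidden = fill * (len(secret) - keep * 2)
            out.append(secret[:keep] + hidden + secret[-keep:])
        cursor = end
    out.append(text[cursor:])  # remaining text after the last span
    return "".join(out)


masked = mask("call 13812345678 now", [(5, 16)])
print(masked)  # call 13*******78 now
```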

Default Sensitive Types

Built-in detection for the following types:

- api_key - API keys, access tokens, authentication credentials
- credit_card - Credit card numbers
- id_card - National ID card numbers (Chinese)
- phone - Mobile phone numbers (Chinese)
- email - Email addresses
- bank_card - Bank card numbers
- password - Password patterns in code or logs
- secret - Commercial secrets, confidential information markers
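
To give a sense of what rules for two of these types might look like, here is a hedged sketch. The patterns are common approximations, not the contents of the skill's built-in rule file, and the Luhn checksum is an extra validation step added here to reduce false positives on card numbers.

```python
import re

# Illustrative patterns only; the skill's built-in rules may differ.
PHONE_CN = re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)")        # 11-digit mainland mobile
CARD_CANDIDATE = re.compile(r"(?<!\d)\d{13,19}(?!\d)")    # digit run of card length


def luhn_ok(number):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_cards(text):
    """Digit runs that look like card numbers and pass the Luhn check."""
    return [m.group() for m in CARD_CANDIDATE.finditer(text) if luhn_ok(m.group())]


def find_phones(text):
    return [m.group() for m in PHONE_CN.finditer(text)]


cards = find_cards("pay 4111111111111111 or 4111111111111112")
phones = find_phones("tel 13812345678, ref 213812345678")
print(cards, phones)
```

The lookarounds keep the phone pattern from matching inside a longer digit run, which is why the 12-digit reference number is ignored.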

Usage

Scanning Content for Sensitive Information

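```python
from scripts.detector import SensitiveDetector

detector = SensitiveDetector()
# text_content is the string to scan
results = detector.scan(text_content)

if results:
    # Print the detection results in the standard format
    detector.print_results(results)
    # Wait for the user to decide whether to continue
else:
    # No sensitive information detected
    pass
```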

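Adding a Custom Rule

```python
from scripts.models import DetectionRule

new_rule = DetectionRule(
    name="custom_secret",
    pattern=r"MY_SECRET=\w+",
    sensitivity="high",
    description="Custom secret pattern",
)
detector.add_rule(new_rule)
```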

Loading Configuration from File

```python
detector.load_config("path/to/config.json")
```

Output Format

When sensitive information is detected, the following format is used:

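Detection Results

- Sensitive type: [type]
- Position: [start:end]
- Original text: [original content]
- Sensitivity: [high/medium/low]

Action Options

1. Confirm and allow
2. Edit before sending
3. Cancel sending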
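
As a rough illustration, a report that lists each hit's type, position, original text, and sensitivity, followed by numbered action options, could be rendered as shown below. The `Detection` record, its field names, and the English labels are assumptions, not the skill's actual data model.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    # Hypothetical result record; field names are assumptions.
    type: str
    start: int
    end: int
    original: str
    sensitivity: str


OPTIONS = ["Confirm and allow", "Edit before sending", "Cancel sending"]


def render(detections):
    """Build the plain-text report: one entry per hit, then the options."""
    lines = ["Detection Results", ""]
    for d in detections:
        lines += [
            f"- Sensitive type: {d.type}",
            f"- Position: [{d.start}:{d.end}]",
            f"- Original text: {d.original}",
            f"- Sensitivity: {d.sensitivity}",
            "",
        ]
    lines += ["Action Options", ""]
    lines += [f"{i}. {label}" for i, label in enumerate(OPTIONS, start=1)]
    return "\n".join(lines)


report = render([Detection("phone", 5, 16, "13812345678", "medium")])
print(report)
```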

Resources

scripts/

- detector.py - Main detection engine class
- models.py - Data models for detection rules and results
- cli.py - Command-line interface
- default_rules.json - Built-in default detection rules

references/

- configuration.md - Detailed configuration guide
- api.md - API documentation for integration

assets/

- default_config.json - Default configuration template

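Sensitive Info Protection

Overview

A general-purpose sensitive information protection skill that scans conversation content for sensitive data in real time, detects it, and handles it interactively. Configurable detection rules help prevent accidental leaks of personal information, authentication credentials, and commercial secrets.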

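Core Capabilities

1. Sensitive Information Detection

- Real-time scanning of user input and output content
- Detection of multiple sensitive information types by default
- Support for user-defined detection rules based on regex patterns
- Priority-based rule matching

2. Configuration Management

- Built-in default sensitive type library with common patterns
- Support for adding, removing, enabling, and disabling custom rules
- Rule priority adjustment for conflict resolution

3. External Data Integration

- Import sensitive keyword/pattern lists from external data sources
- Support for dynamic rule updates
- JSON/YAML configuration format

4. Interactive Processing

- Standardized output format for detection results
- Clear operation options for user decision-making
- Support for one-click desensitization or manual editing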

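Default Sensitive Types

Built-in detection for the following types:

- api_key - API keys, access tokens, authentication credentials
- credit_card - Credit card numbers
- id_card - National ID card numbers (Chinese)
- phone - Mobile phone numbers (Chinese)
- email - Email addresses
- bank_card - Bank card numbers
- password - Password patterns in code or logs
- secret - Commercial secrets, confidential information markers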

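Usage

Scanning Content for Sensitive Information

```python
from scripts.detector import SensitiveDetector

detector = SensitiveDetector()
# text_content is the string to scan
results = detector.scan(text_content)

if results:
    # Print the detection results in the standard format
    detector.print_results(results)
    # Wait for the user to decide whether to continue
else:
    # No sensitive information detected
    pass
```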

Adding a Custom Rule

```python
from scripts.models import DetectionRule

new_rule = DetectionRule(
    name="custom_secret",
    pattern=r"MY_SECRET=\w+",
    sensitivity="high",
    description="Custom secret pattern",
)
detector.add_rule(new_rule)
```

Loading Configuration from File

```python
detector.load_config("path/to/config.json")
```

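Output Format

When sensitive information is detected, the following format is used:

Detection Results

- Sensitive type: [type]
- Position: [start:end]
- Original text: [original content]
- Sensitivity: [high/medium/low]

Action Options

1. Confirm and allow
2. Edit before sending
3. Cancel sending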

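Resources

scripts/

- detector.py - Main detection engine class
- models.py - Data models for detection rules and results
- cli.py - Command-line interface
- default_rules.json - Built-in default detection rules

references/

- configuration.md - Detailed configuration guide
- api.md - API documentation for integration

assets/

- default_config.json - Default configuration template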
