CSV Cleanroom
Purpose
Profile messy CSV files, standardize columns, detect data quality issues, and produce a reproducible cleanup plan.
Trigger phrases
- 清洗 CSV
- profile this dataset
- 数据质量检查
- 列名规范化
- build a cleanup plan
Ask for these inputs
- CSV file or schema
- target schema if available
- known bad values
- dedupe rules
- date/currency locale
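The date/currency locale matters because the same string can parse to different values. A minimal sketch of the ambiguity, assuming slash-separated dates and a single decimal separator; the helper names `parse_date` and `parse_amount` are hypothetical and not part of the bundled script:

```python
from datetime import date, datetime
from decimal import Decimal


def parse_date(value: str, day_first: bool) -> date:
    """Parse a slash-separated date; day_first picks DD/MM/YYYY over MM/DD/YYYY."""
    fmt = "%d/%m/%Y" if day_first else "%m/%d/%Y"
    return datetime.strptime(value.strip(), fmt).date()


def parse_amount(value: str, decimal_sep: str = ".") -> Decimal:
    """Parse a plain numeric amount; the thousands separator is assumed to be
    the opposite of decimal_sep. Currency symbols are not handled here."""
    thousands_sep = "," if decimal_sep == "." else "."
    cleaned = value.strip().replace(thousands_sep, "").replace(decimal_sep, ".")
    return Decimal(cleaned)


# The same text yields two different dates depending on the locale assumption.
as_day_first = parse_date("03/04/2024", day_first=True)     # 3 April 2024
as_month_first = parse_date("03/04/2024", day_first=False)  # 4 March 2024
european = parse_amount("1.234,56", decimal_sep=",")
```

Asking for the locale up front avoids silently picking one interpretation.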
Workflow
1. Profile the CSV: row count, nulls, duplicates, type mismatches, and outliers.
2. Normalize headers and map to the target schema.
3. Generate a step-by-step cleanup plan and optional transformed output.
4. Document irreversible operations before applying them.
5. Return a quality score and remediation checklist.
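The first two workflow steps can be sketched with the standard library alone. This is an illustrative sketch, not the contents of the bundled script: `normalize_header`, `profile_csv`, and the set of null markers are assumptions, and type-mismatch and outlier checks are left out for brevity.

```python
import csv
import io
import re
from collections import Counter

# Assumed null markers; real data may need a user-supplied list of bad values.
NULL_MARKERS = {"", "na", "n/a", "null"}


def normalize_header(name: str) -> str:
    """Lowercase, trim, and collapse runs of non-alphanumerics into one underscore."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


def profile_csv(text: str) -> dict:
    """Return normalized headers, row count, per-column null counts,
    and the number of exact-duplicate rows."""
    reader = csv.reader(io.StringIO(text))
    headers = [normalize_header(h) for h in next(reader)]
    rows = [tuple(cell.strip() for cell in row) for row in reader]

    nulls = {h: 0 for h in headers}
    for row in rows:
        for header, cell in zip(headers, row):
            if cell.lower() in NULL_MARKERS:
                nulls[header] += 1

    # Each repeat beyond the first occurrence counts as one duplicate.
    duplicates = sum(n - 1 for n in Counter(rows).values() if n > 1)
    return {
        "headers": headers,
        "row_count": len(rows),
        "nulls": nulls,
        "duplicate_rows": duplicates,
    }


sample = "Order ID, Unit Price ($)\n1,9.99\n2,\n1,9.99\n"
report = profile_csv(sample)
```

For the sample above, the report lists the headers as `order_id` and `unit_price`, three rows, one null in `unit_price`, and one duplicate row.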
Output contract
- profile report
- normalized schema
- cleanup plan
- quality scorecard
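This document does not define how the quality score is computed. One possible scorecard, shown purely as an assumption, averages completeness (share of non-null cells) and uniqueness (share of non-duplicate rows) on a 0 to 100 scale; the function name `quality_score` and the equal weighting are hypothetical.

```python
def quality_score(row_count: int, column_count: int,
                  null_cells: int, duplicate_rows: int) -> float:
    """Average of completeness and uniqueness, scaled to 0-100.
    The equal weighting is an illustrative choice, not a standard."""
    if row_count == 0 or column_count == 0:
        return 0.0  # nothing to score
    completeness = 1 - null_cells / (row_count * column_count)
    uniqueness = 1 - duplicate_rows / row_count
    return round(100 * (completeness + uniqueness) / 2, 1)


# 3 rows x 2 columns, 1 null cell, 1 duplicate row.
score = quality_score(row_count=3, column_count=2, null_cells=1, duplicate_rows=1)
```

Whatever formula is used, stating it alongside the score keeps the scorecard reproducible.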
Files in this skill
- Script: {baseDir}/scripts/csv_cleanroom.py
- Resource: {baseDir}/resources/data_quality_checklist.md
Operating rules
- Be concrete and action-oriented.
- Prefer preview / draft / simulation mode before destructive changes.
- If information is missing, ask only for the minimum needed to proceed.
- Never fabricate metrics, legal certainty, receipts, credentials, or evidence.
- Keep assumptions explicit.
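The preview-first rule can be sketched as a plan that is computed and reported before anything is dropped. The names `plan_dedupe` and `apply_plan` and the shape of the plan dictionary are hypothetical, chosen only to illustrate the pattern:

```python
def plan_dedupe(rows: list, key: str) -> dict:
    """Describe which row indexes a dedupe on `key` would drop; changes nothing."""
    seen = set()
    would_drop = []
    for index, row in enumerate(rows):
        value = row.get(key)
        if value in seen:
            would_drop.append(index)  # later repeats are the ones dropped
        else:
            seen.add(value)
    return {
        "operation": "drop_duplicates",
        "key": key,
        "would_drop": would_drop,
        "irreversible": True,  # flagged so it is documented before applying
    }


def apply_plan(rows: list, plan: dict, confirm: bool = False) -> list:
    """Return a new list; rows are only removed when confirm is True."""
    if not confirm:
        return list(rows)  # preview mode: output equals input
    skip = set(plan["would_drop"])
    return [row for index, row in enumerate(rows) if index not in skip]


rows = [{"id": "1"}, {"id": "2"}, {"id": "1"}]
plan = plan_dedupe(rows, key="id")
preview = apply_plan(rows, plan)                # unchanged copy
applied = apply_plan(rows, plan, confirm=True)  # duplicate removed
```

Neither call mutates `rows`, so the original data stays available for comparison.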
Suggested prompts
- 清洗 CSV
- profile this dataset
- 数据质量检查
Use of script and resources
Use the bundled script when it helps the user produce a structured file, manifest, CSV, or first-pass draft.
Use the resource file as the default schema, checklist, or preset when the user does not provide one.
Boundaries
- This skill supports planning, structuring, and first-pass artifacts.
- It should not claim that files were modified, messages were sent, or legal/financial decisions were finalized unless the user actually performed those actions.
Compatibility notes
- Directory-based AgentSkills/OpenClaw skill.
- Runtime dependency declared through metadata.openclaw.requires.
- Helper script is local and auditable: scripts/csv_cleanroom.py.
- Bundled resource is local and referenced by the instructions: resources/data_quality_checklist.md.
CSV 清洁室
目的
分析杂乱的CSV文件,标准化列名,检测数据质量问题,并生成可复现的清理方案。
触发短语
- 清洗 CSV
- 分析此数据集
- 数据质量检查
- 列名规范化
- 构建清理方案
需要提供的输入
- CSV文件或结构定义
- 目标结构(如有)
- 已知的无效值
- 去重规则
- 日期/货币区域设置
工作流程
1. 分析CSV:行数、空值、重复项、类型不匹配和异常值。
2. 规范化表头并映射到目标结构。
3. 生成逐步清理方案及可选的转换后输出。
4. 在应用不可逆操作前进行文档记录。
5. 返回质量评分和整改清单。
输出约定
- 分析报告
- 规范化结构
- 清理方案
- 质量评分卡
本技能包含的文件
- 脚本:{baseDir}/scripts/csv_cleanroom.py
- 资源:{baseDir}/resources/data_quality_checklist.md
操作规则
- 具体且以行动为导向。
- 在破坏性更改前优先使用预览/草稿/模拟模式。
- 若信息缺失,仅询问推进所需的最少信息。
- 绝不捏造指标、法律确定性、收据、凭证或证据。
- 明确说明假设条件。
建议提示
- 清洗 CSV
- 分析此数据集
- 数据质量检查
脚本和资源的使用
当有助于用户生成结构化文件、清单、CSV或初稿时,使用捆绑脚本。
当用户未提供结构、清单或预设时,使用资源文件作为默认的结构、清单或预设。
边界
- 本技能支持规划、结构化和初稿产出。
- 除非用户实际执行了相关操作,否则不应声称文件已被修改、消息已发送或法律/财务决策已最终确定。
兼容性说明
- 基于目录的AgentSkills/OpenClaw技能。
- 通过metadata.openclaw.requires声明运行时依赖。
- 辅助脚本为本地可审计文件:scripts/csv_cleanroom.py。
- 捆绑资源为本地文件,由指令引用:resources/data_quality_checklist.md。