parallel-enrichment

# Parallel Enrichment

Bulk data enrichment that adds web-sourced fields to lists of companies, people, or products. Describe what you want in natural language.

## When to Use

Trigger this skill when the user asks for:

- "enrich this list with...", "add CEO names to...", "find funding for these companies..."
- "look up contact info for...", "get LinkedIn profiles for..."
- Bulk data operations on CSV files or lists
- Adding web-sourced columns to existing datasets
- Lead enrichment, company research, product comparison

## Quick Start

```bash
# Inline data
parallel-cli enrich run \
  --data '[{"company": "Google"}, {"company": "Microsoft"}]' \
  --intent "CEO name and founding year" \
  --target output.csv

# CSV file
parallel-cli enrich run \
  --source-type csv --source input.csv \
  --target output.csv \
  --intent "CEO name and founding year"
```

## CLI Reference

### Basic Usage

```bash
parallel-cli enrich run [options]
```

**Note:** There is no `--json` flag for enrich. Results are written to the target file.

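
Because results are written only to the target file, a script usually has to inspect that file to report how many rows came back. A minimal sketch, assuming the target CSV has a single header row and no quoted field contains an embedded newline; `count_data_rows` is a hypothetical helper, not part of `parallel-cli`:

```bash
# Hypothetical helper (not part of parallel-cli): print the number of data
# rows in a CSV file, excluding the header row.
# Assumption: no quoted field contains an embedded newline.
count_data_rows() {
  # Count every line after the header; "n + 0" prints 0 when there are none.
  awk 'NR > 1 { n++ } END { print n + 0 }' "$1"
}

# Usage after a run that wrote to output.csv:
#   count_data_rows output.csv
```

For CSVs with multi-line quoted fields, a real CSV parser would be a safer choice than line counting.
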
### Common Flags

| Flag | Description |
|------|-------------|
| `--data "<json>"` | Inline JSON array of records |
| `--source-type csv` | Source file type |
| `--source <path>` | Input CSV file path |
| `--target <path>` | Output CSV file path |
| `--source-columns "<json>"` | Describe input columns |
| `--enriched-columns "<json>"` | Specify output columns |
| `--intent "<description>"` | Natural language description of what to find |
| `--processor <tier>` | Processing tier (see table below) |

### Processor Tiers

| Processor | Use Case |
|-----------|----------|
| `lite-fast` | Simple lookups |
| `base-fast` | Basic enrichment |
| `core-fast` | Standard enrichment |
| `pro-fast` | Deep enrichment (default) |
| `ultra-fast` | Complex multi-source enrichment |

### Examples

**Inline data enrichment:**

```bash
parallel-cli enrich run \
  --data '[{"company": "Stripe"}, {"company": "Square"}, {"company": "Adyen"}]' \
  --intent "CEO name, headquarters city, and latest funding round" \
  --target ./companies-enriched.csv
```

**CSV file enrichment:**

```bash
parallel-cli enrich run \
  --source-type csv \
  --source ./leads.csv \
  --target ./leads-enriched.csv \
  --source-columns '[{"name": "company_name", "description": "Company name"}]' \
  --intent "Find CEO name, company size, and LinkedIn company page URL"
```

**With explicit output columns:**

```bash
parallel-cli enrich run \
  --data '[{"name": "Sam Altman"}, {"name": "Satya Nadella"}]' \
  --source-columns '[{"name": "name", "description": "Person full name"}]' \
  --enriched-columns '[
    {"name": "current_company", "description": "Current company/employer"},
    {"name": "title", "description": "Current job title"},
    {"name": "twitter", "description": "Twitter/X handle"}
  ]' \
  --target ./people-enriched.csv
```

**Using AI to suggest columns:**

```bash
# First, get AI suggestions
parallel-cli enrich suggest \
  --source-type csv \
  --source ./companies.csv \
  --intent "competitor analysis data"

# Then run with suggested columns
parallel-cli enrich run \
  --source-type csv \
  --source ./companies.csv \
  --target ./companies-analysis.csv \
  --intent "competitor analysis: market position, key products, recent news"
```

## Best-Practice Prompting

### Intent Description

Write 1-2 sentences describing:

- What specific fields you want to add
- Context about the data (B2B companies, tech startups, etc.)
- Any constraints (recent data, specific sources)

**Good:**

```
--intent "Find CEO name, total funding raised, and number of employees for B2B SaaS companies"
```

**Poor:**

```
--intent "Find stuff about these companies"
```

### Source Column Descriptions

When using `--source-columns`, provide context:

```json
[
  {"name": "company", "description": "Company name, may include Inc/LLC suffix"},
  {"name": "website", "description": "Company website URL, may be partial"}
]
```

## Response Format

The CLI outputs:

- A monitoring URL to track progress
- Status updates as rows are processed
- Final output written to target CSV

The target CSV contains:

- All original columns from the source
- New enriched columns as specified
- A `_parallel_status` column indicating success/failure per row

## Output Handling

After enrichment completes:

1. Report the number of rows enriched
2. Preview the first few rows: `head -6 output.csv`
3. Share the full path to the output file
4. Note any rows that failed enrichment

## Configuration File

For complex enrichments, use a YAML config:

```yaml
# enrich-config.yaml
source:
  type: csv
  path: ./input.csv
  columns:
    - name: company_name
      description: "Company legal name"
    - name: website
      description: "Company website URL"

target:
  type: csv
  path: ./output.csv

enriched_columns:
  - name: ceo_name
    description: "Current CEO full name"
  - name: employee_count
    description: "Approximate number of employees"
  - name: funding_total
    description: "Total funding raised in USD"

processor: pro-fast
```

Then run:

```bash
parallel-cli enrich run enrich-config.yaml
```

## Running Out of Context?

For large enrichments, save results and use `sessions_spawn`:

```bash
parallel-cli enrich run --source-type csv --source input.csv --target /tmp/enriched-<topic>.csv --intent "..."
```

Then spawn a sub-agent:

```json
{
  "tool": "sessions_spawn",
  "task": "Read /tmp/enriched-<topic>.csv and summarize the results. Report row count, success rate, and preview first 5 rows.",
  "label": "enrich-summary"
}
```

## Error Handling

| Exit Code | Meaning |
|-----------|---------|
| 0 | Success |
| 1 | Unexpected error (network, parse) |
| 2 | Invalid arguments |
| 3 | API error (non-2xx) |

Common issues:

- **Row failures:** Check `_parallel_status` column in output
- **Timeout:** Use smaller batches or lower processor tier
- **Rate limits:** Add delays between large enrichments

## Prerequisites

1. Get an API key at [parallel.ai](https://parallel.ai)
2. Install the CLI:

   ```bash
   curl -fsSL https://parallel.ai/install.sh | bash
   export PARALLEL_API_KEY=your-key
   ```

## References

- [API Docs](https://docs.parallel.ai)
- [Enrichment API Reference](https://docs.parallel.ai/api-reference/enrichment)
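
The exit codes listed under Error Handling can be turned into readable messages in a wrapper script. A minimal sketch; `describe_exit_code` is a hypothetical helper, not part of `parallel-cli`, and its messages only restate that table:

```bash
# Hypothetical helper (not part of parallel-cli): map an exit code from
# `parallel-cli enrich run` to the meaning given in the Error Handling table.
describe_exit_code() {
  case "$1" in
    0) echo "success" ;;
    1) echo "unexpected error (network, parse)" ;;
    2) echo "invalid arguments" ;;
    3) echo "API error (non-2xx)" ;;
    *) echo "unknown exit code $1" ;;  # not listed in the table
  esac
}

# Usage, with flags taken from the examples in this document:
#   parallel-cli enrich run --source-type csv --source input.csv \
#     --target output.csv --intent "CEO name and founding year"
#   describe_exit_code "$?"
```

An exit code of 0 does not rule out per-row failures, so the `_parallel_status` column is still worth checking afterwards.
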
