CSVBrain

Version: 1.0.3
Author: @TheShadowRose
License: MIT

Description

Load CSV files and ask questions in plain English. AI-powered natural language queries via Anthropic, OpenAI, or local Ollama. No SQL required.

CSVBrain parses CSV files (comma, semicolon, or tab-delimited), profiles your data automatically, and lets you query it with structured filters or plain English questions powered by AI.

Features

- CSV Loading: Parse CSV files with automatic delimiter detection (comma, semicolon, tab). Handles quoted fields and escaped quotes.
- Data Profiling: Instant statistics for every column: count, missing values, unique values, and min/max/avg for numeric columns.
- Structured Queries: Filter, sort, limit, and aggregate your data programmatically.
- Natural Language Ask: Ask questions about your data in plain English. The AI receives your dataset's structure, column types, and statistics, and answers with specific numbers.
- Multi-Provider AI: Route questions to Anthropic (Claude), OpenAI (GPT), or local Ollama models by changing the model prefix.
- Zero Dependencies: Pure Node.js. No npm packages required. HTTP calls use the built-in https/http modules.
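
To make "quoted fields and escaped quotes" concrete, here is a minimal sketch of how one CSV line can be split while honoring quotes. `parseLine` is a hypothetical helper written for illustration; it is not taken from CSVBrain's source and the library's parser may work differently.

```javascript
// Hypothetical helper (not CSVBrain's actual parser): split one CSV line,
// treating "..." as a quoted field and "" inside quotes as a literal quote.
function parseLine(line, delimiter = ',') {
  const fields = [];
  let current = '';
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        current += '"'; // escaped quote inside a quoted field
        i++;            // skip the second quote of the pair
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        current += ch;    // delimiters inside quotes are plain text
      }
    } else if (ch === '"') {
      inQuotes = true;    // opening quote
    } else if (ch === delimiter) {
      fields.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  fields.push(current); // last field has no trailing delimiter
  return fields;
}

console.log(parseLine('1,"Smith, Jane","She said ""hi"""'));
// [ '1', 'Smith, Jane', 'She said "hi"' ]
```

This sketch handles a single line only; quoted fields that span multiple lines would need the parser to carry state across line breaks.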

Installation

Copy src/csv-brain.js into your project.

```js
const { CSVBrain } = require('./src/csv-brain');
```

Quick Start

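```js
const { CSVBrain } = require('./src/csv-brain');

const brain = new CSVBrain();
const info = brain.load('sales.csv');
console.log(info);
// { rows: 1200, columns: 8, types: { month: 'text', revenue: 'number', ... } }

// Profile your data
const stats = brain.profile();
console.log(stats.revenue);
// { type: 'number', count: 1200, missing: 0, unique: 987, min: 12.5, max: 94200, avg: 8450.32 }

// Ask a question in plain English (run inside an async function)
const result = await brain.ask('Which month had the highest revenue?');
console.log(result.answer);
// "Based on the data, March had the highest total revenue at $94,200."
console.log(result.model);
// "anthropic/claude-haiku-4-5"
```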

API

`new CSVBrain(options?)`

Create a new instance.

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `model` | `string` | `'anthropic/claude-haiku-4-5'` | Default AI model for `ask()` |

```js
const brain = new CSVBrain({ model: 'openai/gpt-4o-mini' });
```

`load(filePath, options?)`

Load a CSV file synchronously.

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `delimiter` | `string` | auto-detect | Force a specific delimiter |

Returns: `{ rows: number, columns: number, types: object }`

```js
const info = brain.load('data.csv');
const info2 = brain.load('data.tsv', { delimiter: '\t' });
```

`profile()`

Get statistical profile of all columns.

Returns: Object keyed by column name, each with type, count, missing, unique, and (for numeric columns) min, max, avg.

```js
const stats = brain.profile();
console.log(stats);
```
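
For readers wondering how such per-column statistics can be derived, the following is a rough sketch. `profileColumn` is a hypothetical function, and the choices it makes (empty strings count as missing, `count` means non-missing values, a column is numeric only if every present value parses as a finite number) are assumptions rather than CSVBrain's documented behavior.

```javascript
// Hypothetical sketch of a column profiler; CSVBrain's real rules may differ.
function profileColumn(values) {
  // Assumption: empty strings, null, and undefined are "missing".
  const present = values.filter((v) => v !== '' && v !== null && v !== undefined);
  const numbers = present.map(Number);
  // Assumption: numeric only if every present value is a finite number.
  const isNumeric = present.length > 0 && numbers.every((n) => Number.isFinite(n));

  const profile = {
    type: isNumeric ? 'number' : 'text',
    count: present.length,
    missing: values.length - present.length,
    unique: new Set(present).size,
  };
  if (isNumeric) {
    profile.min = numbers.reduce((a, b) => Math.min(a, b));
    profile.max = numbers.reduce((a, b) => Math.max(a, b));
    profile.avg = numbers.reduce((sum, n) => sum + n, 0) / numbers.length;
  }
  return profile;
}

console.log(profileColumn(['10', '20', '', '30']));
// { type: 'number', count: 3, missing: 1, unique: 3, min: 10, max: 30, avg: 20 }
```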

`query(options)`

Run a structured query against loaded data.

| Option | Type | Description |
|--------|------|-------------|
| `filter` | `{ column, operator, value }` | Filter rows. Operators: `>`, `<`, `>=`, `<=`, `=`, `contains` |
| `sort` | `{ column, order }` | Sort by column. Order: `"asc"` or `"desc"` |
| `limit` | `number` | Maximum rows to return |
| `aggregate` | `{ column }` | Return `count`, `sum`, `avg`, `min`, `max` for a numeric column |

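```js
// Filter and sort
const topSales = brain.query({
  filter: { column: 'revenue', operator: '>', value: 10000 },
  sort: { column: 'revenue', order: 'desc' },
  limit: 10
});

// Aggregate
const totals = brain.query({
  aggregate: { column: 'revenue' }
});
console.log(totals);
// { count: 1200, sum: 10140384, avg: 8450.32, min: 12.5, max: 94200 }
```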
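
The filter, sort, and limit options can be pictured as three steps applied in order to an array of row objects. The sketch below is illustrative only: `runQuery` and `OPERATORS` are hypothetical names, the case-insensitive `contains` is an assumption, and none of this is taken from CSVBrain's implementation.

```javascript
// Hypothetical sketch of filter -> sort -> limit over plain row objects.
const OPERATORS = {
  '>': (a, b) => a > b,
  '<': (a, b) => a < b,
  '>=': (a, b) => a >= b,
  '<=': (a, b) => a <= b,
  '=': (a, b) => a === b,
  // Assumption: "contains" is a case-insensitive substring match.
  contains: (a, b) => String(a).toLowerCase().includes(String(b).toLowerCase()),
};

function runQuery(rows, { filter, sort, limit } = {}) {
  let result = rows.slice(); // never mutate the caller's array
  if (filter) {
    const test = OPERATORS[filter.operator];
    if (!test) throw new Error(`Unknown operator: ${filter.operator}`);
    result = result.filter((row) => test(row[filter.column], filter.value));
  }
  if (sort) {
    const direction = sort.order === 'desc' ? -1 : 1;
    result.sort((a, b) => {
      if (a[sort.column] < b[sort.column]) return -direction;
      if (a[sort.column] > b[sort.column]) return direction;
      return 0;
    });
  }
  if (typeof limit === 'number') result = result.slice(0, limit);
  return result;
}

const rows = [
  { product: 'A', revenue: 5 },
  { product: 'B', revenue: 20 },
  { product: 'C', revenue: 15 },
];
console.log(runQuery(rows, {
  filter: { column: 'revenue', operator: '>', value: 10 },
  sort: { column: 'revenue', order: 'desc' },
  limit: 1,
}));
// [ { product: 'B', revenue: 20 } ]
```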

`async ask(question, options?)`

Ask a natural language question about your data. Requires an AI provider API key (or local Ollama).

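| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `model` | `string` | Instance default | AI model with provider prefix |
| `apiKey` | | | |

Returns: `{ answer: string, data: any, query: object|null, model: string }`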

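```js
// Run these inside an async function.

// Using Anthropic (default)
// Requires the ANTHROPIC_API_KEY environment variable
const result = await brain.ask('Which product category has the highest average price?');
console.log(result.answer);
// "Electronics has the highest average price at $342.50, followed by Appliances at $289.00."

// Using OpenAI
// Requires the OPENAI_API_KEY environment variable
const result2 = await brain.ask('How many orders were placed in Q4?', {
  model: 'openai/gpt-4o-mini'
});

// Using local Ollama (no API key needed)
const result3 = await brain.ask('Summarize the sales trends', {
  model: 'ollama/llama3'
});
```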

AI Provider Setup

Anthropic (Claude)

Set your API key as an environment variable:

```bash
export ANTHROPIC_API_KEY=sk-ant-...
```

Models: anthropic/claude-haiku-4-5, anthropic/claude-sonnet-4-20250514, etc.

OpenAI (GPT)

```bash
export OPENAI_API_KEY=sk-...
```

Models: openai/gpt-4o-mini, openai/gpt-4o, etc.

Ollama (Local)

No API key required. Just run Ollama locally:

```bash
ollama serve
ollama pull llama3
```

Models: ollama/llama3, ollama/mistral, etc.

Optionally set a custom host:

```bash
export OLLAMA_HOST=http://192.168.1.100:11434
```
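
Since every model string carries a provider prefix, routing can be pictured as splitting on the first slash and looking up the matching credentials. The sketch below is an assumption about how that might look; `resolveProvider` is a hypothetical helper, and the fallback to `http://localhost:11434` reflects Ollama's usual default address rather than anything confirmed in CSVBrain's source.

```javascript
// Hypothetical sketch: map a "provider/model" string to connection settings.
function resolveProvider(model, env = process.env) {
  const slash = model.indexOf('/');
  if (slash === -1) {
    throw new Error(`Expected "provider/model", got "${model}"`);
  }
  const provider = model.slice(0, slash);
  const name = model.slice(slash + 1);

  switch (provider) {
    case 'anthropic':
      return { provider, name, apiKey: env.ANTHROPIC_API_KEY };
    case 'openai':
      return { provider, name, apiKey: env.OPENAI_API_KEY };
    case 'ollama':
      // Assumption: fall back to Ollama's usual local address.
      return { provider, name, host: env.OLLAMA_HOST || 'http://localhost:11434' };
    default:
      throw new Error(`Unsupported provider: ${provider}`);
  }
}

console.log(resolveProvider('ollama/llama3', {}));
// { provider: 'ollama', name: 'llama3', host: 'http://localhost:11434' }
```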

Error Handling

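If the AI provider is unavailable, ask() does not throw. It resolves normally and reports the problem in result.answer:

```js
const result = await brain.ask('What is the trend?');
// Note: the exact prefix below is back-translated from the localized docs;
// check csv-brain.js for the string the library actually returns.
if (result.answer.startsWith('AI unavailable:')) {
  console.log('Falling back to manual query...');
  const data = brain.query({ sort: { column: 'date', order: 'asc' } });
}
```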

Supported File Formats

- CSV: Comma-separated values (.csv)
- TSV: Tab-separated values (.tsv, .txt)
- Semicolon-delimited: Common in European locale exports

Delimiter is auto-detected from the first line, or can be specified manually.
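
One simple way to detect a delimiter from the first line is to count how often each candidate appears and pick the most frequent. The sketch below illustrates that idea; `detectDelimiter` is a hypothetical function, and CSVBrain's actual heuristic is not documented here and may differ.

```javascript
// Hypothetical sketch: choose the candidate that occurs most often in the
// header line. Falls back to a comma when no candidate appears.
function detectDelimiter(firstLine, candidates = [',', ';', '\t']) {
  let best = candidates[0];
  let bestCount = 0;
  for (const candidate of candidates) {
    const count = firstLine.split(candidate).length - 1;
    if (count > bestCount) {
      best = candidate;
      bestCount = count;
    }
  }
  return best;
}

console.log(JSON.stringify(detectDelimiter('id;name;price'))); // ";"
console.log(JSON.stringify(detectDelimiter('id\tname\tprice'))); // "\t"
```

A counting heuristic like this ignores quoting, so a header such as `"a,b";c` could be misread. Passing `delimiter` explicitly to `load()` avoids that ambiguity.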

Note: Excel files (.xlsx, .xls) are not supported. Export your spreadsheet to CSV first.

Limitations

- Files are loaded synchronously and fully into memory. Very large files (100MB+) may cause performance issues.
- AI answers depend on the quality and context window of the chosen model. Only column profiles and the first 5 sample rows are sent to the AI, not the entire dataset.
- No streaming support. The full AI response is returned at once.
- No built-in export functionality. Use query() results with your own file-writing logic.
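
Because export is left to the caller, here is a minimal sketch of serializing rows back to CSV text. It assumes query() returns an array of plain row objects sharing the same keys, which the documentation above does not state explicitly; `toCSV` is a hypothetical helper, not part of CSVBrain.

```javascript
// Hypothetical helper: serialize an array of row objects to CSV text.
// Assumption: every row has the same keys as the first row.
function toCSV(rows, delimiter = ',') {
  if (rows.length === 0) return '';
  const columns = Object.keys(rows[0]);

  // Quote a field when it contains a quote, a line break, or the delimiter,
  // doubling any embedded quotes.
  const escapeField = (value) => {
    const text = value === null || value === undefined ? '' : String(value);
    const needsQuotes = /["\r\n]/.test(text) || text.includes(delimiter);
    return needsQuotes ? `"${text.replace(/"/g, '""')}"` : text;
  };

  const lines = [columns.map(escapeField).join(delimiter)];
  for (const row of rows) {
    lines.push(columns.map((column) => escapeField(row[column])).join(delimiter));
  }
  return lines.join('\n');
}

console.log(toCSV([{ name: 'Smith, Jane', revenue: 94200 }]));
// name,revenue
// "Smith, Jane",94200
```

The resulting string can then be written with Node's built-in `fs.writeFileSync('export.csv', toCSV(rows))`.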

Disclaimer

CSVBrain is provided as-is under the MIT License. AI-generated answers may not always be accurate — always verify critical data analysis. API usage may incur costs from your AI provider.

Support

- Issues: github.com/TheShadowRose/CSVBrain/issues
- Author: @TheShadowRose

CSVBrain

Version: 1.0.3
Author: @TheShadowRose
License: MIT

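Description

Load CSV files and ask questions in plain English. AI-powered natural language queries via Anthropic, OpenAI, or local Ollama. No SQL required.

CSVBrain parses CSV files (comma, semicolon, or tab-delimited), profiles your data automatically, and lets you query it with structured filters or plain English questions powered by AI.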

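Features

- CSV Loading: Parse CSV files with automatic delimiter detection (comma, semicolon, tab). Handles quoted fields and escaped quotes.
- Data Profiling: Instant statistics for every column: count, missing values, unique values, and min/max/avg for numeric columns.
- Structured Queries: Filter, sort, limit, and aggregate your data programmatically.
- Natural Language Ask: Ask questions about your data in plain English. The AI receives your dataset's structure, column types, and statistics, and answers with specific numbers.
- Multi-Provider AI: Route questions to Anthropic (Claude), OpenAI (GPT), or local Ollama models by changing the model prefix.
- Zero Dependencies: Pure Node.js. No npm packages required. HTTP calls use the built-in https/http modules.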

Installation

Copy src/csv-brain.js into your project.

```js
const { CSVBrain } = require('./src/csv-brain');
```

Quick Start

```js
const { CSVBrain } = require('./src/csv-brain');

const brain = new CSVBrain();
const info = brain.load('sales.csv');
console.log(info);
// { rows: 1200, columns: 8, types: { month: 'text', revenue: 'number', ... } }

// Profile your data
const stats = brain.profile();
console.log(stats.revenue);
// { type: 'number', count: 1200, missing: 0, unique: 987, min: 12.5, max: 94200, avg: 8450.32 }

// Ask a question in plain English (run inside an async function)
const result = await brain.ask('Which month had the highest revenue?');
console.log(result.answer);
// "Based on the data, March had the highest total revenue at $94,200."
console.log(result.model);
// "anthropic/claude-haiku-4-5"
```

API

new CSVBrain(options?)

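Create a new instance.

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `model` | `string` | `'anthropic/claude-haiku-4-5'` | Default AI model for `ask()` |

```js
const brain = new CSVBrain({ model: 'openai/gpt-4o-mini' });
```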

load(filePath, options?)

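Load a CSV file synchronously.

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `delimiter` | `string` | auto-detect | Force a specific delimiter |

Returns: `{ rows: number, columns: number, types: object }`

```js
const info = brain.load('data.csv');
const info2 = brain.load('data.tsv', { delimiter: '\t' });
```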

profile()

Get a statistical profile of all columns.

Returns: Object keyed by column name, each with type, count, missing, unique, and (for numeric columns) min, max, avg.

```js
const stats = brain.profile();
console.log(stats);
```

query(options)

Run a structured query against loaded data.

| Option | Type | Description |
|--------|------|-------------|
| `filter` | `{ column, operator, value }` | Filter rows. Operators: `>`, `<`, `>=`, `<=`, `=`, `contains` |
| `sort` | `{ column, order }` | Sort by column. Order: `"asc"` or `"desc"` |
| `limit` | `number` | Maximum rows to return |
| `aggregate` | `{ column }` | Return `count`, `sum`, `avg`, `min`, `max` for a numeric column |

```js
// Filter and sort
const topSales = brain.query({
  filter: { column: 'revenue', operator: '>', value: 10000 },
  sort: { column: 'revenue', order: 'desc' },
  limit: 10
});

// Aggregate
const totals = brain.query({
  aggregate: { column: 'revenue' }
});
console.log(totals);
// { count: 1200, sum: 10140384, avg: 8450.32, min: 12.5, max: 94200 }
```

async ask(question, options?)

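Ask a natural language question about your data. Requires an AI provider API key (or local Ollama).

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `model` | `string` | Instance default | AI model with provider prefix |
| `apiKey` | | | |

Returns: `{ answer: string, data: any, query: object|null, model: string }`

```js
// Run these inside an async function.

// Using Anthropic (default)
// Requires the ANTHROPIC_API_KEY environment variable
const result = await brain.ask('Which product category has the highest average price?');
console.log(result.answer);
// "Electronics has the highest average price at $342.50, followed by Appliances at $289.00."

// Using OpenAI
// Requires the OPENAI_API_KEY environment variable
const result2 = await brain.ask('How many orders were placed in Q4?', {
  model: 'openai/gpt-4o-mini'
});

// Using local Ollama (no API key needed)
const result3 = await brain.ask('Summarize the sales trends', {
  model: 'ollama/llama3'
});
```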

AI Provider Setup

Anthropic (Claude)

Set your API key as an environment variable:

```bash
export ANTHROPIC_API_KEY=sk-ant-...
```

Models: anthropic/claude-haiku-4-5, anthropic/claude-sonnet-4-20250514, etc.

OpenAI (GPT)

```bash
export OPENAI_API_KEY=sk-...
```

Models: openai/gpt-4o-mini, openai/gpt-4o, etc.

Ollama (Local)

No API key required. Just run Ollama locally:

```bash
ollama serve
ollama pull llama3
```

Models: ollama/llama3, ollama/mistral, etc.

Optionally set a custom host:

```bash
export OLLAMA_HOST=http://192.168.1.100:11434
```

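Error Handling

If the AI provider is unavailable, ask() does not throw. It resolves normally and reports the problem in result.answer:

```js
const result = await brain.ask('What is the trend?');
// Note: the exact prefix below is translated from the localized docs;
// check csv-brain.js for the string the library actually returns.
if (result.answer.startsWith('AI unavailable:')) {
  console.log('Falling back to manual query...');
  const data = brain.query({ sort: { column: 'date', order: 'asc' } });
}
```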

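Supported File Formats

- CSV: Comma-separated values (.csv)
- TSV: Tab-separated values (.tsv, .txt)
- Semicolon-delimited: Common in European locale exports

Delimiter is auto-detected from the first line, or can be specified manually.

Note: Excel files (.xlsx, .xls) are not supported. Export your spreadsheet to CSV first.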

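Limitations

- Files are loaded synchronously and fully into memory. Very large files (100MB+) may cause performance issues.
- AI answers depend on the quality and context window of the chosen model. Only column profiles and the first 5 sample rows are sent to the AI, not the entire dataset.
- No streaming support. The full AI response is returned at once.
- No built-in export functionality. Use query() results with your own file-writing logic.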

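Disclaimer

CSVBrain is provided as-is under the MIT License. AI-generated answers may not always be accurate, so always verify critical data analysis. API usage may incur costs from your AI provider.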

Support

- Issues: github.com/TheShadowRose/CSVBrain/issues
- Author: @TheShadowRose
