Agent Learner

An AI toolkit for configuring, benchmarking, comparing, and optimizing agent prompts and evaluation results. Agent Learner provides persistent, file-based logging for each command category with timestamped entries, summary statistics, multi-format export, and full-text search across all records.

Commands

Command	Description
INLINECODE0	Configure agent settings — log configuration entries or view recent ones
INLINECODE1

Each data command (configure, benchmark, compare, etc.) works in two modes:

- Without arguments: displays the 20 most recent entries from that category
With arguments: saves the input as a new timestamped entry and reports the total count

Data Storage

All data is stored in plain text files under the data directory:

- Category logs: $DATA_DIR/<command>.log — one file per command (e.g., configure.log, benchmark.log, prompt.log), each entry is INLINECODE23
History log: $DATA_DIR/history.log — audit trail of every command executed with timestamps
Export files: $DATA_DIR/export.<fmt> — generated by the export command in json, csv, or txt format

Default data directory: INLINECODE27

Requirements

- Bash (with set -euo pipefail support)
Standard Unix utilities: grep, cat, date, echo, wc, du, head, tail, INLINECODE37
No external dependencies or API keys required

When to Use

1. Benchmarking agent performance — When you need to track and compare benchmark results across different agent configurations, models, or prompt strategies
Prompt engineering iteration — When you're testing multiple prompt variations and want to log each version with results for later comparison
Cost and usage tracking — When you need to monitor API costs and usage metrics over time to optimize spending
Fine-tuning experiments — When running fine-tuning sessions and you want to log parameters, results, and observations for reproducibility
Cross-category analysis — When you need to search across all logged data (benchmarks, prompts, evaluations, costs) to find patterns or specific entries

Examples

CODEBLOCK0

Output

All commands return output to stdout. Export files are written to the data directory:

CODEBLOCK1

Every command execution is logged to $DATA_DIR/history.log for auditing purposes.

智能体学习器

一个用于配置、基准测试、比较和优化智能体提示词及评估结果的AI工具包。智能体学习器为每个命令类别提供基于文件的持久化日志记录，包含时间戳条目、汇总统计、多格式导出以及跨所有记录的全文本搜索功能。

命令

命令	描述
configure	配置智能体设置 — 记录配置条目或查看最近的配置
benchmark

每个数据命令（configure、benchmark、compare等）有两种工作模式：

- 无参数：显示该类别最近的20条条目
带参数：将输入保存为新的带时间戳条目，并报告总条目数

数据存储

所有数据以纯文本文件形式存储在数据目录下：

- 类别日志：$DATADIR/.log — 每个命令一个文件（例如configure.log、benchmark.log、prompt.log），每条条目格式为timestamp|value
历史日志：$DATADIR/history.log — 每个执行命令的审计追踪记录，包含时间戳
导出文件：$DATA_DIR/export. — 由export命令以json、csv或txt格式生成

默认数据目录：~/.local/share/agent-learner/

系统要求

- Bash（支持set -euo pipefail）
标准Unix工具：grep、cat、date、echo、wc、du、head、tail、basename
无需外部依赖或API密钥

使用场景

1. 智能体性能基准测试 — 当您需要追踪和比较不同智能体配置、模型或提示词策略下的基准测试结果时
提示词工程迭代 — 当您测试多个提示词变体并希望记录每个版本及其结果以便后续比较时
成本和使用追踪 — 当您需要监控API成本和使用指标以优化支出时
微调实验 — 当运行微调会话并希望记录参数、结果和观察结果以确保可复现性时
跨类别分析 — 当您需要搜索所有记录数据（基准测试、提示词、评估、成本）以发现模式或特定条目时

示例

bash

初始化并检查状态

agent-learner status

记录基准测试结果

agent-learner benchmark GPT-4o在MMLU上：88.7%准确率，平均延迟1.2秒

记录提示词变体

agent-learner prompt 系统：你是一个有用的编程助手。始终逐步解释你的推理过程。

比较两个配置

agent-learner compare GPT-4o vs Claude-3.5：GPT-4o快12%，Claude在代码任务上准确率高5%

追踪成本

agent-learner cost 三月份批次：输入12,450个token，输出3,200个token，总计$0.47

查看所有最近的基准测试

agent-learner benchmark

跨所有日志搜索特定术语

agent-learner search 准确率

将所有数据导出为JSON

agent-learner export json

查看汇总统计

agent-learner stats

显示最近活动

agent-learner recent

输出

所有命令将输出返回到stdout。导出文件写入数据目录：

bash
agent-learner export json # → ~/.local/share/agent-learner/export.json
agent-learner export csv # → ~/.local/share/agent-learner/export.csv
agent-learner export txt # → ~/.local/share/agent-learner/export.txt

每个命令的执行都会被记录到$DATA_DIR/history.log中，用于审计目的。

由BytesAgain提供 | bytesagain.com | hello@bytesagain.com

agent-learner智能体学习器

agent-learner

Agent Learner

Commands

Data Storage

Requirements

When to Use

Examples

Output

智能体学习器

命令

数据存储

系统要求

使用场景

示例

初始化并检查状态

记录基准测试结果

记录提示词变体

比较两个配置

追踪成本

查看所有最近的基准测试

跨所有日志搜索特定术语

将所有数据导出为JSON

查看汇总统计

显示最近活动

输出

标签

通过对话安装

方式一：安装 SkillHub 和技能

方式二：设置 SkillHub 为优先技能安装源

通过命令行安装

下载

agent-learner智能体学习器

agent-learner

Agent Learner

Commands

Data Storage

Requirements

When to Use

Examples

Output

智能体学习器

命令

数据存储

系统要求

使用场景

示例

初始化并检查状态

记录基准测试结果

记录提示词变体

比较两个配置

追踪成本

查看所有最近的基准测试

跨所有日志搜索特定术语

将所有数据导出为JSON

查看汇总统计

显示最近活动

输出

标签

通过对话安装

方式一：安装 SkillHub 和技能

方式二：设置 SkillHub 为优先技能安装源

通过命令行安装

下载

相关推荐

self-improvement

self-improvement

self-improvement

self-improvement