llm-evaluator

Author: admin | Source: ClawHub
Version: V 1.0.0 | Security check: Passed | Downloads: 667 | Favorites: 1

# LLM Evaluator ⚖️

LLM-as-a-Judge evaluation system powered by Langfuse. Uses GPT-5-nano to score AI outputs.

## When to Use

- Evaluating quality of search results or AI responses
- Scoring traces for relevance, accuracy, hallucination detection
- Batch scoring recent unscored traces
- Quality assurance on agent outputs

## Usage

```bash
# Test with sample cases
python3 {baseDir}/scripts/evaluator.py test

# Score a specific Langfuse trace
python3 {baseDir}/scripts/evaluator.py score <trace_id>

# Score with specific evaluator only
python3 {baseDir}/scripts/evaluator.py score <trace_id> --evaluators relevance

# Backfill scores on recent unscored traces
python3 {baseDir}/scripts/evaluator.py backfill --limit 20
```

## Evaluators

| Evaluator | Measures | Scale |
|-----------|----------|-------|
| relevance | Response relevance to query | 0–1 |
| accuracy | Factual correctness | 0–1 |
| hallucination | Made-up information detection | 0–1 |
| helpfulness | Overall usefulness | 0–1 |

## Credits

Built by [M. Abidi](https://www.linkedin.com/in/mohammad-ali-abidi) | [agxntsix.ai](https://www.agxntsix.ai)
[YouTube](https://youtube.com/@aiwithabidi) | [GitHub](https://github.com/aiwithabidi)
Part of the **AgxntSix Skill Suite** for OpenClaw agents.

📅 **Need help setting up OpenClaw for your business?** [Book a free consultation](https://cal.com/agxntsix/abidi-openclaw)
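The four evaluators in the Evaluators table all report a score on a 0–1 scale. The sketch below illustrates one way an LLM-as-a-Judge loop could produce such scores. It is not taken from `evaluator.py`: the prompt wording, the JSON reply shape, and the names `build_prompt`, `parse_score`, `evaluate`, and `fake_judge` are all assumptions made for illustration, and the judge model is replaced by a stub so the example runs offline. The source also does not say whether a high `hallucination` score means more or less hallucination, so the sketch leaves that open.

```python
import json
import re
from typing import Callable, Dict, Iterable

# Evaluator names and what they measure, copied from the Evaluators table.
EVALUATORS: Dict[str, str] = {
    "relevance": "Response relevance to query",
    "accuracy": "Factual correctness",
    "hallucination": "Made-up information detection",
    "helpfulness": "Overall usefulness",
}


def build_prompt(evaluator: str, query: str, response: str) -> str:
    """Build a judge prompt (hypothetical wording, not the skill's actual prompt)."""
    criterion = EVALUATORS[evaluator]
    return (
        f"You are grading an AI response on: {criterion}.\n"
        f"Query: {query}\n"
        f"Response: {response}\n"
        'Reply with JSON only, e.g. {"score": 0.5, "reason": "..."}, '
        "where score is a number from 0 to 1."
    )


def parse_score(raw: str) -> float:
    """Extract a score from the judge's reply and clamp it to [0, 1]."""
    try:
        value = float(json.loads(raw)["score"])
    except (ValueError, KeyError, TypeError):
        # Fallback for non-JSON replies: take the first number in the text.
        match = re.search(r"-?\d+(?:\.\d+)?", raw)
        if match is None:
            raise ValueError(f"no numeric score in judge reply: {raw!r}")
        value = float(match.group())
    return max(0.0, min(1.0, value))


def evaluate(
    query: str,
    response: str,
    judge: Callable[[str], str],
    evaluators: Iterable[str] = tuple(EVALUATORS),
) -> Dict[str, float]:
    """Run each selected evaluator through the judge and collect 0-1 scores."""
    return {
        name: parse_score(judge(build_prompt(name, query, response)))
        for name in evaluators
    }


def fake_judge(prompt: str) -> str:
    """Stand-in for a real model call; deliberately returns an out-of-range score."""
    return '{"score": 1.4, "reason": "stub"}'


print(evaluate("What is Langfuse?", "An LLM observability tool.",
               fake_judge, evaluators=["relevance"]))
# prints {'relevance': 1.0}
```

Passing the judge in as a plain callable keeps the scoring logic testable without network access. In a real setup that callable would wrap the model call, and the resulting scores would be attached to the corresponding Langfuse trace.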

Tags: skill, ai

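Install via chat

This skill can be installed via chat on the following platforms:

OpenClaw, WorkBuddy, QClaw, Kimi, Claude

Option 1: Install SkillHub and the skill

Help me install SkillHub and the llm-evaluator-pro-1776420065 skill

Option 2: Set SkillHub as the preferred skill installation source

Set SkillHub as my preferred skill installation source, then help me install the llm-evaluator-pro-1776420065 skill

Install via command line

skillhub install llm-evaluator-pro-1776420065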

Download Zip package

Download llm-evaluator v1.0.0

File size: 5.1 KB | Released: 2026-4-17 18:57

v1.0.0 (latest), 2026-4-17 18:57
LLM-as-a-Judge evaluator via Langfuse
