is-bullshit - Hallucination Detector
IMPORTANT: When the config has enable = true, this skill MUST automatically show a fact check after EVERY response, WITHOUT waiting for the user to ask "check" or "检测".
When enable = false, this skill triggers only when the user explicitly asks with one of these phrases:
- Chinese: 检测、检测一下、核实、是真的吗、是不是胡说
- English:
  - "is that true" / "is this true"
  - "are you serious" / "you serious"
  - "is that bullshit" / "is this nonsense"
  - "verify" / "check" / "fact check"
  - "are you sure" / "are you certain"
  - "that's not right" / "that's wrong"
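The trigger logic above could be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the function name `should_run_fact_check` and the case-insensitive substring matching are assumptions; only the phrase lists come from the text.

```python
# Phrase lists copied from the trigger list above.
CHINESE_TRIGGERS = ["检测", "检测一下", "核实", "是真的吗", "是不是胡说"]
ENGLISH_TRIGGERS = [
    "is that true", "is this true",
    "are you serious", "you serious",
    "is that bullshit", "is this nonsense",
    "verify", "check", "fact check",
    "are you sure", "are you certain",
    "that's not right", "that's wrong",
]


def should_run_fact_check(user_message: str, enable: bool) -> bool:
    """Return True when a fact check should follow the current response.

    Hypothetical helper: substring matching is an assumed strategy.
    """
    if enable:
        return True  # automatic mode: check after every response
    text = user_message.lower()
    return any(phrase in text for phrase in CHINESE_TRIGGERS + ENGLISH_TRIGGERS)
```

Plain substring matching is deliberately naive: a short trigger such as "check" also matches inside longer words like "checkout", so a real implementation would probably want word-boundary matching for the English phrases.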
Purpose
Detect whether the AI's response is trustworthy by checking:
1. Tool usage - Did the AI call tools to verify facts?
2. Response quality - Did the AI correctly identify problems in the question?
Configuration
    {
      "enable": false
    }

Fact checking is off by default; the user must explicitly enable it.
How to Enable
The user can say:
- "enable fact check" → enable = true
- "disable fact check" → enable = false
- "turn on is-bullshit" → enable = true
- "turn off is-bullshit" → enable = false
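As a rough sketch of how these four commands might update the configuration (the `apply_toggle` name, the dict-based config, and the substring matching are all assumptions made for illustration):

```python
# Command table built from the four phrases listed above.
TOGGLE_COMMANDS = {
    "enable fact check": True,
    "disable fact check": False,
    "turn on is-bullshit": True,
    "turn off is-bullshit": False,
}


def apply_toggle(config: dict, user_message: str) -> dict:
    """Return a copy of config with `enable` updated if the message is a toggle command.

    Hypothetical helper: messages that match no command leave the config unchanged.
    """
    text = user_message.strip().lower()
    updated = dict(config)
    for phrase, value in TOGGLE_COMMANDS.items():
        if phrase in text:
            updated["enable"] = value
            break
    return updated
```

Returning a copy rather than mutating the input keeps the previous state available, which can be convenient when logging what changed.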
How It Works
Step 1: Analyze the Response
Read the AI's response and identify what type of information it contains:
- Mathematical calculations
- Time/date/timezone statements
- Factual claims
- Uncertain statements
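One way to approximate this classification step is with a few regular expressions. The patterns below are illustrative assumptions, not the skill's actual rules, and they make no attempt to detect general factual claims, which would need more than pattern matching.

```python
import re

# Illustrative heuristics only; every pattern here is an assumption.
MATH_RE = re.compile(r"\d\s*[+\-×*÷/%^]\s*\d|\d\s*%")
TIME_RE = re.compile(
    r"\b\d{1,2}:\d{2}\b"
    r"|\b(?:UTC|today|now|Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday)\b",
    re.IGNORECASE,
)
UNCERTAIN_RE = re.compile(r"不确定|可能|I'm not sure", re.IGNORECASE)


def classify(response: str) -> set[str]:
    """Return the information types found in a response (hypothetical labels)."""
    labels = set()
    if MATH_RE.search(response):
        labels.add("math")
    if TIME_RE.search(response):
        labels.add("time")
    if UNCERTAIN_RE.search(response):
        labels.add("uncertain")
    return labels
```

These heuristics produce false positives: an ISO date such as 2026-03-15 matches the math pattern because of the hyphens, so the labels are best treated as hints about which verification tool to look for.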
Step 2: Check Tool Usage
Look at what tools were called throughout the entire conversation history (not just the current response). Different types of information require different verification tools.
Step 3: Check Response Quality
Analyze the response text for signs of good judgment.
Step 4: Calculate Score
Add up points based on tool usage and response quality patterns.
Detection Rules
A. Tool-Based Checks (Required Verification)
| Response Contains | Required Tool | Points If No Tool Used |
|---|---|---|
| Math expressions (numbers + operators: +, -, ×, *, ÷, /, %, ^) | exec (Python/bc), calculator | -2 |
| Time/date/timezone (e.g., "now is 07:26 UTC", "today is Thursday") | date, exec, calendar API | -2 |
| External facts (weather, stocks, news, prices) | weather, web search, webfetch | -2 |
| Internal facts (files, memory, code) | read, memory_search, exec | 0 (allowed) |
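The table above maps naturally onto a lookup structure. The sketch below assumes hypothetical category keys ("math", "time", "external", "internal") and treats tool names as plain strings; the tool lists and penalties are taken from the table.

```python
# Category -> (tools that count as verification, penalty when none was called).
REQUIRED_TOOLS = {
    "math": ({"exec", "calculator"}, -2),
    "time": ({"date", "exec", "calendar API"}, -2),
    "external": ({"weather", "web search", "webfetch"}, -2),
    "internal": ({"read", "memory_search", "exec"}, 0),
}


def tool_score(categories: set[str], tools_called: set[str]) -> int:
    """Sum the penalty for every category that has no matching verification tool.

    Hypothetical helper: `tools_called` should cover the entire conversation
    history, not just the current response.
    """
    score = 0
    for category in categories:
        allowed, penalty = REQUIRED_TOOLS[category]
        if not (allowed & tools_called):
            score += penalty
    return score
```

Because `exec` appears in several rows, a single `exec` call satisfies both the math and the time requirement in this sketch; whether that matches the intended behavior is not stated in the source.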
B. Content-Based Checks (Bonus Points)
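| Pattern Found | Points |
|---|---|
| Detects a time contradiction ("明朝...乾隆", i.e. the Ming dynasty paired with the Qing emperor Qianlong / "1900年", "the year 1900") | +2 |
| Says "前提错误" (false premise) / "无意义" (meaningless) / "无法回答" (cannot be answered) / "invalid premise" | +2 |
| Acknowledges uncertainty ("不确定" (not sure), "可能" (possibly), "I'm not sure") | +1 |
| Makes up facts confidently (no tool + specific facts) | -2 |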
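A minimal sketch of the phrase-based rows of this table follows. The function name and the `has_specific_facts` flag (supplied by the caller) are assumptions, and the time-contradiction row is left out because detecting it requires actual reasoning rather than phrase matching.

```python
import re

# (pattern, points) pairs for the phrase-based rows of the table above.
CONTENT_RULES = [
    (re.compile(r"前提错误|无意义|无法回答|invalid premise", re.IGNORECASE), 2),
    (re.compile(r"不确定|可能|I'm not sure", re.IGNORECASE), 1),
]


def content_score(response: str, used_tool: bool, has_specific_facts: bool) -> int:
    """Score a response from its text; hypothetical helper, not the skill's own code."""
    score = sum(points for pattern, points in CONTENT_RULES if pattern.search(response))
    if has_specific_facts and not used_tool:
        score -= 2  # specific facts stated without any tool call
    return score
```

For example, `content_score("I'm not sure, but it is 42", used_tool=False, has_specific_facts=True)` nets -1 here: +1 for acknowledging uncertainty and -2 for stating a specific fact without a tool.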
Verdict per Round
Each round gets its own verdict:
| Situation | Verdict |
|---|---|
| Correct tool used | ✅ Looks good! |
| No tool (but needed) | ❌ Might be wrong |
| Uncertain answer | 🤔 Not sure |
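The verdict table could be expressed as a small function. The precedence between rows and the fallback for a round that needs no tool are not specified in the source, so both are assumptions in this sketch.

```python
def verdict(tool_needed: bool, correct_tool_used: bool, uncertain: bool) -> str:
    """Map one round's findings to a verdict string (hypothetical helper)."""
    if correct_tool_used:
        return "✅ Looks good!"
    if uncertain:
        # Assumption: an explicitly uncertain answer is not flagged as wrong.
        return "🤔 Not sure"
    if tool_needed:
        return "❌ Might be wrong"
    # Assumption: nothing to verify and no uncertainty counts as fine.
    return "✅ Looks good!"
```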
Output Format
The fact check should be in the same language as the user's question.
Step-by-Step Analysis
First, analyze each round of conversation:
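    Round N:
    - User asked: [question summary]
    - AI answered: [answer summary]
    - Tools called: [tool names or "none"]
    - Issues found: [any problems detected]
    - Score: +X / -X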
Output Rules by Conversation Length
| Conversation Rounds | Output |
|---|---|
| ≤ 5 rounds | Show every round |
| > 5 rounds | Show only suspicious rounds |
Note: Each round is evaluated independently. No overall summary is needed; users can judge for themselves.
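The length rule above might be implemented along these lines. The `Round` dataclass is hypothetical, and treating every verdict that does not start with ✅ as "suspicious" is an assumption, since the source does not define the term.

```python
from dataclasses import dataclass


@dataclass
class Round:
    number: int
    verdict: str


def rounds_to_show(rounds: list[Round], limit: int = 5) -> list[Round]:
    """Return every round for short conversations, only suspicious ones otherwise."""
    if len(rounds) <= limit:
        return list(rounds)
    # Assumption: any verdict other than a ✅ counts as suspicious.
    return [r for r in rounds if not r.verdict.startswith("✅")]
```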
Style
- Friendly and lively, not robotic
- Casual tone
- Keep it short and fun
- Each round is independent - no overall summary
Example Output
≤5 rounds (show all):
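    Fact check:

    Round 1:
    - Q: Current time
    - A: "2026-03-15 17:18 CST"
    - Tool: date command ✅
    - Verdict: ✅ Looks good!

    Round 2:
    - Q: 15000 × 1.2% = ?
    - A: "180"
    - Tool: none ❌
    - Verdict: ❌ Calculation done without a tool

    Round 3:
    - Q: Is that true?
    - A: "Calculated correctly, 180"
    - Tool: python3 ✅
    - Verdict: ✅ Verified!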
>5 rounds (show suspicious only):
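    Fact check:

    ⚠️ Suspicious rounds:

    Round 1:
    - Q: Current time
    - A: "07:26 UTC" (wrong!)
    - Tool: none ❌
    - Verdict: ❌ No time tool used; gave the wrong time

    Round 3:
    - Q: 15000 × 1.2%
    - A: "180"
    - Tool: none ❌
    - Verdict: ❌ No calculation tool used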
Implementation Notes
- Default is OFF; the user must explicitly enable it
- Checks both tool usage AND response content
- Gives credit for good judgment even without tools
- Penalizes confident fabrication