ThoughtProof — Epistemic Verification Skill

Multi-agent verification protocol for AI decisions. Like a TÜV for AI reasoning.

How It Works

ThoughtProof runs your question through multiple independent AI agents (different model families), then a critic layer identifies blind spots, and a synthesizer produces a consensus with confidence scores.

Pipeline: Normalize → Generate (3+ models) → Critique (adversarial) → Evaluate → Synthesize
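
The five stages can be pictured as a chain of plain functions. The sketch below is illustrative only: the function names, the stubbed generator callables, and the majority-vote critique and confidence score are assumptions made for this example, not pot-cli's actual implementation or API. In the real protocol the critic and synthesizer are themselves models.

```python
from collections import Counter
from typing import Callable, Dict, List

# Hypothetical stand-in for a model call: takes a question, returns an answer.
Generator = Callable[[str], str]


def normalize(question: str) -> str:
    """Normalize: collapse whitespace so every generator sees the same input."""
    return " ".join(question.split())


def generate(question: str, generators: Dict[str, Generator]) -> Dict[str, str]:
    """Generate: each generator answers independently of the others."""
    return {name: gen(question) for name, gen in generators.items()}


def critique(proposals: Dict[str, str]) -> List[str]:
    """Critique (simplified): flag every answer that differs from the majority."""
    majority, _ = Counter(proposals.values()).most_common(1)[0]
    return [
        f"{name} dissents: {answer}"
        for name, answer in proposals.items()
        if answer != majority
    ]


def synthesize(proposals: Dict[str, str], objections: List[str]) -> dict:
    """Evaluate and synthesize: majority answer, agreement ratio, recorded dissent."""
    majority, votes = Counter(proposals.values()).most_common(1)[0]
    return {
        "consensus": majority,
        "confidence": votes / len(proposals),
        "dissent": objections,
    }


def run_pipeline(question: str, generators: Dict[str, Generator]) -> dict:
    """Run all stages in order; refuse to run with fewer than 3 generators."""
    if len(generators) < 3:
        raise ValueError("the protocol calls for at least 3 generators")
    normalized = normalize(question)
    proposals = generate(normalized, generators)
    objections = critique(proposals)
    return synthesize(proposals, objections)


if __name__ == "__main__":
    # Stub generators standing in for three different model families.
    stubs = {
        "model_a": lambda q: "monolith",
        "model_b": lambda q: "monolith",
        "model_c": lambda q: "microservices",
    }
    print(run_pipeline("  Monolith or   microservices? ", stubs))
```

Keeping each stage a pure function makes it easy to swap the stubs for real model clients without touching the control flow.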

Prerequisites

- pot-cli installed: `npm install -g pot-cli`
- At least one API key (Anthropic, OpenAI, xAI, or Moonshot)
- More keys = more model diversity = better verification

Quick Start

Verify a claim or decision

    tp verify "Should our MVP use microservices or a monolith?"

Chain context from previous verifications

    tp verify --context last "What about scaling considerations?"
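
When a verification has to be started from a script, the `tp verify` invocation, with or without `--context last`, can be assembled as an argument list. This is a sketch that assumes `tp` is on `PATH`; `build_verify_command` and `run_verify` are hypothetical helper names, and no flags beyond those documented in this README are used.

```python
import subprocess
from typing import List


def build_verify_command(question: str, chain_last: bool = False) -> List[str]:
    """Build the argv for `tp verify`, optionally chaining the previous block."""
    cmd = ["tp", "verify"]
    if chain_last:
        cmd += ["--context", "last"]
    # The question is passed as one argument, so no shell quoting is needed.
    cmd.append(question)
    return cmd


def run_verify(question: str, chain_last: bool = False) -> str:
    """Run the command and return its stdout (assumes pot-cli is installed)."""
    result = subprocess.run(
        build_verify_command(question, chain_last),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Passing a list rather than a shell string avoids problems with characters such as `?` that a shell would otherwise try to expand.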

Deep analysis with rotated roles

    tp deep "Is this investment thesis sound?"

Configuration

pot-cli reads config from ~/.potrc.json:

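    {
      "generators": [
        { "provider": "xai", "model": "grok-4-1-fast" },
        { "provider": "moonshot", "model": "kimi-k2.5" },
        { "provider": "anthropic", "model": "claude-sonnet-4-6" }
      ],
      "critic": { "provider": "anthropic", "model": "claude-opus-4-6" },
      "synthesizer": { "provider": "anthropic", "model": "claude-opus-4-6" }
    }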

Show current config: `tp config`

Model Diversity Requirement

ThoughtProof enforces ≥3 different model families for generators. This is core to the protocol — no single provider can verify itself.
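
A configuration can be checked against this rule before a run. The sketch below assumes that each generator entry carries a `provider` field and treats the provider as a proxy for the model family; how pot-cli itself defines a family is not specified here, and the helper names are hypothetical.

```python
import json
from pathlib import Path
from typing import Iterable, Mapping, Optional, Set

MIN_FAMILIES = 3  # threshold stated by the diversity requirement


def generator_families(generators: Iterable[Mapping[str, str]]) -> Set[str]:
    """Collect distinct providers (assumption: provider stands in for family)."""
    return {g["provider"].strip().lower() for g in generators}


def check_diversity(config: Mapping) -> Set[str]:
    """Return the set of families, or raise if fewer than MIN_FAMILIES exist."""
    families = generator_families(config.get("generators", []))
    if len(families) < MIN_FAMILIES:
        raise ValueError(
            f"need >= {MIN_FAMILIES} model families, "
            f"found {len(families)}: {sorted(families)}"
        )
    return families


def load_config(path: Optional[Path] = None) -> dict:
    """Read the config file; defaults to ~/.potrc.json."""
    if path is None:
        path = Path.home() / ".potrc.json"
    return json.loads(path.read_text(encoding="utf-8"))
```

Calling `check_diversity(load_config())` on a config with two Anthropic generators and one OpenAI generator would raise, since only two distinct providers are present.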

Output

Each verification produces an Epistemic Block:

- Proposals from each generator (independent reasoning)
- Critique identifying blind spots, contradictions, and risks
- Synthesis with consensus score, confidence level, and dissent
- MDI (Model Diversity Index), which measures independence of reasoning

Blocks are stored locally as JSON and can be reviewed with `tp list` / `tp show <n>`.
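
To make the four parts concrete, here is a rough picture of what such a block might hold. Every field name and type below is an assumption chosen for illustration; the authoritative schema is the one in references/block-format.md.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class EpistemicBlock:
    """Illustrative container only; field names are NOT the real schema."""

    question: str
    proposals: Dict[str, str]   # generator name -> independent answer
    critique: List[str]         # blind spots, contradictions, risks
    consensus: str              # synthesized answer
    consensus_score: float      # scale is an assumption of this sketch
    confidence: str             # e.g. "low", "medium", "high" (assumed labels)
    dissent: List[str] = field(default_factory=list)
    mdi: float = 0.0            # Model Diversity Index; scale not specified here

    def to_json(self) -> str:
        """Serialize the block the way a local JSON store might keep it."""
        return json.dumps(asdict(self), indent=2, ensure_ascii=False)


if __name__ == "__main__":
    block = EpistemicBlock(
        question="Monolith or microservices?",
        proposals={"model_a": "monolith", "model_b": "monolith"},
        critique=["Team size was not considered"],
        consensus="monolith",
        consensus_score=0.8,
        confidence="medium",
    )
    print(block.to_json())
```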

Commands

| Command | Description |
| --- | --- |
| `tp verify <question>` | Run full verification pipeline |
| `tp verify --context last` | |

Tiers

| Tier | Agents | Time | Best For |
| --- | --- | --- | --- |
| Light | 3 | ~30s | Quick sanity checks |
| Standard | 5-7 | ~3min | Business decisions |
| Deep | 7-12 | ~5min | High-stakes, regulatory |
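
The tier table can also be read as a lookup: given a time budget, pick the most thorough tier that still fits. The helper below is a hypothetical convenience and not part of pot-cli; the agent counts and times are copied from the table, and the times are approximate.

```python
from typing import List, NamedTuple, Optional


class Tier(NamedTuple):
    name: str
    min_agents: int
    max_agents: int
    approx_seconds: int


# Values copied from the tier table; durations are rough estimates.
TIERS: List[Tier] = [
    Tier("Light", 3, 3, 30),
    Tier("Standard", 5, 7, 180),
    Tier("Deep", 7, 12, 300),
]


def deepest_tier_within(budget_seconds: int) -> Optional[Tier]:
    """Return the most thorough tier whose approximate time fits the budget."""
    fitting = [t for t in TIERS if t.approx_seconds <= budget_seconds]
    if not fitting:
        return None  # even the Light tier would exceed the budget
    return max(fitting, key=lambda t: t.approx_seconds)
```

A `None` result corresponds to the case where no tier fits the available time, for example a query that needs an answer in well under 30 seconds.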

When to Use ThoughtProof

- High-stakes decisions: investment, legal, medical, compliance
- Audit trail needed: regulatory, governance, due diligence
- Blind spot detection: when you suspect a single model is biased
- Cross-domain questions: where no single model is expert

When NOT to Use

- Simple factual lookups (use a search engine)
- Creative writing (subjective, no "correct" answer)
- Time-sensitive queries that need an answer in under 30 seconds
- Questions with trivially verifiable answers

Architecture Note

ThoughtProof is BYOK (Bring Your Own Key). Your API keys, your data, your models. Nothing routes through ThoughtProof servers. The skill is MIT-licensed; the consensus protocol is BSL-licensed.

References

- references/block-format.md: Epistemic Block JSON schema
- references/consensus-protocol.md: How consensus is calculated
