Kalshalyst — Contrarian Prediction Market Scanner
Overview
Kalshalyst is a complete intelligence system for finding and trading prediction market opportunities. It combines:
- Claude Sonnet contrarian estimation: sees market prices and looks for reasons they are wrong
- Brier score tracking: measures how well your estimates calibrate against actual outcomes
- Kelly Criterion position sizing: calculates the optimal trade size for each opportunity
- Five-phase pipeline: FETCH → CLASSIFY → ESTIMATE → EDGE → ALERT
The key insight: blind estimation (not seeing prices) produces consensus-matching estimates with zero edge. Contrarian mode (showing Claude the price and asking it to disagree) produces opinionated, directional calls with real edge.
When to Use This Skill
- You want to find mispricings on Kalshi prediction markets
- You're looking for contrarian opportunities where the market is wrong
- You need to track how accurate your probability estimates are over time
- You want to size positions intelligently based on edge and confidence
- You're building a systematic prediction market trading system
First run does not require Kalshi credentials. If they are missing, Kalshalyst prints a realistic demo scan and writes demo cache data so downstream tools like Market Morning Brief still have something useful to show.
Requirements
API Keys & Credentials
1. Kalshi API Key (free at kalshi.com)
   - Sign up at https://kalshi.com
   - Navigate to Settings → API
   - Generate API credentials (key ID + private key file)
   - Cost: Free tier supports unlimited reads, small position limits
2. Anthropic API Key (for Claude Sonnet)
   - Create an account at https://console.anthropic.com
   - Generate an API key
   - The public reference implementation calls Anthropic directly
   - Budget: varies with scan volume; expect non-zero Claude cost if you run frequent scans
3. Polygon.io API Key (optional, free tier available)
   - Sign up at https://polygon.io
   - Free tier includes market data + basic news
   - Cost: Free tier is sufficient; paid plans cover higher volume
Python & Dependencies
- Python 3.10 or higher
- Required packages:
CODEBLOCK0
- Optional (for local fallback estimation):
  - Ollama (https://ollama.ai) with a Qwen model
  - Install from https://ollama.ai (macOS, Linux, Windows)
  - Then: ollama pull qwen3:latest
Configuration
Create or update your config.yaml file with:
CODEBLOCK1
The Five-Phase Pipeline
Phase 1: FETCH — Kalshi Market Discovery
Fetches all open markets from Kalshi and applies pre-filters:
Blocklist Filtering:
- Ticker prefixes: KXHIGH, KXLOW, KXRAIN, KXSNOW, INX, NASDAQ (weather, intraday noise)
- Category slugs: weather, climate, entertainment, sports, social-media
- Micro-timeframes: "in next 15 min", "in next 5 hours" (coin flips)
- Sports tokens: NFL, NBA, soccer, esports (blocked from the production stack)
Timeframe Gates:
- Minimum days to close: 7 (default)
- Maximum days to close: 365 (default)
- Markets without expiration dates are blocked (usually garbage)
Volume Floor:
- Minimum volume: 50 contracts (default)
- Filters out illiquid, low-interest markets
Price Availability:
- Must have at least one price signal (yes_bid, yes_ask, yes_price, or last_price)
- Resolves multiple price sources (bid/ask mid preferred)
Output: List of ~100-500 pre-filtered markets ready for analysis
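The gates above can be sketched as a single predicate over a market record. This is a minimal illustration, not the reference implementation: the dict keys (ticker, category, close_time, volume, yes_bid, yes_ask, yes_price, last_price) and the function names are assumptions chosen to mirror the description, and the real scanner may structure its data differently.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical constants mirroring the defaults described above.
BLOCKED_PREFIXES = ("KXHIGH", "KXLOW", "KXRAIN", "KXSNOW", "INX", "NASDAQ")
BLOCKED_CATEGORIES = {"weather", "climate", "entertainment", "sports", "social-media"}


def resolve_price(market):
    """Return a price in cents, preferring the bid/ask midpoint."""
    bid, ask = market.get("yes_bid"), market.get("yes_ask")
    if bid and ask:
        return (bid + ask) / 2
    # Otherwise fall back to whichever single price signal is present.
    for key in ("yes_price", "last_price", "yes_bid", "yes_ask"):
        if market.get(key):
            return float(market[key])
    return None


def passes_prefilter(market, now, min_days=7, max_days=365, min_volume=50):
    """Apply the blocklist, timeframe, volume, and price gates to one market."""
    if market["ticker"].upper().startswith(BLOCKED_PREFIXES):
        return False
    if market.get("category", "").lower() in BLOCKED_CATEGORIES:
        return False
    close_time = market.get("close_time")
    if close_time is None:  # markets without an expiration date are blocked
        return False
    days_left = (close_time - now).total_seconds() / 86400
    if not (min_days <= days_left <= max_days):
        return False
    if market.get("volume", 0) < min_volume:
        return False
    return resolve_price(market) is not None
```

Ordering the cheap string checks first means most noise is rejected before any date arithmetic or price resolution runs.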
Phase 2: CLASSIFY — Market Classification
Status: Disabled (Qwen unreliable) — Markets pass through with defaults.
When re-enabled, would use local Qwen to classify each market by:
- Category: politics, economics, crypto, policy, technology, etc.
- Tradability: 0.0-1.0 score (how analyzable with public info?)
- News sensitivity: True if breaking news would materially shift the probability
For now, all markets receive default classification values and proceed to Phase 3.
Phase 3: ESTIMATE — Claude Contrarian Probability Estimation
The core IP. Claude sees the market price and is asked to find reasons it's WRONG.
System Prompt (Contrarian Mode):
CODEBLOCK2
Context Enrichment:
- Recent news from Polygon.io (if configured)
- Macro indicators: S&P 500, Bitcoin, VIX proxy, Gold
- X/Twitter sentiment signals (if available)
- Market liquidity and volume
Fallback to Qwen:
- If Claude is unavailable (cooldown, network error), falls back to local Qwen
- Qwen runs blind (no price shown) to prevent anchoring
- Falls back to full Qwen batch mode after 3 consecutive Claude failures
Output: Estimated probability, confidence, reasoning, key factors
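The fallback rule (per-call fallback on failure, then full batch fallback after 3 consecutive failures) can be sketched as a small router. The class name, the callable signatures, and the returned "estimator" field are hypothetical; only the counting behavior comes from the description above.

```python
class EstimatorRouter:
    """Route estimates to a primary estimator with a blind local fallback."""

    def __init__(self, primary, fallback, max_failures=3):
        self.primary = primary      # assumed: callable(market, price) -> dict
        self.fallback = fallback    # assumed: callable(market) -> dict, never shown the price
        self.max_failures = max_failures
        self.consecutive_failures = 0

    @property
    def batch_fallback(self):
        """True once the primary has failed max_failures times in a row."""
        return self.consecutive_failures >= self.max_failures

    def estimate(self, market, price):
        if not self.batch_fallback:
            try:
                result = self.primary(market, price)
                self.consecutive_failures = 0  # any success resets the streak
                return {**result, "estimator": "claude"}
            except Exception:
                self.consecutive_failures += 1
        # The fallback runs blind: the price is deliberately not passed in.
        return {**self.fallback(market), "estimator": "qwen"}
```

Tagging each result with its estimator keeps the later Claude-versus-Qwen Brier breakdown possible.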
Phase 4: EDGE — Edge Calculation
For each estimate, calculates the edge (profit potential):
CODEBLOCK3
Why limit orders? You're a sophisticated trader who can post limit orders at midpoint or better, avoiding spread costs.
Filtering:
- Minimum effective edge: 3% (default, configurable)
- Minimum confidence: 0.4 (filter out uncertain estimates)
- Drop estimates with direction = "fair" (no edge)
Output: Ranked list of edges, highest first
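A minimal sketch of the edge calculation and filtering, assuming probabilities and prices are both expressed on a 0-1 scale and that the spread penalty is zero under the limit-order assumption. The function names, dict keys, and the "underpriced"/"overpriced" labels are illustrative; only "fair" is named in the text above.

```python
def compute_edge(estimated_prob, market_price, spread_penalty_pct=0.0):
    """Return direction plus raw and effective edge in percentage points."""
    raw_edge_pct = abs(estimated_prob - market_price) * 100
    if estimated_prob > market_price:
        direction = "underpriced"   # YES looks cheap
    elif estimated_prob < market_price:
        direction = "overpriced"    # NO looks cheap
    else:
        direction = "fair"
    return {
        "direction": direction,
        "raw_edge_pct": raw_edge_pct,
        "effective_edge_pct": raw_edge_pct - spread_penalty_pct,
    }


def rank_edges(estimates, min_edge_pct=3.0, min_confidence=0.4):
    """Drop fair, low-edge, and low-confidence estimates; sort the rest by edge."""
    ranked = []
    for est in estimates:
        edge = compute_edge(est["estimated_probability"], est["market_price"])
        if edge["direction"] == "fair":
            continue
        if edge["effective_edge_pct"] < min_edge_pct:
            continue
        if est["confidence"] < min_confidence:
            continue
        ranked.append({**est, **edge})
    return sorted(ranked, key=lambda e: e["effective_edge_pct"], reverse=True)
```

Keeping the spread penalty as a parameter makes it easy to model market orders later by passing a non-zero value.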
Phase 5: CACHE & ALERT
Cache Writing:
- Writes the research cache to .kalshi_research_cache.json (used by related commands)
- Writes detailed edge data to state/kalshalyst_results.json for analysis
Alerting:
- Filters opportunities by alert threshold (default: 6% effective edge)
- Sends top 3 opportunities to the user
- Format: ticker, direction (YES/NO), probability, edge, confidence, reasoning
Brier Tracking:
- Every estimate is logged to a SQLite database
- Computes info_density (news + X signals + economic context + liquidity)
- Used for later calibration analysis
Blocklist System
Complete Blocklist Reference
Ticker Prefixes (High-Volume Garbage):
CODEBLOCK4
Category Slugs (API-Level Blocking):
CODEBLOCK5
Micro-Timeframe Patterns (Coin Flips):
CODEBLOCK6
Sports Tokens (Blocked From The Production Stack):
- Major leagues: NFL, NBA, MLB, NHL, MLS, NCAA, PGA, UFC, WWE
- Soccer: Premier League, La Liga, Serie A, Bundesliga, Champions League, Copa
- Esports: Valorant, League of Legends, CS:GO, Dota, Overwatch
- Individual sports: ATP, WTA, Tennis, Boxing, MMA
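Title-based blocking of this kind is typically done with word-boundary regular expressions. The sketch below is an assumption about how such matching could work, not the shipped patterns: the regexes, the shortened token list, and the function name is_blocked_title are all illustrative.

```python
import re

# Hypothetical pattern for phrases like "in next 15 min" or "in next 5 hours".
MICRO_TIMEFRAME = re.compile(
    r"\bin\s+next\s+\d+\s*(?:min(?:ute)?s?|hours?|hrs?)\b", re.IGNORECASE
)

# A subset of the league tokens listed above, for illustration only.
SPORTS_TOKENS = ("NFL", "NBA", "MLB", "NHL", "MLS", "NCAA", "PGA", "UFC", "WWE")
SPORTS_RE = re.compile(r"\b(?:" + "|".join(SPORTS_TOKENS) + r")\b", re.IGNORECASE)


def is_blocked_title(title):
    """Return True if a market title matches a micro-timeframe or sports token."""
    return bool(MICRO_TIMEFRAME.search(title) or SPORTS_RE.search(title))
```

The word boundaries matter: without them a token such as NFL would also match inside "inflation" and silently block economics markets.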
Why These Filters?
- Weather + Intraday: Near-pure noise; impossible to extract edge
- Sports: Intentionally excluded. Recent evaluation did not show durable model edge, so sports are not part of the current production stack.
- Entertainment: Celebrity/social media volatility — not analyzable with Claude
- Micro-timeframe: Spreads dominate, zero informational edge
- Blocklist philosophy: Cut the bottom 80% of opportunities (noise) to focus Claude on the top 20% (signal)
Contrarian Estimation — Why It Works
The Problem with Blind Estimation
Blind mode (not showing Claude the market price):
- Claude produces "consensus" estimates
- Usually close to 50% for uncertain markets
- Results in zero edge (estimate ≈ market price)
- Not actionable
Why? Claude doesn't know what the market thinks, so it defaults to high uncertainty.
The Solution: Contrarian Mode
Contrarian mode (showing Claude the price):
- Claude sees the market price: "Market is priced at 35%"
- Prompt asks: "Is this price WRONG?"
- Claude identifies reasons for disagreement:
  - Missing recent news
  - Crowd psychology error
  - Timing mismatch
  - Base rate neglect
- Result: Opinionated directional call (e.g., 62% vs 35% market) → 27% edge
Example
Market: "Will Ukraine still be at war in 2026?"
Market Price: 72% (market implies YES very likely)
Recent Context: Leaked peace negotiations, US pushing settlement
Claude Contrarian Reasoning:
CODEBLOCK7
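Edge: |38% - 72%| = 34% effective edge → BUY NO at 28¢. If the 38% estimate is right, NO pays out with 62% probability, so each contract has an expected value of 62¢ against a 28¢ cost.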
Brier Score Tracking
What It Measures
Brier Score = (1/n) * Σ(forecast - outcome)²
- 0.0 = perfect estimates
- 0.25 = uninformed baseline (the score you get by always forecasting 50%)
- Above 0.25 = worse than guessing (miscalibrated)
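The formula above translates directly into a few lines of Python. This is a generic sketch of the Brier score rather than the project's own function; the name brier_score is illustrative, forecasts are probabilities in [0, 1], and outcomes are 1 for YES and 0 for NO.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must be non-empty and equal length")
    squared_errors = [(f - o) ** 2 for f, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)
```

Always forecasting 0.5 scores exactly 0.25 whatever the outcomes are, which is why 0.25 serves as the baseline.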
How It Works
1. Log Phase: Every edge scanner run logs all estimates to SQLite
   - Ticker, estimated probability, market price, confidence, estimator (Claude vs Qwen)
   - Category (politics, policy, crypto, etc.)
   - Edge percentage, info density (context richness score)
2. Resolve Phase: When markets close on Kalshi, log the outcome
   - Automatic: check_and_resolve_markets() polls the Kalshi API daily
   - Manual: INLINECODE5
3. Report Phase: get_brier_report() computes calibration
   - Overall Brier score
   - Breakdown by estimator (Claude vs Qwen accuracy)
   - Breakdown by category (politics, crypto, policy)
   - Calibration buckets: when you say 70%, does it resolve YES ~70%?
   - Win rate for "edge trades" (edge >= 4%)
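The calibration-bucket idea in the report phase can be illustrated with a short grouping function. This is a sketch under the assumption of fixed 10-point buckets; the function name and return shape are hypothetical and get_brier_report() may bucket differently.

```python
from collections import defaultdict


def calibration_buckets(forecasts, outcomes):
    """Group forecasts into 10-point buckets and report the observed YES rate.

    Returns {bucket_lower_bound_pct: (count, observed_yes_rate)}.
    """
    grouped = defaultdict(list)
    for forecast, outcome in zip(forecasts, outcomes):
        pct = round(forecast * 100)        # work in whole percent to avoid float drift
        lower = min(pct // 10, 9) * 10     # a 100% forecast falls in the 90 bucket
        grouped[lower].append(outcome)
    return {
        lower: (len(hits), sum(hits) / len(hits))
        for lower, hits in sorted(grouped.items())
    }
```

A well-calibrated estimator shows an observed YES rate inside each bucket's range, for example roughly 0.70-0.80 for the 70 bucket.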
Database Schema
CODEBLOCK8
Info Density Scoring
Measures how much context was available at estimation time (0.0-1.0):
- News articles (0.0-0.25): 0-3 Polygon articles → 0/0.15/0.25
- X signals (0.0-0.25): Corroborating social signal from X scanner
- Economic context (0.0-0.25): S&P 500, Bitcoin, VIX available
- Liquidity (0.0-0.25): Volume + open interest proxy
Higher density estimates should be better calibrated (more informed).
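One way to combine the four components is a simple additive score. The thresholds below are assumptions made for illustration: the article-count mapping (0 → 0, 1-2 → 0.15, 3+ → 0.25) is one reading of the description above, and the liquidity scaling with a hypothetical volume_cap of 1000 contracts is not taken from the source.

```python
def info_density(news_count, has_x_signal, has_econ_context, volume, volume_cap=1000):
    """Additive 0.0-1.0 context score; each component contributes at most 0.25."""
    # News: assumed step function over the number of Polygon articles found.
    if news_count <= 0:
        news = 0.0
    elif news_count < 3:
        news = 0.15
    else:
        news = 0.25
    x_signal = 0.25 if has_x_signal else 0.0
    econ = 0.25 if has_econ_context else 0.0
    # Liquidity: assumed linear scaling up to a hypothetical volume cap.
    liquidity = 0.25 * min(max(volume, 0) / volume_cap, 1.0)
    return round(news + x_signal + econ + liquidity, 4)
```

Logging this score with every estimate makes it possible to test later whether richer context actually produced better-calibrated forecasts.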
Calibration Alerts
Weekly check: get_calibration_alert()
- Identifies categories with Brier > 0.25 (worse than random)
- Suggests recalibrating estimator prompts for those categories
- Example: "Politics: Brier 0.31 (5 resolved) — systematically overconfident"
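A per-category check like this reduces to grouping squared errors by category and flagging any mean above the threshold. The sketch below assumes resolved rows arrive as (category, forecast, outcome) tuples; that shape and the function name are illustrative, not the signature of get_calibration_alert().

```python
from collections import defaultdict


def calibration_alerts(rows, threshold=0.25):
    """Return one alert string per category whose Brier score exceeds threshold."""
    squared_errors = defaultdict(list)
    for category, forecast, outcome in rows:
        squared_errors[category].append((forecast - outcome) ** 2)
    alerts = []
    for category, errors in sorted(squared_errors.items()):
        score = sum(errors) / len(errors)
        if score > threshold:
            alerts.append(f"{category}: Brier {score:.2f} ({len(errors)} resolved)")
    return alerts
```

With only a handful of resolved markets per category the score is noisy, so including the resolved count in the message helps the reader judge how much weight to give it.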
Kelly Criterion Position Sizing
The Math
For binary prediction markets, Kelly fraction tells you what % of bankroll to risk:
CODEBLOCK9
Example: Estimate YES at 65%, market prices at 50¢
CODEBLOCK10
With $200 bankroll → $60 bet = 120 contracts at 50¢
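The numbers in this example follow from the standard Kelly formula for a binary contract. Buying at price c (in dollars) with a $1 payout gives net odds b = (1 - c) / c, so f* = (p·b - q) / b simplifies to (p - c) / (1 - c). The sketch below reproduces the example under that formula; the function names are illustrative and the project's own code may be organized differently.

```python
def kelly_fraction(prob_win, price):
    """Full Kelly fraction for a binary contract bought at `price` (0-1 dollars)."""
    if not 0 < price < 1:
        raise ValueError("price must be strictly between 0 and 1")
    return (prob_win - price) / (1 - price)


def kelly_stake(prob_win, price, bankroll):
    """Return (dollars to risk, whole contracts) at full Kelly; never negative."""
    fraction = max(0.0, kelly_fraction(prob_win, price))
    stake = round(bankroll * fraction, 2)
    contracts = int(stake / price + 1e-9)  # small epsilon guards against float drift
    return stake, contracts


stake, contracts = kelly_stake(prob_win=0.65, price=0.50, bankroll=200)
print(stake, contracts)  # 60.0 120
```

For a NO position, pass the NO probability (1 minus the YES estimate) and the NO price (1 minus the YES price).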
Fractional Kelly (Conservative)
Full Kelly is risky with noisy estimates. Kalshalyst uses fractional Kelly (α = 0.25 default):
CODEBLOCK11
Same example, confidence = 0.7:
CODEBLOCK12
Hard Caps (Defense-in-Depth)
- Max contracts per trade: 100
- Max cost per trade: $25
- Max portfolio exposure: $100 (user-configurable)
- Minimum edge: 3% (below this, noise dominates)
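Fractional Kelly and the hard caps can be combined in one sizing function. This sketch assumes the fractional stake is bankroll × α × confidence × full Kelly, which is one common way to fold confidence in; the actual formula may differ, so the resulting numbers are illustrative only. The 0.2 confidence floor comes from the troubleshooting notes, and all parameter names are hypothetical.

```python
import math


def size_position(prob_win, price, bankroll, confidence, current_exposure=0.0,
                  alpha=0.25, min_confidence=0.2, min_edge=0.03,
                  max_contracts=100, max_cost=25.0, max_exposure=100.0):
    """Return the number of contracts to buy after fractional Kelly and hard caps."""
    edge = prob_win - price
    if edge < min_edge or confidence < min_confidence:
        return 0
    full_kelly = edge / (1 - price)
    # Assumed scaling: alpha and confidence both shrink the full Kelly stake.
    stake = bankroll * alpha * confidence * full_kelly
    # Hard caps: per-trade cost and remaining portfolio exposure.
    remaining = max(0.0, max_exposure - current_exposure)
    stake = min(stake, max_cost, remaining)
    contracts = math.floor(stake / price + 1e-9)
    return max(0, min(contracts, max_contracts))
```

Under these assumptions the earlier example (65% estimate, 50¢ price, $200 bankroll, confidence 0.7) sizes to a $10.50 stake, or 21 contracts, well inside every cap.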
Configuration
CODEBLOCK13
Output Format
CODEBLOCK14
Scheduling & Deployment
Command-Line Usage
CODEBLOCK15
As a Cron Job (Every 60 Minutes)
CODEBLOCK16
As an OpenClaw Scheduled Task
CODEBLOCK17
Docker Deployment
CODEBLOCK18
Example Output
Alert Message
CODEBLOCK19
Research Cache (Programmatic Access)
CODEBLOCK20
API Reference
Main Functions
CODEBLOCK21
Troubleshooting
Claude Unavailable (Rate Limited / Cooldown)
Kalshalyst automatically falls back to Qwen after 3 consecutive Claude failures. Check:
- Anthropic API key validity and quota
- Network connectivity
- Rate limits (Claude has usage-based limits)
To force Qwen fallback immediately:
CODEBLOCK22
No Markets Passing Filters
If "no markets passed filters":
- Kalshi API may be down (check https://status.kalshi.com)
- All markets may be blocklisted (verify blocklist config)
- Check your network — fetch may have timed out
To debug:
CODEBLOCK23
Brier Score Not Computing
Brier score requires market resolutions. If none available:
1. Wait for markets to close (usually 24-48 hours)
2. Manually log resolutions: INLINECODE8
3. Check database: INLINECODE9
Position Sizing Returns 0 Contracts
Possible causes:
- Edge below 3% minimum (check min_edge_for_sizing)
- Confidence below 0.2 threshold
- Bankroll too low or exposure at limit
- Kelly fraction negative (bad odds for the bet)
Verify with:
CODEBLOCK24
Performance & Cost
Typical Run Metrics
- Runtime: 2-4 minutes (50 markets, Claude estimation)
- API calls:
  - Kalshi: 1-10 calls (market fetch, pagination)
  - Claude: 50-80 calls (estimate_batch)
  - Polygon: 2 calls (economic indicators, every 12 hours)
- Cost:
  - Claude: variable by model and usage volume
  - Polygon: ~$0 (free tier)
  - Kalshi: $0 (read-only)
Scaling
For scheduled operation, Claude spend scales directly with your scan frequency and model selection. If you want a zero-API-cost fallback, keep Ollama/Qwen available and treat it as a lower-quality backup path rather than the primary estimator.
OpenClaw Ecosystem Integration
Kalshalyst is the intelligence engine of the Prediction Market Trading Stack. It feeds edge data to other skills:
| Connected Skill | What It Gets From Kalshalyst |
|---|---|
| Market Morning Brief | Top edges appear in your daily morning digest |
| Kalshi Command Center | Execute trades directly on flagged opportunities |
| Portfolio Drift Monitor | Track positions opened on Kalshalyst recommendations |
Install the complete stack:
CODEBLOCK25
Implementation Notes
Battle-tested in production trading environments. Key design decisions:
1. Generic config loading: works with any OpenClaw config.yaml
2. Direct alerts: sends opportunities directly to you, no middleware
3. All thresholds, blocklists, and prompts refined through live trading
4. Scripts are standalone: zero external dependencies beyond the listed packages
Further Reading
- See references/contrarian-prompt.md for the full Claude system prompt with explanation
- See references/blocklist.md for complete blocklist documentation
- See references/brier-schema.md for database schema and methodology
- See references/kelly-math.md for the Kelly Criterion math derivation
Support & Iteration
Kalshalyst is actively maintained. Common iteration paths:
1. Contrarian Prompt Tuning: Refine the system prompt to improve edge quality
2. Blocklist Refinement: Add new ticker prefixes or patterns as you encounter noise
3. Category Recalibration: Use Brier reports to identify systematically miscalibrated categories
4. Kelly Parameters: Adjust alpha, max contracts, and exposure limits based on risk tolerance
5. Fallback Estimators: Replace Qwen with custom local models or other APIs
See the GitHub issues (if published) or contact the maintainer for feature requests.
Agent Bug-Fix Protocol
CRITICAL: When you (the agent) discover and fix a bug in any script during a conversation, you MUST write the fix to the source file on disk using file editing tools. Showing the fix in chat is NOT sufficient — the next scheduled run starts a fresh session and will not have your conversational fix.
Checklist after finding a bug:
1. Edit the actual .py file on disk (e.g., ~/skills/kalshalyst/scripts/*.py)
2. Also update the installed copy at INLINECODE17
3. Confirm both files are written before telling the user the fix is done
4. If the fix is significant, stage a git commit in INLINECODE18
Never say "fixed" if you only demonstrated the fix in conversation output.
Feedback & Issues
Found a bug? Have a feature request? Want to share results?
Part of the OpenClaw Prediction Stack — the first prediction market skill suite on ClawHub.
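Kalshalyst: Contrarian Prediction Market Scanner
Overview
Kalshalyst is a complete intelligence system for finding and trading prediction market opportunities. It combines:
- Claude Sonnet contrarian estimation: sees market prices and looks for reasons they are wrong
- Brier score tracking: measures how well your estimates calibrate against actual outcomes
- Kelly Criterion position sizing: calculates the optimal trade size for each opportunity
- Five-phase pipeline: FETCH → CLASSIFY → ESTIMATE → EDGE → ALERT
The key insight: blind estimation (not seeing prices) produces consensus-matching estimates with zero edge. Contrarian mode (showing Claude the price and asking it to disagree) produces opinionated, directional calls with real edge.
When to Use This Skill
- You want to find mispricings on Kalshi prediction markets
- You're looking for contrarian opportunities where the market is wrong
- You need to track how accurate your probability estimates are over time
- You want to size positions intelligently based on edge and confidence
- You're building a systematic prediction market trading system
First run does not require Kalshi credentials. If they are missing, Kalshalyst prints a realistic demo scan and writes demo cache data so downstream tools like Market Morning Brief still have something useful to show.
Requirements
API Keys & Credentials
1. Kalshi API Key (free at kalshi.com)
   - Sign up at https://kalshi.com
   - Navigate to Settings → API
   - Generate API credentials (key ID + private key file)
   - Cost: Free tier supports unlimited reads, small position limits
2. Anthropic API Key (for Claude Sonnet)
   - Create an account at https://console.anthropic.com
   - Generate an API key
   - The public reference implementation calls Anthropic directly
   - Budget: varies with scan volume; expect non-zero Claude cost if you run frequent scans
3. Polygon.io API Key (optional, free tier available)
   - Sign up at https://polygon.io
   - Free tier includes market data and basic news
   - Cost: Free tier is sufficient; paid plans cover higher volume
Python & Dependencies
    pip install kalshi-python requests anthropic pyyaml
- Ollama (https://ollama.ai) with a Qwen model
- Install from https://ollama.ai (macOS, Linux, Windows)
- Then: ollama pull qwen3:latest
Configuration
Create or update your config.yaml file with: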
    kalshi:
      enabled: true
      api_key_id: your-key-id-here
      private_key_file: path/to/private.key
      ticker_names: {}  # optional: custom display names for tickers
    anthropic:
      api_key: sk-ant-...
    polygon:
      api_key: pk_...  # optional
    kalshalyst:
      enabled: true
      check_interval_minutes: 60
      min_volume: 50
      min_days_to_close: 7
      max_days_to_close: 365
      max_pages: 10
      max_markets_to_analyze: 50
      min_edge_pct: 3.0
      min_qwen_confidence: 0.4
      alert_edge_pct: 6.0
      max_alerts: 5
      max_fetch_seconds: 30
      page_timeout_seconds: 8
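The Five-Phase Pipeline
Phase 1: FETCH - Kalshi Market Discovery
Fetches all open markets from Kalshi and applies pre-filters:
Blocklist Filtering:
- Ticker prefixes: KXHIGH, KXLOW, KXRAIN, KXSNOW, INX, NASDAQ (weather, intraday noise)
- Category slugs: weather, climate, entertainment, sports, social-media
- Micro-timeframes: in next 15 min, in next 5 hours (coin flips)
- Sports tokens: NFL, NBA, soccer, esports (blocked from the production stack)
Timeframe Gates:
- Minimum days to close: 7 (default)
- Maximum days to close: 365 (default)
- Markets without expiration dates are blocked (usually garbage)
Volume Floor:
- Minimum volume: 50 contracts (default)
- Filters out illiquid, low-interest markets
Price Availability:
- Must have at least one price signal (yes_bid, yes_ask, yes_price, or last_price)
- Resolves multiple price sources (bid/ask mid preferred)
Output: List of ~100-500 pre-filtered markets ready for analysis
Phase 2: CLASSIFY - Market Classification
Status: Disabled (Qwen unreliable). Markets pass through with defaults.
When re-enabled, would use local Qwen to classify each market by:
- Category: politics, economics, crypto, policy, technology, etc.
- Tradability: 0.0-1.0 score (how analyzable with public info?)
- News sensitivity: True if breaking news would materially shift the probability
For now, all markets receive default classification values and proceed to Phase 3.
Phase 3: ESTIMATE - Claude Contrarian Probability Estimation
The core IP. Claude sees the market price and is asked to find reasons it's wrong.
System Prompt (Contrarian Mode):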
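You are a contrarian prediction market analyst. You look for reasons the market is wrong.
Your job: given a prediction market and its current price, determine whether there is a directional opportunity.
You are advising a sophisticated trader who uses limit orders.
Key rules:
1. You will see the current market price. Your job is to disagree with it when there is reason to.
2. Do not just confirm the market. That is worthless. Look for factors the market is missing or lagging on.
3. Consider: breaking news the market has not yet priced in, shifting political dynamics, timing mismatches,
   crowd psychology errors, base rates the market is ignoring.
4. Be opinionated. A 50% estimate on a 50% market is useless. Either find a reason it is wrong,
   or say confidence is low.
5. Weight recent developments heavily. Markets are often slow to react to news from the past 24-48 hours.
6. Consider asymmetric upside: where is being wrong cheap but being right highly rewarded?
You must respond with a JSON object only:
{
estimated_probability: <float 0.01-0.99>,
confidence: <float 0.0-1.0>,
reasoning: <one sentence explaining why the market is wrong>,
key_factors: [<factor 1>, <factor 2>, <factor 3>],
conviction:
}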
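Context Enrichment:
- Recent news from Polygon.io (if configured)
- Macro indicators: S&P 500, Bitcoin, VIX proxy, Gold
- X/Twitter sentiment signals (if available)
- Market liquidity and volume
Fallback to Qwen:
- If Claude is unavailable (cooldown, network error), falls back to local Qwen
- Qwen runs blind (no price shown) to prevent anchoring
- Falls back to full Qwen batch mode after 3 consecutive Claude failures
Output: Estimated probability, confidence, reasoning, key factors
Phase 4: EDGE - Edge Calculation
For each estimate, calculates the edge (profit potential):
Raw edge = |estimated probability - market price| * 100%
Direction = underpriced if estimate > market, overpriced if estimate < market, otherwise fair
Effective edge = raw edge - 0.0% (limit-order assumption: no spread penalty)
Why limit orders? You're a sophisticated trader who can post limit orders at midpoint or better, avoiding spread costs.
Filtering:
- Minimum effective edge: 3% (default, configurable)
- Minimum confidence: 0.4 (filter out uncertain estimates)
- Drop estimates with direction = fair (no edge)
Output: Ranked list of edges, highest first
Phase 5: CACHE & ALERT
Cache Writing:
- Writes the research cache to .kalshi_research_cache.json (used by related commands)
- Writes detailed edge data to state/kalshalyst_results.json for analysis
Alerting:
- Filters opportunities by alert threshold (default: 6% effective edge)
- Sends top 3 opportunities to the user
- Format: ticker, direction (YES/NO), probability, edge, confidence, reasoning
Brier Tracking: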