MLB Desk — Strategy Specification
Version 3.0 | Updated: 2026-04-03
Status: DEPLOYED (shadow mode for composite formula)
Overview
The MLB desk identifies mispricings between Kalshi prediction market prices and Pinnacle's de-vigged fair value across moneyline, run line, totals, first 5 innings, and team total markets. Edge detection uses Negative Binomial distribution models with SP-conditional adjustments.
MLB-Specific Characteristics
- Starting Pitcher Dominance — Controls 55-70% of game outcome variance; all models are SP-conditional
- Fixed Run Lines (-1.5) — Walk-off correction required (home teams stop batting when leading)
- Environmental Factors — Park factors, wind, temperature, humidity, altitude, roof status, umpire zones
- 162-Game Schedule — Fatigue, rest patterns, travel effects create daily exploitable variance
- ABS Challenge System (2026) — New automated strike zone; tracking raw data, models deferred to April 20+
- Platoon Splits — 30-50 point wOBA swings on L/R matchups at team level
Distribution Models (from Council Ruling Apr 1)
| Market | Distribution | Key Parameter |
|---|---|---|
| Moneyline | Negative Binomial (each team's runs) | mu from SPQC x batting x park x weather; k from team variance |
| Run Line (-1.5) | NB + walk-off correction (-2.5%) | Home teams don't bat bottom 9th when leading |
| Totals | NB (combined runs) | Weather primary driver; dispersion ratio = 1.15 |
| First 5 Innings | Modified NB (SP-only, 57% scoring fraction) | No bullpen component; innings 1-5 rates only |
| Team Totals | NB (per-team runs) | Derived from Pinnacle spread + game total |
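As an illustration of the moneyline row, the sketch below turns two expected run totals into a home win probability using Negative Binomial run distributions. It is a minimal sketch under stated assumptions, not the desk's implementation: the function names are hypothetical, the two teams' runs are treated as independent, the size parameter is derived from the 1.15 dispersion ratio rather than from team variance, and tied scores are renormalized away because games cannot end tied.

```python
import math


def nb_pmf(runs: int, mu: float, ratio: float = 1.15) -> float:
    """P(X = runs) for a Negative Binomial with mean mu and variance ratio * mu."""
    p = 1.0 / ratio           # success probability implied by variance/mean
    n = mu / (ratio - 1.0)    # size parameter (assumed to play the role of k)
    log_pmf = (
        math.lgamma(runs + n) - math.lgamma(n) - math.lgamma(runs + 1)
        + n * math.log(p) + runs * math.log(1.0 - p)
    )
    return math.exp(log_pmf)


def home_win_probability(mu_home: float, mu_away: float,
                         ratio: float = 1.15, max_runs: int = 40) -> float:
    """Home win probability assuming independent NB run totals (assumption)."""
    home = [nb_pmf(x, mu_home, ratio) for x in range(max_runs + 1)]
    away = [nb_pmf(x, mu_away, ratio) for x in range(max_runs + 1)]
    p_home = sum(home[h] * away[a] for h in range(max_runs + 1) for a in range(h))
    p_tie = sum(home[i] * away[i] for i in range(max_runs + 1))
    # Regulation ties go to extra innings; here they are split in proportion
    # to the non-tie outcomes, which is a simplification.
    return p_home / (1.0 - p_tie)
```

For example, `home_win_probability(4.8, 4.2)` returns the home side's fair probability when the home mu is 4.8 runs and the away mu is 4.2 runs.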
Edge Detection
- De-vig method: Shin + Power (both computed, Shin primary for multi-way)
- Minimum edge: 4 cents after Kalshi 7% fee
- Minimum SP sample: 5+ starts this season (early-season gate)
- NB dispersion ratio: 1.15 (MLB runs variance/mean)
- Walk-off correction: -2.5% on home -1.5 cover probability
- Run differential sigma: 3.8 (for spread/ML normal approximation)
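The de-vig and minimum-edge steps above can be sketched as follows. This is a hedged example: only the Power method is shown (Shin is omitted), the function names are hypothetical, and the 7% fee is assumed to be charged on the profit of a winning contract, in line with the "exchange profit fee" description in Key Constants.

```python
def devig_power(implied: list[float], iterations: int = 200) -> list[float]:
    """Power de-vig: find k so that sum(q ** k) == 1, then return q ** k.

    `implied` holds raw implied probabilities (1 / decimal odds), each in (0, 1).
    """
    lo, hi = 0.0, 50.0  # sum(q ** k) falls from len(implied) toward 0 as k grows
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if sum(q ** mid for q in implied) > 1.0:
            lo = mid
        else:
            hi = mid
    k = (lo + hi) / 2.0
    return [q ** k for q in implied]


def net_edge_cents(fair_prob: float, price_cents: float,
                   fee_rate: float = 0.07) -> float:
    """Expected value in cents of buying one 100-cent contract at price_cents.

    Assumption: the fee is taken only from the profit on a winning contract.
    """
    win = fair_prob * (100.0 - price_cents) * (1.0 - fee_rate)
    loss = (1.0 - fair_prob) * price_cents
    return win - loss


def passes_edge_gate(fair_prob: float, price_cents: float,
                     min_edge_cents: float = 4.0) -> bool:
    """True when the net edge clears the 4-cent minimum."""
    return net_edge_cents(fair_prob, price_cents) >= min_edge_cents
```

With a fair probability of 0.60 and a Kalshi price of 50 cents, the net edge under these assumptions is 0.60 x 50 x 0.93 - 0.40 x 50 = 7.9 cents, which clears the 4-cent gate.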
Derived Metrics (DEPLOYED)
| Metric | Formula | Status |
|---|---|---|
| SPQC (SP Quality Composite) | 0.6 x FIP + 0.2 x ERA + 0.2 x EWMA_ERA | DEPLOYED (using FIP for xFIP slot until xFIP available) |
| BAI (Bullpen Availability Index) | 0-100: closer(30) + setup(20) + stress(30) + workload(20) | DEPLOYED |
| WRF (Weather Run Factor) | park_factor + temp_adj + wind_adj + altitude_adj | DEPLOYED |
| Platoon Advantage | team OPS vs SP hand - team season OPS | DEPLOYED (ALL splits; L/R when available) |
| FTTO Decay Rate | SP innings 1-3 vs 4-5 performance | NOT YET (needs Baseball Savant per-inning data) |
| Lineup Strength Delta | Actual lineup wRC+ - projected wRC+ | NOT YET (needs per-player wRC+ in lineup table) |
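The SPQC row can be illustrated with a short sketch. The names are hypothetical, and two points are assumptions rather than facts from the spec: EWMA_ERA is computed over a list of per-start ERA values ordered oldest first, and the 5-start early-season gate is applied by returning `None`.

```python
from typing import Optional


def ewma(values: list[float], alpha: float) -> float:
    """Exponentially weighted moving average, oldest value first."""
    smoothed = values[0]
    for value in values[1:]:
        smoothed = alpha * value + (1.0 - alpha) * smoothed
    return smoothed


def spqc(fip: float, era: float, ewma_era: float) -> float:
    """SP Quality Composite: 0.6 x FIP + 0.2 x ERA + 0.2 x EWMA_ERA."""
    return 0.6 * fip + 0.2 * era + 0.2 * ewma_era


def spqc_for_pitcher(fip: float, era: float, per_start_era: list[float],
                     alpha: float = 0.12, min_starts: int = 5) -> Optional[float]:
    """Return SPQC, or None when the pitcher has fewer than min_starts starts."""
    if len(per_start_era) < min_starts:
        return None  # early-season gate: not enough sample
    return spqc(fip, era, ewma(per_start_era, alpha))
```

Because the spec tracks three EWMA alphas at once (0.10, 0.12, 0.15), a caller could evaluate `spqc_for_pitcher` once per alpha and store all three results.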
Quality Composite Formula (100 points) — SHADOW MODE
| Component | Weight | Sub-formula |
|---|---|---|
| SP Quality Delta | 32% | 30% xFIP + 25% K-BB% + 20% Stuff+ + 15% xwOBA-against + 10% rolling L10 xFIP |
| Team Offense vs SP Hand | 25% | PA-weighted confirmed lineup against opposing SP hand |
| Bullpen Quality Delta | 18% | Recent FIP + K-BB% with fatigue-adjusted usage index |
| Environmental Factor | 10% | Game-specific park factor adjusted for team profile |
| Situational Edge | 10% | Compound fatigue: (day_after_night x 2) + (timezone x 1) + (series_game x 0.5) |
| Defense + Framing | 5% | OAA + catcher framing (reduced in ABS era) |
Status: Formula defined but running in shadow mode until 100+ games validate the weights.
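A minimal sketch of how the component weights and the compound fatigue term could be combined. The spec does not say how component scores are scaled, so this example assumes each one has already been normalized to the range 0 to 1, which makes the composite land on a 0 to 100 scale. All names are hypothetical.

```python
# Component weights from the table above; they sum to 100 points.
WEIGHTS = {
    "sp_quality": 32,
    "team_offense": 25,
    "bullpen_quality": 18,
    "environmental": 10,
    "situational": 10,
    "defense_framing": 5,
}


def compound_fatigue(day_after_night: int, timezone: float,
                     series_game: float) -> float:
    """(day_after_night x 2) + (timezone x 1) + (series_game x 0.5)."""
    return day_after_night * 2 + timezone * 1 + series_game * 0.5


def quality_composite(scores: dict[str, float]) -> float:
    """Weighted sum of component scores, each assumed to lie in [0, 1]."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
```

How the raw fatigue value maps onto the 0 to 1 situational score is not defined in the spec and is left to the caller here.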
F5 Mini-Composite
55% SP Quality + 30% Team Offense + 10% Environmental + 5% Defense (no bullpen).
Edge Tier Classification
| Tier | Criteria | Action |
|---|---|---|
| S | Composite > 65 + edge > 5% | Maximum confidence |
| A | Composite > 55 + edge > 3% | Standard unit |
| B | Composite > 45 + edge > 2% | Half unit |
| C | 35-45 OR edge 1-2% | Track only |
| D | < 35 + edge < 1% | Skip |
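The tier table can be read as an ordered set of checks, sketched below. The function name is hypothetical. The table leaves some combinations uncovered (for example a composite of 70 with a 0.5% edge), and this sketch returns `None` for those rather than guessing a tier; the inclusive bounds on the C tier are also an assumption.

```python
from typing import Optional


def classify_tier(composite: float, edge_pct: float) -> Optional[str]:
    """Map a composite score (0-100) and an edge in percent to a tier letter."""
    if composite > 65 and edge_pct > 5:
        return "S"  # maximum confidence
    if composite > 55 and edge_pct > 3:
        return "A"  # standard unit
    if composite > 45 and edge_pct > 2:
        return "B"  # half unit
    if 35 <= composite <= 45 or 1 <= edge_pct <= 2:
        return "C"  # track only
    if composite < 35 and edge_pct < 1:
        return "D"  # skip
    return None     # combination not covered by the table
```

Checking the tiers from S downward means a bet that qualifies for several tiers is assigned the highest one.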
Position Sizing
- Single bet max: 3% bankroll
- Per-game max: 8% bankroll
- Daily max: 15% bankroll
- Kelly fraction: 1/4
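The sizing limits above can be combined with quarter-Kelly as in the sketch below. It is an illustration under assumptions: the function name and the exposure-tracking arguments are hypothetical, the bet is a binary contract bought at `price` (on a 0 to 1 scale), and fees are ignored in the Kelly calculation.

```python
def stake_fraction(fair_prob: float, price: float,
                   game_exposure: float = 0.0, daily_exposure: float = 0.0,
                   kelly_multiplier: float = 0.25,
                   single_max: float = 0.03, game_max: float = 0.08,
                   daily_max: float = 0.15) -> float:
    """Fraction of bankroll to stake on one binary contract.

    Full Kelly for a contract paying 1 at cost `price` is
    (fair_prob - price) / (1 - price); fees are ignored (assumption).
    """
    full_kelly = (fair_prob - price) / (1.0 - price)
    if full_kelly <= 0.0:
        return 0.0  # no positive edge, no bet
    stake = kelly_multiplier * full_kelly
    # Apply the single-bet, per-game, and daily caps in one step.
    stake = min(stake, single_max,
                game_max - game_exposure,
                daily_max - daily_exposure)
    return max(stake, 0.0)
```

For a fair probability of 0.60 at a price of 0.50, full Kelly is 0.20 and quarter-Kelly is 0.05, so the 3% single-bet cap binds.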
Collection Schedule
| Data | Frequency | Time (ET) |
|---|---|---|
| SP stats, team batting, bullpen | Daily | 11:00 AM |
| Lineups | Continuous | 6 AM - lock |
| Pinnacle odds | Adaptive | 2h/15m/5m cadence |
| Kalshi prices | Every 30 min | Continuous |
| SBR multi-book | 4x daily | 8/10 AM, 2/6 PM |
| Weather | Pre-game | 2h + 30min before |
| Pregame sharp money | 3x daily | 10 AM, 2 PM, 6 PM |
| DRatings/Sagarin/Dimers/GameSim | Daily | 11 AM / 12 PM |
| Edge scanners | 4x daily | 8/10 AM, 2/6 PM |
| Live scores | Every 60s | During games |
| Results | Post-game | ~3 AM next day |
Key Constants
| Constant | Value | Source |
|---|---|---|
| NB dispersion ratio | 1.15 | MLB variance/mean (Tango) |
| Walk-off correction | -2.5% | Opus analysis (council ruling) |
| F5 scoring fraction | 0.57 | Published research (5/9 + early SP efficiency) |
| Run differential sigma | 3.8 | MLB historical std dev |
| League avg runs/team | 4.5 | 2024-2025 run environment |
| League avg ERA | 4.20 | 2024-2025 |
| Kalshi fee rate | 7% | Exchange profit fee |
| Min net edge | 4 cents | After fee |
| EWMA alphas | 0.10, 0.12, 0.15 | Three tracked simultaneously (council ruling) |
| Bullpen fatigue: B2B | -0.25 xG | Council ruling |
| Bullpen fatigue: 3-in-4 | -0.35 xG | Council ruling |
| Temperature adj | +0.01 runs per degree above 72F | Standard |
| Altitude adj | +0.05 runs per 1000ft above 500ft | Standard |
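The temperature and altitude constants translate directly into run adjustments, as in this small sketch. The function names are hypothetical, and the sketch assumes no adjustment is applied below the 72F and 500ft thresholds, since the table only specifies values above them.

```python
def temperature_adj(temp_f: float, baseline_f: float = 72.0,
                    runs_per_degree: float = 0.01) -> float:
    """+0.01 runs per degree above 72F; zero at or below the baseline (assumption)."""
    return runs_per_degree * max(0.0, temp_f - baseline_f)


def altitude_adj(altitude_ft: float, baseline_ft: float = 500.0,
                 runs_per_1000ft: float = 0.05) -> float:
    """+0.05 runs per 1000ft above 500ft; zero at or below the baseline (assumption)."""
    return runs_per_1000ft * max(0.0, altitude_ft - baseline_ft) / 1000.0
```

For instance, an 82F game adds 0.10 runs, and a park at 5,200 ft adds 0.05 x 4.7 = 0.235 runs. These two values are the `temp_adj` and `altitude_adj` terms of the WRF formula.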
Data Sources (17 Active)
| Source | Type | Cost |
|---|---|---|
| MLB Stats API | Schedule, lineups, rosters, transactions | Free |
| Pinnacle (stealth scraper) | Sharp odds anchor | Free |
| Kalshi REST API | Prediction market prices | Free |
| Odds API | DK/FanDuel opening lines | $20-30/mo |
| SBR | Multi-book consensus | Free (scraped) |
| Baseball Reference | Team batting/pitching stats | Free (scraped) |
| Baseball Savant | Statcast metrics | Free (scraped) |
| MoneyPuck/DRatings | Model predictions | Free (scraped) |
| Sagarin/Dimers/GameSim | Independent predictions | Free (scraped) |
| UmpScorecards | Umpire accuracy metrics | Free (scraped) |
| ESPN | Live scores | Free |
| Pregame.com | Sharp money flow | Free (scraped) |
| Open-Meteo | Weather (backup to NWS for US parks) | Free |
Future / TODO
- ABS Challenge System models — tracking raw data now, models after April 20+
- FanGraphs Steamer/Stuff+/PitchingBot — requires $6/mo subscription
- Quality Composite live mode — after 100+ games shadow validation
- xFIP in SPQC — currently using FIP as proxy
- FTTO Decay Rate — needs per-inning pitch data from Baseball Savant
- Lineup Strength Delta — needs per-player wRC+ integration
- Live in-game 1-minute snapshots — requires WebSocket upgrade
- Ballpark Pal — $40/yr for game-specific park factors
- Polymarket integration — deferred (not available to US customers)
- Bullpen Usage Index (leverage-weighted) — formula defined, implementation pending
- Inverse-MSE model weighting — after 100+ games of Brier score data
Source: ~/edgeclaw/results/spec-panel/mlb-desk/spec-final.md