Tennis Model Anchor Calibration — Implementation Ruling

Date: 2026-04-03 | Status: IMPLEMENTED

Q1: Predictive Player Metrics

Answer: SA-Elo (Surface-Adjusted Elo) as primary strength signal, SPW/RPW (Serve/Return Points Won) as the atomic pricing input for the Markov chain. Hold probability derived via O'Malley formula from SPW. Log5 adjusts SPW for opponent quality.

Implementation: tennis_custom_metrics table stores per-player per-surface: avg_spw, avg_rpw, hold_prob, elo_on_surface, clutch_index. Markov chain uses SPW/RPW through Log5 → gameWinProb → setMarkov → matchProb.
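
The SPW → hold probability step can be sketched with the standard closed-form game-win probability for independent points. This is a minimal illustration only: the function name `holdProb` is hypothetical, and whether tennis-markov.ts uses this exact algebraic form of the O'Malley formula is an assumption.

```typescript
// Probability that the server wins a service game, given p = probability of
// winning a single point on serve. Assumes points are independent and
// identically distributed.
function holdProb(p: number): number {
  const q = 1 - p;
  // Server wins to love, to 15, or to 30: 1, 4, and 10 point orderings.
  const beforeDeuce = Math.pow(p, 4) * (1 + 4 * q + 10 * q * q);
  // Reach deuce (3-3 in points) in 20 orderings, then win from deuce with
  // probability p^2 / (1 - 2pq).
  const fromDeuce = (p * p) / (1 - 2 * p * q);
  const viaDeuce = 20 * Math.pow(p, 3) * Math.pow(q, 3) * fromDeuce;
  return beforeDeuce + viaDeuce;
}

// Example call with a hypothetical Log5-adjusted serve probability.
const exampleHold = holdProb(0.62);
console.log(exampleHold.toFixed(3));
```

Because hold probability is a nonlinear function of SPW, the hold probability of an average SPW generally differs from the average of per-player hold probabilities.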

Q2: Surface Adjustment

Answer: Per-surface tour averages derived from data:

| Surface | Tour Avg SPW | Tour Avg RPW | Tour Avg Hold |
|---------|--------------|--------------|---------------|
| Hard    | 0.590        | 0.387        | 0.706         |
| Clay    | 0.579        | 0.411        | 0.683         |
| Grass   | 0.622        | 0.380        | 0.766         |

Log5 denominator uses surface-specific RPW average: pAServe = SPW_A × (1 - RPW_B) / (1 - surfaceAvgRPW).

Implementation: SURFACE_AVG_SPW and SURFACE_AVG_RPW constants in tennis-markov.ts. All players rated per-surface in tennis_custom_metrics.
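
A minimal sketch of the opponent adjustment, using the formula and the surface RPW averages stated above. The function name `adjustedServeProb` and the decision to apply the pServe clamp [0.45, 0.78] directly after the adjustment are assumptions, not confirmed details of tennis-markov.ts.

```typescript
type Surface = "Hard" | "Clay" | "Grass";

// Tour-average return points won per surface (values from the table above).
const SURFACE_AVG_RPW: Record<Surface, number> = {
  Hard: 0.387,
  Clay: 0.411,
  Grass: 0.38,
};

// Clamp bounds for the adjusted serve probability (assumed to apply here).
const P_SERVE_MIN = 0.45;
const P_SERVE_MAX = 0.78;

// pAServe = SPW_A * (1 - RPW_B) / (1 - surfaceAvgRPW), then clamped.
function adjustedServeProb(spwA: number, rpwB: number, surface: Surface): number {
  const raw = (spwA * (1 - rpwB)) / (1 - SURFACE_AVG_RPW[surface]);
  return Math.min(P_SERVE_MAX, Math.max(P_SERVE_MIN, raw));
}

// Against a tour-average returner the server's SPW is left unchanged.
console.log(adjustedServeProb(0.59, 0.387, "Hard").toFixed(3));
```

A returner who wins more return points than the surface average pulls the server's probability down; a weaker returner pushes it up, subject to the clamp.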

Q3: Rating Stabilization (Minimum Matches)

Answer: Surface-dependent minimums based on schedule density:

| Surface | Min Matches | Rationale                     |
|---------|-------------|-------------------------------|
| Hard    | 15          | Year-round, highest volume    |
| Clay    | 10          | Concentrated in spring/summer |
| Grass   | 5           | Very short season (4-6 weeks) |

Implementation: SURFACE_MIN_MATCHES in tennis-markov.ts. Scanner skips players below threshold.
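
The threshold check might look like the sketch below. The `SurfaceSample` shape and the `hasStableRating` helper are hypothetical; only the constant name and its values come from this ruling, and treating a player exactly at the minimum as eligible is an assumption.

```typescript
type Surface = "Hard" | "Clay" | "Grass";

// Minimum matches on a surface before a rating is considered stable.
const SURFACE_MIN_MATCHES: Record<Surface, number> = {
  Hard: 15,
  Clay: 10,
  Grass: 5,
};

// Hypothetical per-player, per-surface record.
interface SurfaceSample {
  playerId: string;
  surface: Surface;
  matches: number;
}

// True when the player has enough matches on the surface to be priced.
function hasStableRating(sample: SurfaceSample): boolean {
  return sample.matches >= SURFACE_MIN_MATCHES[sample.surface];
}

const samples: SurfaceSample[] = [
  { playerId: "p1", surface: "Hard", matches: 14 },
  { playerId: "p2", surface: "Grass", matches: 5 },
];
// The scanner would skip anything filtered out here.
const eligible = samples.filter(hasStableRating);
console.log(eligible.map((s) => s.playerId));
```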

Q4: Fatigue & Tournament Format (BO3 vs BO5)

Answer: Grand Slams use BO5 (men); all other ATP events use BO3. BO5 amplifies the favorite: with independent sets, a set-win probability of p = 0.66 gives about 0.73 in BO3 and about 0.78 in BO5. Fatigue score already computed: minutes_7d + 0.5 × minutes_8to14d + travel_zones × 100.

Implementation: Scanner detects Grand Slam tickers and uses bestOf=5. Fatigue data in tennis_fatigue table (8 active players tracked).
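
Both pieces of Q4 can be illustrated in a few lines. The sketch below assumes a constant per-set win probability with independent sets, which is simpler than the full Markov chain; `matchWinProb`, `FatigueInput`, and `fatigueScore` are hypothetical names, while the fatigue weights are the ones stated above.

```typescript
// Probability of winning a best-of-3 or best-of-5 match, given a constant
// per-set win probability p and independent sets.
function matchWinProb(p: number, bestOf: 3 | 5): number {
  const q = 1 - p;
  if (bestOf === 3) {
    // Win 2-0, or 2-1 with the lost set in either of the first two slots.
    return p * p * (1 + 2 * q);
  }
  // Win 3-0, 3-1 (3 orderings), or 3-2 (6 orderings).
  return p * p * p * (1 + 3 * q + 6 * q * q);
}

// Hypothetical input shape mirroring the fatigue formula's terms.
interface FatigueInput {
  minutes7d: number;
  minutes8to14d: number;
  travelZones: number;
}

// minutes_7d + 0.5 * minutes_8to14d + travel_zones * 100
function fatigueScore(f: FatigueInput): number {
  return f.minutes7d + 0.5 * f.minutes8to14d + f.travelZones * 100;
}

console.log(matchWinProb(0.66, 3).toFixed(3), matchWinProb(0.66, 5).toFixed(3));
console.log(fatigueScore({ minutes7d: 300, minutes8to14d: 200, travelZones: 2 }));
```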

Q5: Total Games Distribution

Answer: Markov chain produces exact games-per-set distribution, then Monte Carlo convolution (20K sims) across match set count. No assumed distribution — derived from model.

Typical BO3 range: 18-39 games, from two 6-3 sets (18) up to three tiebreak sets (39). Peaks at ~24 games.
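
A simplified sketch of how a total-games distribution can be derived from the model rather than assumed. Unlike the pipeline described above, it simulates game by game instead of convolving an exact per-set Markov distribution, and it treats the tiebreak as a single game won with a fixed probability. All names (`mulberry32`, `simulateSet`, `totalGamesDistribution`) and the tiebreak default are assumptions for illustration.

```typescript
// Small seeded PRNG (mulberry32) so that runs are reproducible.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Simulate one tiebreak set. The tiebreak at 6-6 counts as a 13th game.
function simulateSet(
  holdA: number,
  holdB: number,
  aServesFirst: boolean,
  rand: () => number,
  pTiebreakA = 0.5, // assumed; the real model prices tiebreaks separately
): { games: number; aWon: boolean } {
  let a = 0;
  let b = 0;
  let aServing = aServesFirst;
  for (;;) {
    if (a === 6 && b === 6) {
      return { games: 13, aWon: rand() < pTiebreakA };
    }
    const serverHolds = rand() < (aServing ? holdA : holdB);
    // A wins the game if A served and held, or B served and was broken.
    if (serverHolds === aServing) a++; else b++;
    aServing = !aServing;
    if ((a >= 6 || b >= 6) && Math.abs(a - b) >= 2) {
      return { games: a + b, aWon: a > b };
    }
  }
}

// Empirical distribution of total games in a match: games -> probability.
function totalGamesDistribution(
  holdA: number,
  holdB: number,
  bestOf: 3 | 5,
  sims = 20000,
  seed = 1,
): Map<number, number> {
  const rand = mulberry32(seed);
  const setsToWin = (bestOf + 1) / 2;
  const counts = new Map<number, number>();
  for (let i = 0; i < sims; i++) {
    let setsA = 0;
    let setsB = 0;
    let games = 0;
    let aServesFirst = rand() < 0.5;
    while (setsA < setsToWin && setsB < setsToWin) {
      const s = simulateSet(holdA, holdB, aServesFirst, rand);
      games += s.games;
      if (s.aWon) setsA++; else setsB++;
      // Serve alternates across sets: an odd game count flips the opener.
      if (s.games % 2 === 1) aServesFirst = !aServesFirst;
    }
    counts.set(games, (counts.get(games) || 0) + 1);
  }
  const dist = new Map<number, number>();
  counts.forEach((c, g) => dist.set(g, c / sims));
  return dist;
}

const dist = totalGamesDistribution(0.8, 0.75, 3, 2000, 42);
console.log(Array.from(dist.entries()).sort((x, y) => x[0] - y[0]));
```

With every tiebreak counted as one game, a BO3 match in this sketch can span 12 to 39 games.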

Q6: Head-to-Head Records

Answer: Min 3 meetings (from council ruling). Time-decayed (4yr ago = 0.1× recent). 354 H2H pairs in database, avg 3.4 meetings.

Current gap: H2H adjustment is computed in tier2 metrics but NOT yet applied in the Markov pricer. Future enhancement.
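
One way to realize the time decay and the minimum-meetings rule is sketched below. The ruling fixes only the 0.1× weight at 4 years; the exponential shape between those points, the `Meeting` type, and both function names are assumptions.

```typescript
const H2H_MIN_MEETINGS = 3;
const H2H_WEIGHT_AT_4_YEARS = 0.1;

// Exponential decay calibrated so a meeting 4 years ago weighs 0.1x a
// meeting played today (assumed functional form).
function h2hWeight(yearsAgo: number): number {
  return Math.pow(H2H_WEIGHT_AT_4_YEARS, yearsAgo / 4);
}

// Hypothetical record of one past meeting between players A and B.
interface Meeting {
  yearsAgo: number;
  aWon: boolean;
}

// Time-decayed share of meetings won by A, or null when there are fewer
// than the minimum number of meetings.
function decayedH2hWinRate(meetings: Meeting[]): number | null {
  if (meetings.length < H2H_MIN_MEETINGS) return null;
  let weightedWins = 0;
  let totalWeight = 0;
  for (const m of meetings) {
    const w = h2hWeight(m.yearsAgo);
    totalWeight += w;
    if (m.aWon) weightedWins += w;
  }
  return weightedWins / totalWeight;
}

console.log(
  decayedH2hWinRate([
    { yearsAgo: 0.5, aWon: true },
    { yearsAgo: 2, aWon: false },
    { yearsAgo: 4, aWon: false },
  ]),
);
```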

Q7: Ranking → Win Probability

Answer: We do NOT use raw Elo-to-probability logistic conversion. Instead: SA-Elo informs player quality assessment, but the actual pricing comes from SPW/RPW through the Markov chain. This is more granular than Elo → logistic because it captures serve vs return balance.

Q8: Recent Form vs Career

Answer: SA-Elo blend: 0.5 × Overall + 0.3 × Surface + 0.2 × RecentForm. SPW/RPW in tennis_custom_metrics uses rolling averages from the last 3 years of Sackmann data.

Implementation gap: EWMA not yet applied to SPW/RPW (see Q11). Currently uses simple averages.
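
The blend itself is a fixed weighted sum. A minimal sketch, with a hypothetical `EloComponents` shape and function name; only the three weights come from the ruling:

```typescript
// Hypothetical container for the three Elo components being blended.
interface EloComponents {
  overall: number;
  surface: number;
  recentForm: number;
}

// Blend weights from the ruling: 0.5 overall, 0.3 surface, 0.2 recent form.
const SA_ELO_WEIGHTS: EloComponents = {
  overall: 0.5,
  surface: 0.3,
  recentForm: 0.2,
};

function saElo(e: EloComponents): number {
  return (
    SA_ELO_WEIGHTS.overall * e.overall +
    SA_ELO_WEIGHTS.surface * e.surface +
    SA_ELO_WEIGHTS.recentForm * e.recentForm
  );
}

console.log(saElo({ overall: 2000, surface: 1900, recentForm: 2100 }));
```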

Q9: Retirement Risk

Answer: Logistic model computed in tier2 metrics from age, matches_14d, medical timeouts (MTOs), heat_index, and surface. Stored in tennis_custom_metrics.retirement_risk.

Implementation gap: Not yet used as edge discount in the scanner. Should reduce Kelly fraction when retirement_risk > 0.05.
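
The ruling does not say how large the discount should be, so the sketch below picks one simple possibility purely for illustration: above the 0.05 threshold, scale the Kelly fraction by the probability that the match completes. The scaling rule and the function name are hypothetical.

```typescript
// Threshold from the ruling; below or at it, no discount is applied.
const RETIREMENT_RISK_THRESHOLD = 0.05;

// Hypothetical discount rule: multiply the Kelly fraction by
// (1 - retirementRisk) once the risk exceeds the threshold.
function discountedKellyFraction(kellyFraction: number, retirementRisk: number): number {
  if (retirementRisk <= RETIREMENT_RISK_THRESHOLD) return kellyFraction;
  return kellyFraction * (1 - retirementRisk);
}

console.log(discountedKellyFraction(1 / 8, 0.03)); // below threshold, unchanged
console.log(discountedKellyFraction(1 / 8, 0.2)); // above threshold, reduced
```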

Q10: Surface Transition Effects

Answer: Clay → Grass transition is the most severe (1-2 weeks adjustment period). Players who just finished clay season have depressed grass stats.

Not implemented. Would need a transition penalty factor. Low priority — affects small number of matches per year.

Q11: EWMA Alpha for Serve/Return

Answer: Recommended alpha = 0.15, which concentrates the weight on roughly the last 6-7 matches (1/alpha ≈ 6.7). Not currently implemented; simple averages are used instead.

Implementation gap: Should weight recent matches more heavily. Future enhancement to tennis-tier2-metrics.ts.
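
A minimal sketch of the proposed weighting, assuming per-match SPW (or RPW) values are supplied oldest first. The function name and the choice to seed the average with the first observation are assumptions; nothing like this exists in tennis-tier2-metrics.ts yet.

```typescript
// Exponentially weighted moving average over values ordered oldest first.
// Each new observation gets weight alpha; the running average keeps 1 - alpha.
function ewma(values: number[], alpha = 0.15): number {
  if (values.length === 0) {
    throw new Error("ewma requires at least one value");
  }
  let avg = values[0]; // seed with the oldest observation
  for (let i = 1; i < values.length; i++) {
    avg = alpha * values[i] + (1 - alpha) * avg;
  }
  return avg;
}

// Hypothetical per-match SPW values, oldest first.
const recentSpw = [0.58, 0.61, 0.6, 0.66, 0.64];
console.log(ewma(recentSpw).toFixed(4));
```

Compared with a simple average, a recent run of strong or weak serving moves the estimate sooner, at the cost of more match-to-match noise.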

Q12: Specific Numbers Summary

| Parameter | Value | Source |
|-----------|-------|--------|
| SA-Elo blend | 0.5 overall + 0.3 surface + 0.2 recent | Council ruling |
| Hard SPW avg | 0.590 | Data (478 players) |
| Clay SPW avg | 0.579 | Data (383 players) |
| Grass SPW avg | 0.622 | Data (238 players) |
| pServe clamp | [0.45, 0.78] | Calibrated |
| Min matches Hard | 15 | Schedule density |
| Min matches Clay | 10 | Schedule density |
| Min matches Grass | 5 | Schedule density |
| H2H min meetings | 3 | Council ruling |
| H2H time decay | 0.1× at 4 years | Council ruling |
| Min edge | 4 cents after 7% fee | Strategy spec |
| Kalshi fee | 7% on profit (~0.93× edge) | Kalshi platform |
| Kelly fraction | 1/8 match, 1/10 sets, 1/12 exact | Strategy spec |
| Tiebreak MC sims | 10,000 | Calibrated |
| Distribution MC sims | 20,000 | Calibrated |
| Outright MC sims | 100,000 | Calibrated |
| Elo K-factor | Dynamic (higher for young players) | Tier1 implementation |
| EWMA alpha | 0.15 (recommended, not yet implemented) | Literature |
| Retirement risk threshold | 0.05 (not yet applied) | Recommended |
| Integrity kill-switch | 40c Pinnacle move on Challenger/ITF | Council ruling |
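
To show how the edge, fee, and Kelly rows might combine, here is a hedged sketch. It uses the table's own approximation that the fee leaves about 0.93× of the raw edge, and the textbook Kelly fraction for a binary contract bought at price c, (p − c) / (1 − c), ignoring fees. The function names, the `Market` type, and the order in which the checks are applied are assumptions, not the scanner's actual logic.

```typescript
type Market = "match" | "sets" | "exact";

const FEE_RATE = 0.07; // fee on profit, per the table
const MIN_NET_EDGE = 0.04; // 4 cents per $1 contract, after fee
const KELLY_MULTIPLIER: Record<Market, number> = {
  match: 1 / 8,
  sets: 1 / 10,
  exact: 1 / 12,
};

// Edge after fee, using the table's ~0.93x approximation.
function netEdge(modelProb: number, price: number): number {
  return (modelProb - price) * (1 - FEE_RATE);
}

// Fraction of bankroll to stake on a YES contract priced in dollars
// (0 < price < 1). Returns 0 when the net edge is below the minimum.
function stakeFraction(modelProb: number, price: number, market: Market): number {
  if (netEdge(modelProb, price) < MIN_NET_EDGE) return 0;
  const fullKelly = (modelProb - price) / (1 - price);
  return fullKelly * KELLY_MULTIPLIER[market];
}

console.log(stakeFraction(0.6, 0.5, "match"));
console.log(stakeFraction(0.54, 0.5, "match"));
```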

Implementation Gaps (Ordered by Impact)

  1. EWMA for SPW/RPW — Low effort, medium impact. Add to tier2 recompute.
  2. H2H adjustment in Markov pricer — Medium effort, medium impact.
  3. Retirement risk discount — Low effort, low impact (rare events).
  4. Surface transition penalty — Low effort, very low impact.

Source: ~/edgeclaw/results/panel-results/tennis-calibration-ruling.md