Tennis Model Anchor Calibration — Implementation Ruling

Date: 2026-04-03 | Status: IMPLEMENTED

Q1: Predictive Player Metrics

Answer: SA-Elo (Surface-Adjusted Elo) is the primary strength signal; SPW/RPW (Serve/Return Points Won) is the atomic pricing input for the Markov chain. Hold probability is derived from SPW via the O'Malley formula. Log5 adjusts SPW for opponent quality.

Implementation: tennis_custom_metrics table stores per-player per-surface: avg_spw, avg_rpw, hold_prob, elo_on_surface, clutch_index. Markov chain uses SPW/RPW through Log5 → gameWinProb → setMarkov → matchProb.
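As an illustration of the SPW → hold probability step, here is a minimal sketch of the closed-form game-win probability for a standard deuce game, the form commonly attributed to O'Malley. It assumes points on serve are independent with a constant win probability; the function name holdProbFromSpw is hypothetical and not taken from tennis-markov.ts.

```typescript
// Probability that the server wins a service game, given p = probability of
// winning a single point on serve. Assumes independent, identically
// distributed points (an assumption of this sketch).
function holdProbFromSpw(p: number): number {
  if (p <= 0) return 0;
  if (p >= 1) return 1;
  const q = 1 - p;
  // Server wins to love, to 15, or to 30: p^4 * (1 + 4q + 10q^2)
  const beforeDeuce = Math.pow(p, 4) * (1 + 4 * q + 10 * q * q);
  // Reach deuce (3-3 in points) with probability 20 p^3 q^3,
  // then win from deuce with probability p^2 / (1 - 2pq)
  const viaDeuce = (20 * Math.pow(p * q, 3) * (p * p)) / (1 - 2 * p * q);
  return beforeDeuce + viaDeuce;
}

// Example: a server winning 65% of service points
const exampleHold = holdProbFromSpw(0.65);
```

By symmetry the function returns 0.5 at p = 0.5, which is a quick sanity check on any reimplementation.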

Q2: Surface Adjustment

Answer: Per-surface tour averages derived from data:

Surface	Tour Avg SPW	Tour Avg RPW	Tour Avg Hold
Hard	0.590	0.387	0.706
Clay	0.579	0.411	0.683
Grass	0.622	0.380	0.766

Log5 denominator uses surface-specific RPW average: pAServe = SPW_A × (1 - RPW_B) / (1 - surfaceAvgRPW).

Implementation: SURFACE_AVG_SPW and SURFACE_AVG_RPW constants in tennis-markov.ts. All players rated per-surface in tennis_custom_metrics.
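The opponent adjustment above can be sketched as follows. The formula and the surface averages are taken from this section; applying the pServe clamp [0.45, 0.78] from the Q12 summary at this point is an assumption of the sketch, as are the function name adjustedServeProb and the constant name P_SERVE_CLAMP.

```typescript
type Surface = "Hard" | "Clay" | "Grass";

// Tour-average return points won per surface (values from the table above)
const SURFACE_AVG_RPW: Record<Surface, number> = { Hard: 0.387, Clay: 0.411, Grass: 0.380 };

// Clamp bounds from the Q12 summary; where the clamp is applied is assumed here
const P_SERVE_CLAMP: [number, number] = [0.45, 0.78];

// pAServe = SPW_A * (1 - RPW_B) / (1 - surfaceAvgRPW), then clamped
function adjustedServeProb(spwA: number, rpwB: number, surface: Surface): number {
  const raw = (spwA * (1 - rpwB)) / (1 - SURFACE_AVG_RPW[surface]);
  const [lo, hi] = P_SERVE_CLAMP;
  return Math.min(hi, Math.max(lo, raw));
}

// Example: a 65% server facing an above-average hard-court returner (RPW 0.40)
const pAServeExample = adjustedServeProb(0.65, 0.40, "Hard");
```

Against a returner whose RPW equals the surface average, the adjustment leaves SPW_A unchanged, which is the property the surface-specific denominator is meant to preserve.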

Q3: Rating Stabilization (Minimum Matches)

Answer: Surface-dependent minimums based on schedule density:

Surface	Min Matches	Rationale
Hard	15	Year-round, highest volume
Clay	10	Concentrated in spring/summer
Grass	5	Very short season (4-6 weeks)

Implementation: SURFACE_MIN_MATCHES in tennis-markov.ts. Scanner skips players below threshold.
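A minimal sketch of the threshold gate, assuming "below threshold" means strictly fewer matches than the minimum. The row shape PlayerSurfaceSample and the function eligiblePlayers are hypothetical; only the constant name and values come from this section.

```typescript
type Surface = "Hard" | "Clay" | "Grass";

// Minimum matches per surface (values from the table above)
const SURFACE_MIN_MATCHES: Record<Surface, number> = { Hard: 15, Clay: 10, Grass: 5 };

// Hypothetical row shape, for illustration only
interface PlayerSurfaceSample {
  player: string;
  surface: Surface;
  matches: number;
}

// Keep only players whose sample on that surface meets the minimum
function eligiblePlayers(rows: PlayerSurfaceSample[]): PlayerSurfaceSample[] {
  return rows.filter((r) => r.matches >= SURFACE_MIN_MATCHES[r.surface]);
}

const eligibleExample = eligiblePlayers([
  { player: "A", surface: "Hard", matches: 14 }, // skipped: below 15
  { player: "B", surface: "Grass", matches: 5 }, // kept: meets 5
]);
```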

Q4: Fatigue & Tournament Format (BO3 vs BO5)

Answer: Grand Slams use BO5 (men); all other ATP events use BO3. BO5 amplifies the favorite: with independent sets, a set win probability of p=0.66 gives about 0.73 in BO3 and about 0.78 in BO5. Fatigue score already computed: minutes_7d + 0.5 × minutes_8to14d + travel_zones × 100.

Implementation: Scanner detects Grand Slam tickers and uses bestOf=5. Fatigue data in tennis_fatigue table (8 active players tracked).
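The format effect and the fatigue score can be sketched as below. The match formula assumes sets are independent with a constant set win probability, which is a simplification and may differ from what the Markov pricer does internally. The function names are hypothetical; the fatigue weights are the ones stated in this section.

```typescript
// Probability of winning a match given a constant per-set win probability p,
// assuming independent sets (an assumption of this sketch).
function matchProbFromSetProb(p: number, bestOf: 3 | 5): number {
  const q = 1 - p;
  if (bestOf === 3) {
    // Win 2-0 or 2-1: p^2 * (1 + 2q)
    return p * p * (1 + 2 * q);
  }
  // Win 3-0, 3-1, or 3-2: p^3 * (1 + 3q + 6q^2)
  return p * p * p * (1 + 3 * q + 6 * q * q);
}

// Fatigue score as stated: minutes_7d + 0.5 * minutes_8to14d + travel_zones * 100
function fatigueScore(minutes7d: number, minutes8to14d: number, travelZones: number): number {
  return minutes7d + 0.5 * minutes8to14d + travelZones * 100;
}

const bo3Example = matchProbFromSetProb(0.66, 3);
const bo5Example = matchProbFromSetProb(0.66, 5);
const fatigueExample = fatigueScore(300, 200, 2);
```

For any p above 0.5 the BO5 value exceeds the BO3 value, which is the amplification the scanner relies on when it switches bestOf for Grand Slam tickers.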

Q5: Total Games Distribution

Answer: Markov chain produces exact games-per-set distribution, then Monte Carlo convolution (20K sims) across match set count. No assumed distribution — derived from model.

Typical BO3 range: 18 games (two 6-3 sets) to 39 games (three 7-6 sets). Peaks at ~24 games.
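A simplified sketch of the two-stage approach: an exact set-score distribution from per-game hold probabilities, followed by Monte Carlo over sets. It is not the production pricer. Assumptions of the sketch: player A serves first in every set, games and sets are independent, and the tiebreak is reduced to a single win probability tbA. All names are hypothetical, and the small LCG is included only to make the run reproducible.

```typescript
// One possible final set score and its probability
interface SetOutcome { gamesA: number; gamesB: number; prob: number; }

// Exact distribution over final set scores (6-0 ... 7-6 and mirrors)
function setScoreDistribution(holdA: number, holdB: number, tbA: number): SetOutcome[] {
  const outcomes: SetOutcome[] = [];
  // reach[a][b] = probability of passing through the unfinished score a-b
  const reach: number[][] = [];
  for (let a = 0; a <= 6; a++) reach.push([0, 0, 0, 0, 0, 0, 0]);
  reach[0][0] = 1;
  for (let total = 0; total <= 12; total++) {
    for (let a = Math.max(0, total - 6); a <= Math.min(total, 6); a++) {
      const b = total - a;
      const pr = reach[a][b];
      if (pr === 0) continue;
      if (a === 6 && b === 6) {
        // Tiebreak decides the set at 6-6
        outcomes.push({ gamesA: 7, gamesB: 6, prob: pr * tbA });
        outcomes.push({ gamesA: 6, gamesB: 7, prob: pr * (1 - tbA) });
        continue;
      }
      // A serves the even-numbered games; otherwise A has to break B
      const pAWinsGame = total % 2 === 0 ? holdA : 1 - holdB;
      const next: [number, number, number][] = [
        [a + 1, b, pr * pAWinsGame],
        [a, b + 1, pr * (1 - pAWinsGame)],
      ];
      for (const [na, nb, p] of next) {
        const hi = Math.max(na, nb);
        const over = hi === 7 || (hi === 6 && Math.abs(na - nb) >= 2);
        if (over) outcomes.push({ gamesA: na, gamesB: nb, prob: p });
        else reach[na][nb] += p;
      }
    }
  }
  return outcomes;
}

// Simple LCG for reproducibility only; not a high-quality generator
function makeRng(seed: number): () => number {
  let s = seed >>> 0;
  return () => {
    s = (s * 1664525 + 1013904223) % 4294967296;
    return s / 4294967296;
  };
}

// Monte Carlo over sets: freq[g] = share of simulated matches with g total games
function totalGamesDistribution(sets: SetOutcome[], bestOf: 3 | 5, sims: number, seed: number): number[] {
  const rand = makeRng(seed);
  const setsToWin = (bestOf + 1) / 2;
  const freq: number[] = [];
  for (let g = 0; g <= 13 * bestOf; g++) freq.push(0);
  for (let i = 0; i < sims; i++) {
    let wonA = 0;
    let wonB = 0;
    let games = 0;
    while (wonA < setsToWin && wonB < setsToWin) {
      // Inverse-CDF draw of one set outcome
      let u = rand();
      let pick = sets[sets.length - 1]; // fallback guards against rounding in the running sum
      for (const s of sets) {
        if (u < s.prob) { pick = s; break; }
        u -= s.prob;
      }
      games += pick.gamesA + pick.gamesB;
      if (pick.gamesA > pick.gamesB) wonA++;
      else wonB++;
    }
    freq[games] += 1 / sims;
  }
  return freq;
}

const setDistExample = setScoreDistribution(0.8, 0.75, 0.5);
const gamesDistExample = totalGamesDistribution(setDistExample, 3, 20000, 42);
```

The returned array is indexed by total games, so for BO3 it has nonzero mass only between 12 and 39. Totals markets can be priced by summing the entries above or below a line.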

Q6: Head-to-Head Records

Answer: Min 3 meetings (from council ruling). Time-decayed: a meeting 4 years old carries 0.1× the weight of a recent one. 354 H2H pairs in database, avg 3.4 meetings.

Current gap: H2H adjustment is computed in tier2 metrics but NOT yet applied in the Markov pricer. Future enhancement.
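One way to realize the stated decay is an exponential weight that equals 1 for a meeting today and 0.1 at 4 years. The exponential shape is an assumption of this sketch, since the ruling only fixes the 4-year point, and the names Meeting, h2hWeight, and decayedH2hShare are hypothetical.

```typescript
// Hypothetical record of one past meeting between players A and B
interface Meeting {
  yearsAgo: number;
  wonByA: boolean;
}

const H2H_MIN_MEETINGS = 3;

// Exponential decay pinned to weight 0.1 at 4 years (assumed functional form)
function h2hWeight(yearsAgo: number): number {
  return Math.pow(0.1, yearsAgo / 4);
}

// Time-decayed share of meetings won by A, or null below the minimum sample
function decayedH2hShare(meetings: Meeting[]): number | null {
  if (meetings.length < H2H_MIN_MEETINGS) return null;
  let weightA = 0;
  let weightTotal = 0;
  for (const m of meetings) {
    const w = h2hWeight(m.yearsAgo);
    weightTotal += w;
    if (m.wonByA) weightA += w;
  }
  return weightA / weightTotal;
}

// A won the recent meeting, B won two meetings four years ago
const h2hExample = decayedH2hShare([
  { yearsAgo: 0, wonByA: true },
  { yearsAgo: 4, wonByA: false },
  { yearsAgo: 4, wonByA: false },
]);
```

In the example the raw record is 1-2 against A, but the decayed share favors A because the single recent win outweighs the two old losses.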

Q7: Ranking → Win Probability

Answer: We do NOT use raw Elo-to-probability logistic conversion. Instead: SA-Elo informs player quality assessment, but the actual pricing comes from SPW/RPW through the Markov chain. This is more granular than Elo → logistic because it captures serve vs return balance.

Q8: Recent Form vs Career

Answer: SA-Elo blend: 0.5 × Overall + 0.3 × Surface + 0.2 × RecentForm. SPW/RPW in tennis_custom_metrics uses rolling averages from the last 3 years of Sackmann data.
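The blend is a fixed-weight average of three ratings. A minimal sketch, with the weights from this section and hypothetical names (SA_ELO_WEIGHTS, saElo):

```typescript
// Blend weights from the ruling; they sum to 1
const SA_ELO_WEIGHTS = { overall: 0.5, surface: 0.3, recentForm: 0.2 };

// SA-Elo = 0.5 * Overall + 0.3 * Surface + 0.2 * RecentForm
function saElo(overall: number, surface: number, recentForm: number): number {
  return (
    SA_ELO_WEIGHTS.overall * overall +
    SA_ELO_WEIGHTS.surface * surface +
    SA_ELO_WEIGHTS.recentForm * recentForm
  );
}

// Example: strong surface rating, weaker recent form
const saEloExample = saElo(2000, 2100, 1900);
```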

Implementation gap: EWMA not yet applied to SPW/RPW (see Q11). Currently uses simple averages.

Q9: Retirement Risk

Answer: Logistic model computed in tier2 metrics (age, matches_14d, MTOs (medical timeouts), heat_index, surface). Stored in tennis_custom_metrics.retirement_risk.

Implementation gap: Not yet used as an edge discount in the scanner. Should reduce Kelly fraction when retirement_risk > 0.05.
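The ruling fixes the 0.05 threshold but not the size of the reduction. The sketch below uses a purely hypothetical rule, scaling the Kelly fraction by (1 - retirement_risk) once the threshold is exceeded, to show where such a discount could sit. The function and constant names are likewise assumptions.

```typescript
// Threshold from the ruling
const RETIREMENT_RISK_THRESHOLD = 0.05;

// Hypothetical discount rule: above the threshold, scale the Kelly
// fraction by (1 - risk); at or below it, leave the fraction unchanged.
function discountedKellyFraction(baseFraction: number, retirementRisk: number): number {
  if (retirementRisk <= RETIREMENT_RISK_THRESHOLD) return baseFraction;
  return baseFraction * (1 - retirementRisk);
}

// Match-market base fraction of 1/8 (from the Q12 summary)
const kellyLowRisk = discountedKellyFraction(1 / 8, 0.03);
const kellyHighRisk = discountedKellyFraction(1 / 8, 0.2);
```

Any monotone discount would fit the same slot; the linear form is chosen here only because it is the simplest to reason about.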

Q10: Surface Transition Effects

Answer: Clay → Grass transition is the most severe (1-2 weeks adjustment period). Players who just finished clay season have depressed grass stats.

Not implemented. Would need a transition penalty factor. Low priority: affects a small number of matches per year.

Q11: EWMA Alpha for Serve/Return

Answer: Recommended alpha = 0.15 (mean lag of about 1/alpha ≈ 6.7 matches; the last 6-7 matches carry roughly two-thirds of the total weight). Not currently implemented; simple averages are used instead.

Implementation gap: Should weight recent matches more heavily. Future enhancement to tennis-tier2-metrics.ts.
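A minimal sketch of the recommended smoothing, assuming per-match SPW (or RPW) values are supplied oldest first and the first value seeds the average. The function name ewma is hypothetical and is not an existing export of tennis-tier2-metrics.ts.

```typescript
// Exponentially weighted moving average over per-match values, oldest first.
// s_0 = x_0; s_t = alpha * x_t + (1 - alpha) * s_{t-1}
function ewma(values: number[], alpha: number = 0.15): number {
  if (values.length === 0) {
    throw new Error("ewma requires at least one value");
  }
  let smoothed = values[0];
  for (let i = 1; i < values.length; i++) {
    smoothed = alpha * values[i] + (1 - alpha) * smoothed;
  }
  return smoothed;
}

// Example: a player whose serve numbers improved in the latest match
const ewmaSpwExample = ewma([0.60, 0.70]);
```

Seeding with the first observation keeps short histories from being pulled toward zero; seeding with the surface tour average instead would be an equally reasonable choice.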

Q12: Specific Numbers Summary

Parameter	Value	Source
SA-Elo blend	0.5 overall + 0.3 surface + 0.2 recent	Council ruling
Hard SPW avg	0.590	Data (478 players)
Clay SPW avg	0.579	Data (383 players)
Grass SPW avg	0.622	Data (238 players)
pServe clamp	[0.45, 0.78]	Calibrated
Min matches Hard	15	Schedule density
Min matches Clay	10	Schedule density
Min matches Grass	5	Schedule density
H2H min meetings	3	Council ruling
H2H time decay	0.1× at 4 years	Council ruling
Min edge	4 cents after 7% fee	Strategy spec
Kalshi fee	7% on profit (~0.93× edge)	Kalshi platform
Kelly fraction	1/8 match, 1/10 sets, 1/12 exact	Strategy spec
Tiebreak MC sims	10,000	Calibrated
Distribution MC sims	20,000	Calibrated
Outright MC sims	100,000	Calibrated
Elo K-factor	Dynamic (higher for young players)	Tier1 implementation
EWMA alpha	0.15 (recommended, not yet implemented)	Literature
Retirement risk threshold	0.05 (not yet applied)	Recommended
Integrity kill-switch	40c Pinnacle move on Challenger/ITF	Council ruling

Implementation Gaps (Ordered by Impact)

1. EWMA for SPW/RPW: Low effort, medium impact. Add to tier2 recompute.
2. H2H adjustment in Markov pricer: Medium effort, medium impact.
3. Retirement risk discount: Low effort, low impact (rare events).
4. Surface transition penalty: Low effort, very low impact.

Source: ~/edgeclaw/results/panel-results/tennis-calibration-ruling.md