Golf Research Pipeline — Council Ruling

Date: 2026-04-01
Process: Full 5-phase council (Advisory → Anonymization → Peer Review → Chairman Synthesis → Boss Ruling)
Advisors: Opus, Sonnet, Gemini 3.1 Pro, Grok 4.20 Reasoning, gpt-oss-120b
Winner: Opus (2 of 5 peer review votes; most split council, all others self-voted)
Status: PENDING BOSS RULING on open questions


COUNCIL SUMMARY

Where Advisors Agreed

  1. Strokes Gained is THE core framework — SG:OTT, SG:APP, SG:ARG, SG:PUTT decomposition drives everything
  2. SG:Approach is the most stable/predictive component — low variance week-to-week, best single predictor
  3. SG:Putting is the noisiest component — high variance, low predictability, grass type matters hugely
  4. Course fit score from regression coefficients on SG components per venue
  5. Weather wave advantage is the #1 short-term edge — AM vs PM wave can be 2-3 strokes
  6. Monte Carlo tournament simulation (50K-100K iterations) for outright/top-N/make-cut markets
  7. 7 edge scanners required — Outright, H2H, Make Cut, Top 5/10/20, Round Leader, 3-Ball, Hole in One
  8. DataGolf as primary SG data source (comprehensive, covers all tours)
  9. NWS API for weather (US events), separate source for international
  10. Pinnacle de-vig via Power method for N-way outright markets
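
Item 6 can be sketched as a small simulation loop. The version below is a minimal sketch under stated assumptions, not the council's specification: each round score is an independent normal draw centered on the negative of a per-round SG mean, with a per-player sigma, and a top-N-and-ties cut is applied after 36 holes. The function name, the input shape, and the default arguments are hypothetical. Independent draws are the simplification the peer reviews criticize, so this is a baseline to extend rather than a finished model.

```python
import random
from collections import Counter


def simulate_tournament(players, n_sims=100_000, cut_size=65, seed=0):
    """Estimate win and make-cut probabilities by simulation.

    players: {name: (mean_sg_per_round, sigma)} (hypothetical input shape).
    Scores are strokes relative to the field average, so lower is better.
    """
    rng = random.Random(seed)
    names = list(players)
    wins, cuts = Counter(), Counter()
    for _ in range(n_sims):
        # Four independent rounds per player; positive SG lowers the score.
        rounds = {
            n: [rng.gauss(-players[n][0], players[n][1]) for _ in range(4)]
            for n in names
        }
        after_36 = {n: rounds[n][0] + rounds[n][1] for n in names}
        # Top-N and ties: everyone at or below the N-th best 36-hole total.
        cut_line = sorted(after_36.values())[min(cut_size, len(names)) - 1]
        survivors = [n for n in names if after_36[n] <= cut_line]
        for n in survivors:
            cuts[n] += 1
        # Lowest 72-hole total among survivors wins (exact ties are
        # negligible with continuous scores).
        winner = min(survivors, key=lambda n: sum(rounds[n]))
        wins[winner] += 1
    return {
        n: {"win": wins[n] / n_sims, "make_cut": cuts[n] / n_sims}
        for n in names
    }
```

Top-N probabilities would follow the same pattern by ranking the 72-hole totals of the survivors in each iteration.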

Where Advisors Disagreed

  1. SG weighting windows: Different decay factors and lookback windows proposed. Council verdict: Recent 8 rounds (0.85 decay) 30%, Medium 24 rounds (0.92 decay) 35%, Long-term 100 rounds (0.97 decay) 35%.
  2. Simulation count: Range from 50K to 100K. Council verdict: 100K for outrights (150+ player fields need it), 50K acceptable for H2H/3-ball.
  3. Tour-specific modeling: Some treated all tours the same. Council verdict: Separate parameters for PGA Tour, LIV (shotgun starts → no wave advantage), DP World Tour, Korn Ferry (higher variance/sigma), LPGA (separate regressions).
  4. Architecture complexity: One advisor recommended Kafka/Kubernetes/Snowflake. Council verdict: SQLite WAL, simple cron-based pipeline.
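
The weighting verdict in item 1 can be written out as a short calculation. This is a sketch only: the ruling gives the window lengths, decay factors, and blend weights, but not how the decay is applied inside each window, so the per-round exponential weighting below (most recent round has weight 1, the next `decay`, then `decay**2`, and so on) is an assumption. The function names are hypothetical.

```python
# (rounds in window, decay factor, blend weight) from the council verdict.
WINDOWS = [(8, 0.85, 0.30), (24, 0.92, 0.35), (100, 0.97, 0.35)]


def decayed_mean(values, decay):
    """Exponentially weighted mean; values[0] is the most recent round."""
    weights = [decay ** i for i in range(len(values))]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def composite_sg(sg_history):
    """Blend the three windows. sg_history is per-round SG:Total, newest first.

    Players with fewer rounds than a window simply use what they have
    (an assumption; the ruling does not cover short histories).
    """
    if not sg_history:
        raise ValueError("need at least one round of SG data")
    return sum(
        weight * decayed_mean(sg_history[:n_rounds], decay)
        for n_rounds, decay, weight in WINDOWS
    )
```

A player with a constant history gets that constant back, while a recent hot streak lifts the composite well above the plain 100-round average because the 8-round window alone carries 30% of the weight.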

Strongest Arguments (from peer review)

Opus wins (split council, 2 genuine cross-model votes) with the most operationally specific design.

Biggest Blind Spot

No backtesting framework or calibration methodology — All advisors describe how to generate edges but none describe how to validate whether those edges actually make money. No historical backtest specification, expected hit rates, calibration curves, or feedback loops.

What Everyone Missed (from peer reviews)

  1. Hierarchical correlation structure — Players sharing the same wave, same group, same pin positions have correlated outcomes beyond weather. Need: round-level course effect → wave-level weather → group-level micro-conditions → individual noise. Models treating players as independent draws underestimate variance on wave-wide blowouts and overstate confidence on H2H/3-ball edges.
  2. Pin position impact — Daily pin placements change hole difficulty by 0.5-1.0 strokes. Not captured in any SG lookback.
  3. Putting surface type transitions — Bermuda → bentgrass → poa annua transitions cause huge putting variance. Players' putting SG needs grass-type-specific tracking.
  4. Green firmness and course setup changes — Courses play softer after rain, harder as week progresses. Stimpmeter readings and course setup decisions affect all markets.
  5. Alternate field compositions — Monday qualifiers and alternates have different SG profiles than the committed field. Pipeline must handle late additions.
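
The nested structure in item 1 (round → wave → group → individual) implies a specific correlation between any two players' scores, which can be computed directly. The sketch below assumes the four effects are independent and additive; every standard deviation in `SIGMA` is a hypothetical placeholder, not a fitted value, and the function names are illustrative.

```python
import math

# Hypothetical per-round standard deviations in strokes for each level.
SIGMA = {"round": 0.8, "wave": 0.6, "group": 0.3, "player": 2.7}
TOTAL_VAR = sum(s ** 2 for s in SIGMA.values())


def score_correlation(same_wave, same_group):
    """Correlation of two players' scores in the same round."""
    if same_group and not same_wave:
        raise ValueError("a group always plays in a single wave")
    shared = SIGMA["round"] ** 2  # every player shares the round effect
    if same_wave:
        shared += SIGMA["wave"] ** 2
    if same_group:
        shared += SIGMA["group"] ** 2
    return shared / TOTAL_VAR


def score_diff_sd(same_wave, same_group):
    """Std dev of the score difference between two players.

    Var(X - Y) = 2 * Var * (1 - rho): shared effects cancel in the difference.
    """
    rho = score_correlation(same_wave, same_group)
    return math.sqrt(2 * TOTAL_VAR * (1 - rho))
```

Under these assumptions the spread of a head-to-head score difference depends on which levels the two players share, so a single sigma applied to every pairing cannot be right for both same-group and cross-wave matchups.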

BUILD PLAN

Phase 1: Core Golf Data Tables

golf_players: player_id, full_name, tour (PGA/LIV/DPWT/KF/LPGA), nationality, age, status, sg_total, sg_ott, sg_app, sg_arg, sg_putt, owgr_rank, datagolf_rank, sigma (scoring variance), updated_at

golf_player_sg: player_id, date, window (8/24/100 rounds), sg_total, sg_ott, sg_app, sg_arg, sg_putt, rounds_played, ewma_total, course_fit_score (per-tournament)

golf_courses: course_id, name, city, state_country, par, yardage, grass_type (bermuda/bentgrass/poa), altitude_ft, course_dna_ott, course_dna_app, course_dna_arg, course_dna_putt, avg_winning_score, cut_line_history_avg

golf_tournaments: tournament_id, name, course_id, tour, start_date, end_date, purse, field_size, format (stroke/match), cut_rule (top-65/top-70/no-cut), shotgun_start (boolean for LIV)

golf_field_lists: tournament_id, player_id, entry_type (committed/alternate/MQ), wd_status, wd_timestamp, tee_time_r1, wave_r1 (AM/PM), tee_time_r2, wave_r2, made_cut (boolean), final_position

golf_weather: tournament_id, round_number, wave (AM/PM), forecast_time, temperature_f, wind_speed_mph, wind_gust_mph, wind_direction, precipitation_prob, precipitation_type, wave_advantage_strokes, source (NWS/other)

golf_course_history: player_id, course_id, rounds_played, avg_sg_total, best_finish, made_cuts, missed_cuts, wins

golf_round_scores: tournament_id, round_number, player_id, score_to_par, sg_total, sg_ott, sg_app, sg_arg, sg_putt, tee_time, wave, position_after_round
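
As an illustration of how these tables might be created under the council's SQLite WAL verdict, the sketch below builds two of them with Python's standard `sqlite3` module. The column names come from the lists above; the column types, CHECK constraints, primary keys, and the `init_db` helper are assumptions added for the example.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS golf_tournaments (
    tournament_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    course_id     INTEGER,
    tour          TEXT CHECK (tour IN ('PGA', 'LIV', 'DPWT', 'KF', 'LPGA')),
    start_date    TEXT,
    end_date      TEXT,
    purse         REAL,
    field_size    INTEGER,
    format        TEXT CHECK (format IN ('stroke', 'match')),
    cut_rule      TEXT,
    shotgun_start INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE IF NOT EXISTS golf_weather (
    tournament_id          INTEGER NOT NULL
                           REFERENCES golf_tournaments(tournament_id),
    round_number           INTEGER NOT NULL,
    wave                   TEXT NOT NULL CHECK (wave IN ('AM', 'PM')),
    forecast_time          TEXT NOT NULL,
    temperature_f          REAL,
    wind_speed_mph         REAL,
    wind_gust_mph          REAL,
    wind_direction         TEXT,
    precipitation_prob     REAL,
    precipitation_type     TEXT,
    wave_advantage_strokes REAL,
    source                 TEXT,
    PRIMARY KEY (tournament_id, round_number, wave, forecast_time)
);
"""


def init_db(path=":memory:"):
    """Open the database, enable WAL and foreign keys, create the tables."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")  # no effect on in-memory databases
    conn.execute("PRAGMA foreign_keys=ON")
    conn.executescript(SCHEMA)
    return conn
```

Keying `golf_weather` on `forecast_time` keeps every forecast snapshot, so the change between successive forecasts stays queryable.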

Phase 2: Derived Metrics

| Metric | Formula | Purpose |
| --- | --- | --- |
| Course Fit Score | dot(player_SG_vector, course_DNA_vector), z-normalized | Player-course compatibility |
| Composite SG | 0.30 × recent_8rd + 0.35 × medium_24rd + 0.35 × longterm_100rd | Weighted player strength |
| Course History Bonus | 3+ cuts: +0.1, prior T10: +0.15, prior win: +0.20, cap +0.3 | Venue familiarity adjustment |
| Weather Wave Advantage | Wind-adjusted scoring differential AM vs PM from forecast model | Short-term edge signal |
| Player Sigma | Std dev of SG:Total over last 40 rounds | Variance/consistency measure |
| Cut Probability | From MC: P(player in top-N after 36 holes) | Make-cut market input |
| Wind Impact | (wind_speed - 10) × 0.05 SG penalty per 5mph above threshold | Scoring difficulty adjustment |
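
Two of the metrics above, Course Fit Score and Course History Bonus, are simple enough to sketch directly. The code assumes that "z-normalized" means standardizing the raw dot products across the tournament field, and that the three history bonuses add together before the +0.3 cap is applied; neither detail is spelled out in the ruling. Function and argument names are hypothetical.

```python
from statistics import mean, pstdev


def course_fit_scores(field_sg, course_dna):
    """Z-scored dot product of each player's SG vector with the course DNA.

    field_sg: {player: (ott, app, arg, putt)}
    course_dna: (ott, app, arg, putt) weights for the venue
    """
    raw = {
        player: sum(s * w for s, w in zip(sg, course_dna))
        for player, sg in field_sg.items()
    }
    mu, sd = mean(raw.values()), pstdev(raw.values())
    if sd == 0:  # identical raw scores: no player fits better than another
        return {player: 0.0 for player in raw}
    return {player: (value - mu) / sd for player, value in raw.items()}


def course_history_bonus(made_cuts, best_finish, wins):
    """Additive venue bonus, capped at +0.3 (additivity is an assumption)."""
    bonus = 0.0
    if made_cuts >= 3:
        bonus += 0.10
    if best_finish is not None and best_finish <= 10:
        bonus += 0.15
    if wins > 0:
        bonus += 0.20
    return min(bonus, 0.30)
```

For example, a past winner with several made cuts would earn 0.45 uncapped, so the cap holds the bonus at 0.30.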

Phase 3: Weather Model (Golf's Biggest Edge)

Data ingestion:

Wave advantage calculation:

  1. Get hourly wind forecast for AM tee times (7:00-9:30 AM local) and PM tee times (12:00-2:30 PM local)
  2. Average wind speed per wave window
  3. Apply wind impact formula: strokes penalty = (avg_wind - 10) × 0.05 per 5mph, minimum 0
  4. Wave advantage = PM_penalty - AM_penalty (positive = AM advantage)
  5. Typical range: 0 to 3+ strokes in extreme conditions
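
The five steps above can be sketched as follows. One caveat: the wording "× 0.05 per 5mph" is ambiguous, and this sketch takes the formula literally as 0.05 strokes per mph above the 10 mph threshold; if the intent is 0.05 strokes per 5 mph, `strokes_per_mph` would be 0.01 instead. The hourly-forecast input shape and all function names are hypothetical.

```python
AM_WINDOW = (7.0, 9.5)    # 7:00-9:30 local, as decimal hours
PM_WINDOW = (12.0, 14.5)  # 12:00-2:30 PM local


def window_avg(hourly_wind, window):
    """Average wind speed over forecast hours that fall inside the window."""
    start, end = window
    speeds = [mph for hour, mph in hourly_wind.items() if start <= hour <= end]
    if not speeds:
        raise ValueError("no forecast hours inside the window")
    return sum(speeds) / len(speeds)


def wind_penalty(avg_wind_mph, threshold=10.0, strokes_per_mph=0.05):
    """Stroke penalty for wind above the threshold, floored at zero."""
    return max(0.0, (avg_wind_mph - threshold) * strokes_per_mph)


def wave_advantage(hourly_wind):
    """PM penalty minus AM penalty; positive means the AM wave is favored.

    hourly_wind: {local_hour: wind_speed_mph}
    """
    am = wind_penalty(window_avg(hourly_wind, AM_WINDOW))
    pm = wind_penalty(window_avg(hourly_wind, PM_WINDOW))
    return pm - am
```

With calm morning winds averaging 8 mph and afternoon winds averaging 20 mph, this reading of the formula gives the AM wave a 0.5-stroke advantage.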

Re-pricing triggers:

Phase 4: Monte Carlo Tournament Simulation

Parameters:

Per-sim outputs:

Phase 5: 7 Edge Scanners

Common engine:

  1. Ingest Pinnacle odds
  2. De-vig: Power method for N-way outrights, multiplicative for 2-way markets
  3. Run Monte Carlo simulation
  4. Compare model probabilities to Kalshi prices
  5. Min edge: varies by market (see below)
  6. Output: {tournament_id, market, selection, model_prob, kalshi_price, edge, confidence, wave_flag}
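
Step 2 names two de-vig methods, which can be sketched as below. The power method here solves for the exponent k such that the raw implied probabilities raised to k sum to 1, using bisection; the multiplicative method simply rescales. The `edge_cents` helper assumes edge means model probability (in cents) minus the Kalshi price in cents, which the ruling implies but does not define. All function names are hypothetical.

```python
def implied_probs(decimal_odds):
    """Raw implied probabilities (including the vig) from decimal odds."""
    if any(o <= 1.0 for o in decimal_odds):
        raise ValueError("decimal odds must be greater than 1")
    return [1.0 / o for o in decimal_odds]


def devig_multiplicative(decimal_odds):
    """Rescale raw probabilities so they sum to 1 (2-way markets)."""
    raw = implied_probs(decimal_odds)
    total = sum(raw)
    return [p / total for p in raw]


def devig_power(decimal_odds, tol=1e-12):
    """Find k with sum(p_i ** k) == 1 by bisection (N-way outrights)."""
    raw = implied_probs(decimal_odds)
    if sum(raw) <= 1.0:
        raise ValueError("no overround to remove")
    lo, hi = 1.0, 2.0
    while sum(p ** hi for p in raw) > 1.0:  # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if sum(p ** mid for p in raw) > 1.0:
            lo = mid
        else:
            hi = mid
    k = (lo + hi) / 2.0
    return [p ** k for p in raw]


def edge_cents(model_prob, kalshi_price_cents):
    """Edge in cents per contract, assuming a contract that pays 100 cents."""
    return model_prob * 100.0 - kalshi_price_cents
```

Because k is greater than 1 whenever there is overround, the power method shrinks longshots proportionally more than favorites, which is the usual motivation for preferring it on large outright fields.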

| Scanner | Min Edge | Unique Logic |
| --- | --- | --- |
| Outright (150-way) | 8 cents | Power method de-vig; 100K MC sims; course fit + weather + form composite; Kelly 0.25x |
| Head-to-Head | 4 cents | Direct player comparison; wave-adjusted if different waves; course fit differential; Kelly 0.5x |
| Make the Cut | 4 cents | Consistency (low sigma) matters more than peak SG; course history weight higher; Kelly 0.4x |
| Top 5/10/20 | 5 cents | From MC finish position distribution; course fit + elite SG composite; Kelly 0.35x |
| Round Leader | 5 cents | Single-round simulation; wave weather is DOMINANT input; putting variance highest; Kelly 0.3x |
| 3-Ball Grouping | 4 cents | Mini-outright among 3 players; same-group correlation (shared conditions); Kelly 0.4x |
| Hole in One | 6 cents | Par-3 difficulty × field size × ace probability per player; very high variance; Kelly 0.15x |
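
The minimum-edge and Kelly columns can be combined into a single sizing rule. The sketch below reads "Kelly 0.25x" as a fractional-Kelly multiplier and treats each market as a binary contract that pays $1 and costs `price` dollars, for which full Kelly is (p - price) / (1 - price). It ignores fees and slippage, and the dictionary keys and function names are hypothetical.

```python
# Thresholds and multipliers copied from the scanner table.
MIN_EDGE_CENTS = {
    "outright": 8, "h2h": 4, "make_cut": 4, "top_n": 5,
    "round_leader": 5, "three_ball": 4, "hole_in_one": 6,
}
KELLY_MULT = {
    "outright": 0.25, "h2h": 0.5, "make_cut": 0.4, "top_n": 0.35,
    "round_leader": 0.3, "three_ball": 0.4, "hole_in_one": 0.15,
}


def kelly_fraction(model_prob, price, multiplier):
    """Fractional Kelly stake for a $1-payout contract bought at `price`."""
    if not 0.0 < price < 1.0:
        raise ValueError("price must be strictly between 0 and 1 dollars")
    full_kelly = (model_prob - price) / (1.0 - price)
    return max(0.0, full_kelly) * multiplier


def stake_fraction(scanner, model_prob, price_cents):
    """Bankroll fraction to stake, or 0.0 if the edge is under the minimum."""
    edge = model_prob * 100.0 - price_cents
    if edge < MIN_EDGE_CENTS[scanner]:
        return 0.0
    return kelly_fraction(model_prob, price_cents / 100.0, KELLY_MULT[scanner])
```

For a head-to-head priced at 50 cents with a model probability of 0.60, full Kelly is 0.20 and the 0.5x multiplier halves it to 10% of bankroll; an outright with only a 2-cent edge is filtered out before sizing.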

Phase 6: Matchup Card Format

TOURNAMENT: [Name] | [Course] | [Tour] | [Dates]
COURSE DNA: OTT [wt] | APP [wt] | ARG [wt] | PUTT [wt]
PAR: [val] | YARDAGE: [val] | GRASS: [type] | ALTITUDE: [ft]
CUT RULE: [top-65/70/no-cut] | FORMAT: [stroke/shotgun]

PLAYER: [Name] | Rank: OWGR [#] / DataGolf [#]
  SG Composite: [val] (Recent: [val] | Medium: [val] | Long: [val])
  SG Components: OTT [val] | APP [val] | ARG [val] | PUTT [val]
  Course Fit Score: [val] (z-score)
  Course History: [rounds at venue] | Best: [finish] | Cuts: [X/Y]
  Sigma: [val] (consistency tier: [elite/average/volatile])
  Form (Last 3 Events): [Event: Finish, Event: Finish, Event: Finish]

WAVE ASSIGNMENT: Round 1 [AM/PM] | Round 2 [PM/AM]
  Wave Advantage R1: [+/- strokes] | R2: [+/- strokes]

WEATHER:
  R1 AM: Wind [mph] [dir] | Temp [°F] | Rain [%]
  R1 PM: Wind [mph] [dir] | Temp [°F] | Rain [%]
  R2 AM: Wind [mph] [dir] | Temp [°F] | Rain [%]
  R2 PM: Wind [mph] [dir] | Temp [°F] | Rain [%]
  Weekend Forecast: [summary]

MODEL PROBABILITIES:
  Win: [%] | Top 5: [%] | Top 10: [%] | Top 20: [%] | Make Cut: [%]

STATUS:
  Equipment Changes: [Y/N — details]
  Caddie: [Name] | Change: [Y/N]
  Injury: [details if any]

INTELLIGENCE:
  [CRITICAL/MODERATE/CONTEXT findings]

Phase 7: Dashboard

Phase 8: Kill Switch


OPEN QUESTIONS FOR BOSS RULING

  1. Tour scope: PGA Tour + LIV + DP World Tour + Korn Ferry + LPGA all at launch? Or PGA Tour first?
  2. DataGolf subscription: Required for comprehensive SG data. Worth the cost?
  3. Simulation count: 100K for outrights confirmed? Computationally feasible?
  4. LIV shotgun starts: No wave advantage — different model needed. Worth building separate LIV logic?
  5. Hole in One scanner: Very high variance, novelty market. Build now or defer?
  6. Backtesting harness: Build calibration system before going live?
  7. International weather: NWS only covers US events. What API for DP World Tour/international events?

COUNCIL METADATA

| Detail | Value |
| --- | --- |
| Council date | 2026-04-01 |
| Advisory responses | 5 (all completed) |
| Peer reviews | 5 (all completed) |
| Strongest advisor | Opus (2/5 votes; split council, all others self-voted) |
| Runner-up | Gemini (2/5 votes; 1 genuine from gpt-oss) |
| Biggest blind spot | No backtesting/calibration framework |
| Full council data | /home/ubuntu/edgeclaw/data/councils/2026-04-01/golf-research/ |

Source: ~/edgeclaw/results/panel-results/golf-research-ruling.md