=== COUNCIL RESULTS ===
QUESTION: Design the complete research and intelligence pipeline for the NHL betting desk.
DATE: 2026-04-01
ADVISORS: Opus, Sonnet, Gemini 3.1 Pro, Grok 4.20 Reasoning, gpt-oss-120b
PROCESS: Full council: 5 advisory, anonymized peer review, 5 independent reviews
ANONYMIZATION MAP: A=Gemini, B=Opus, C=Grok, D=gpt-oss, E=Sonnet
Strongest Response:
Biggest Blind Spot:
Opus's multiplicative adjustment framework: 4/5 reviewers endorsed this. final_xGF = base * goalie_factor * injury_factor * fatigue_factor, with a hard floor (60% of baseline xGF) and ceiling (150% of baseline xGA). The worked example and floor/ceiling bounds make it production-ready.
Opus's post-mortem feedback loop — Only advisor to design weekly tracking of source reliability that updates credibility tiers. Beat reporters who are consistently wrong get downgraded.
Opus's dual-scenario GTD handling — Run two parallel models (Goalie A scenario + Goalie B scenario). Only bet when edge exists in BOTH. Most operationally sound.
Sonnet's backup goalie quality tiers — 4-level system with specific SV% thresholds. Critical for the most common NHL research event (B2B backup start).
Sonnet's T-15 minute hard cutoff — No new bet execution within 15 minutes of puck drop when CRITICAL finding fires. Stricter than NBA, appropriate for NHL.
Opus's "time-based credibility decay does NOT apply in NHL" — Subtle but correct. The entire intelligence window is same-day (9 AM to puck drop). Recency is a tiebreaker, not a decay function.
On a 15-game NHL night: 30 teams × 7-8 queries × 3-4 passes = roughly 630-960 web search API calls per day, plus extraction/validation LLM calls. No advisor calculated total API cost or whether queries can complete within pass windows. Need a priority system: focus research spend on games with active Kalshi markets.
Also: Optional morning skate problem. Teams increasingly skip morning skate entirely (especially B2B). The entire intelligence architecture treats morning skate as the primary window, but sometimes there IS no morning skate. Need fallback protocol.
Pipeline assumes a standard sportsbook. Kalshi is a prediction market with:
All 5 advisors built morning skate query extraction but NONE instructed the extraction model to look for "non-contact jersey" or jersey colors. In NHL, a player skating in a non-contact jersey (red/yellow) is 100% OUT. Without this explicit extraction rule, the AI reads "Matthews is skating with the team" and flags him as EXPECTED_IN — missing the critical visual cue.
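The extraction rule described above can be sketched as a simple text check that runs before any "skating with the team" interpretation. This is a minimal illustration, not the pipeline's actual extractor: the status label `OUT_NON_CONTACT`, the function name, and both regular expressions are assumptions made for this example. Jersey colour alone is deliberately not matched here, since a colour mention without the words "non-contact" is ambiguous.

```typescript
// Hypothetical status labels; EXPECTED_IN is the label used in the text above,
// OUT_NON_CONTACT and UNKNOWN are invented for this sketch.
type SkateStatus = "OUT_NON_CONTACT" | "EXPECTED_IN" | "UNKNOWN";

// Matches "non-contact", "non contact", and "noncontact" (case-insensitive).
const NON_CONTACT_PATTERN = /\bnon[-\s]?contact\b/i;

// Assumed phrases that indicate the player took part in the skate.
const SKATING_PATTERN = /\b(?:skating|skated|on the ice)\b/i;

function classifySkateReport(rawText: string): SkateStatus {
  // The non-contact cue is checked first so it overrides
  // any "is skating with the team" wording in the same report.
  if (NON_CONTACT_PATTERN.test(rawText)) {
    return "OUT_NON_CONTACT";
  }
  if (SKATING_PATTERN.test(rawText)) {
    return "EXPECTED_IN";
  }
  return "UNKNOWN";
}

// Usage: the same sentence flips from EXPECTED_IN to OUT once the jersey is mentioned.
const withJersey = classifySkateReport(
  "Matthews is skating with the team in a non-contact jersey"
);
const withoutJersey = classifySkateReport("Matthews is skating with the team");
```

The ordering of the two checks is the point of the sketch: the visual cue has to win over the generic participation wording.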
None built a bidirectional loop where line movement confirms/disconfirms research. When a goalie rumor leaks, the market often moves BEFORE public confirmation. An unexplained 8-cent ML shift should increase confidence in a beat reporter's tweet. Pipeline treats research as upstream truth and market as downstream only.
Entire pipeline is pre-game "set-and-forget." No live research for in-game events (star forward injured at 10-minute mark, goalie pulled early, momentum swings). A modern desk should have a low-latency event-driven in-play layer.
| Rank | Source | Confidence | Notes |
|---|---|---|---|
| 1 | NHL Official API (confirmed starter field) | 100% | Definitive. Overrides everything. |
| 2 | DailyFaceoff "Confirmed" tag | 95% | Rarely wrong once marked. |
| 3 | Beat reporter with visual confirmation at rink | 85% | "I saw X take starter's end" — physical evidence. |
| 4 | Team official PR/Twitter | 80% | Reliable when they post, often silent until game time. |
| 5 | National journalist (Friedman, Seravalli, Dreger) | 75% | Very reliable for trades, less so for daily lineups. |
| 6 | DailyFaceoff "Expected" tag | 70% | Usually right but not guaranteed. |
| 7 | Team beat writer (no visual confirmation) | 65% | "I'm hearing X will start" — less reliable than visual. |
| 8 | Coach presser quote | 60% | Coaches actively mislead about goalies. Cultural norm. |
| 9 | Fan/aggregator accounts | 20% | Context only, never auto-adjust. |
Status escalation:
UNCONFIRMED (default)
→ EXPECTED (DailyFaceoff "Expected" OR 1+ beat reporter says likely)
→ CONFIRMED (DailyFaceoff "Confirmed" OR NHL API OR 2+ independent beat reporters confirm visual)
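As a rough sketch, the escalation rules above reduce to two ordered checks. The `GoalieEvidence` shape and its field names are hypothetical, introduced only for this example. One assumption goes beyond the text: a single reporter with visual confirmation is treated as enough for EXPECTED, since it is at least as strong as "says likely".

```typescript
type GoalieStatus = "UNCONFIRMED" | "EXPECTED" | "CONFIRMED";

// Hypothetical summary of the evidence collected for one goalie on one game day.
interface GoalieEvidence {
  nhlApiConfirmed: boolean; // NHL Official API confirmed-starter field
  dailyFaceoffTag: "Confirmed" | "Expected" | null;
  beatReportersVisual: number; // independent reporters with visual confirmation
  beatReportersLikely: number; // reporters saying the start is likely
}

function escalateStatus(e: GoalieEvidence): GoalieStatus {
  // CONFIRMED: DailyFaceoff "Confirmed" OR NHL API OR 2+ independent visual confirmations.
  if (
    e.nhlApiConfirmed ||
    e.dailyFaceoffTag === "Confirmed" ||
    e.beatReportersVisual >= 2
  ) {
    return "CONFIRMED";
  }
  // EXPECTED: DailyFaceoff "Expected" OR 1+ beat reporter says likely.
  // Assumption: one visual confirmation also counts as "likely".
  if (
    e.dailyFaceoffTag === "Expected" ||
    e.beatReportersLikely >= 1 ||
    e.beatReportersVisual >= 1
  ) {
    return "EXPECTED";
  }
  // Default when nothing has been reported.
  return "UNCONFIRMED";
}
```

Checking the CONFIRMED conditions first means a stronger signal is never masked by a weaker one arriving in the same pass.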
Math layer treatment by status:
Conflict resolution:
Goalie change cascade (full recompute):
xGA_adjusted = xGA_base * (league_avg_sv% / new_goalie_sv%)
Game-time decisions:
Emergency goalie (warmup illness/injury):
| Tier | Criteria | Treatment |
|---|---|---|
| A | 20+ NHL starts this season, .905+ SV% | Minimal delta from starter. Small recompute. |
| B | 5-19 starts, .895-.905 SV% | Modest downgrade. ~0.008 SV% reduction. |
| C | <5 starts or career backup, .880-.894 SV% | Significant downgrade. Full recompute. |
| D | AHL call-up, <5 NHL career games | Emergency. Use .880 baseline + max uncertainty. |
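One way to turn the tier table into code is sketched below. The table leaves some cases open (for example a goalie with 20+ starts but a sub-.905 SV%, or the gap between .894 and .895), so this sketch assumes both the starts and SV% criteria must hold for a tier and that anything else falls through to the next tier down. The "career backup" criterion is not modelled. The `BackupGoalie` shape and function names are hypothetical; the xGA formula is the goalie-change formula from this document.

```typescript
type BackupTier = "A" | "B" | "C" | "D";

// Hypothetical input shape for this sketch.
interface BackupGoalie {
  seasonStarts: number; // NHL starts this season
  savePct: number; // e.g. 0.905
  careerNhlGames: number;
  isAhlCallUp: boolean;
}

function classifyBackup(g: BackupGoalie): BackupTier {
  // Tier D: AHL call-up with fewer than 5 NHL career games.
  if (g.isAhlCallUp && g.careerNhlGames < 5) return "D";
  // Assumption: both criteria must hold; otherwise fall to the next tier.
  if (g.seasonStarts >= 20 && g.savePct >= 0.905) return "A";
  if (g.seasonStarts >= 5 && g.savePct >= 0.895) return "B";
  return "C";
}

// xGA_adjusted = xGA_base * (league_avg_sv% / new_goalie_sv%)
// Tier D uses the .880 baseline from the table instead of the small-sample SV%.
function adjustedXga(
  xgaBase: number,
  leagueAvgSvPct: number,
  goalie: BackupGoalie
): number {
  const svPct = classifyBackup(goalie) === "D" ? 0.88 : goalie.savePct;
  return xgaBase * (leagueAvgSvPct / svPct);
}
```

Routing Tier D through a fixed baseline keeps a one-game sample from driving the recompute; the "max uncertainty" part of the Tier D treatment is not shown here.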
| Step | Model | Purpose | Cost |
|---|---|---|---|
| Web search (all passes) | Grok 4.1 Fast | Fast search, structured extraction | $0.20/$0.50M |
| Structured extraction | Gemini Flash | Parse raw results into IntelAdjustment JSON | Cheap |
| Contradiction detection | DeepSeek R1 | Compare new findings vs existing data | Cheap |
| Plausibility gate | Sonnet 4.6 | Final sanity check before math layer | Per-call |
Morning Skate (10:00-11:30 AM ET), per game:
- "[Team] morning skate [date] goalie starter"
- "[Team] morning skate [date] lines combinations"
- "[Team] injury update [date]"
- "[Star Player] morning skate status [date]" (for any questionable players)
- "[Team] referee crew [date]"
- "DailyFaceoff [Team] starter [date]"

Afternoon (2:00-3:00 PM ET):
- "[Team] goalie confirmed [date]"
- "[Team] lineup update [date]"
- "[Team] recalled AHL [date]" / "[Team] roster move [date]"
- "[Player] doubtful questionable [date]"

Pre-Game (5:30-6:00 PM ET):
- "[Team] starter confirmed warmup [date]"
- "[Team] late scratch [date]"
- "[Player] warmup status [date]"

West Coast (8:00 PM ET): Same as pre-game for Pacific starts.
No Morning Skate Fallback: When team skips morning skate (common on B2B):
Adjustments MULTIPLY (not add):
final_xGF = base_xGF * goalie_opp_factor * injury_factor * fatigue_factor
final_xGA = base_xGA * own_goalie_factor * def_injury_factor * fatigue_factor
Goalie change sets the new baseline — other adjustments apply to the new baseline.
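A minimal sketch of the multiplicative composition follows. The floor and ceiling values are the ones quoted in the council summary (xGF floor at 60% of baseline, xGA ceiling at 150% of baseline); applying them relative to the pre-adjustment baseline is an assumption of this example, as are the function and parameter names.

```typescript
const XGF_FLOOR = 0.6; // final xGF may not drop below 60% of baseline xGF
const XGA_CEILING = 1.5; // final xGA may not rise above 150% of baseline xGA

// final_xGF = base_xGF * goalie_opp_factor * injury_factor * fatigue_factor
function finalXgf(
  baseXgf: number,
  goalieOppFactor: number,
  injuryFactor: number,
  fatigueFactor: number
): number {
  // The goalie factor sets the new baseline; the other factors scale that baseline.
  const goalieBaseline = baseXgf * goalieOppFactor;
  const raw = goalieBaseline * injuryFactor * fatigueFactor;
  return Math.max(raw, XGF_FLOOR * baseXgf);
}

// final_xGA = base_xGA * own_goalie_factor * def_injury_factor * fatigue_factor
function finalXga(
  baseXga: number,
  ownGoalieFactor: number,
  defInjuryFactor: number,
  fatigueFactor: number
): number {
  const goalieBaseline = baseXga * ownGoalieFactor;
  const raw = goalieBaseline * defInjuryFactor * fatigueFactor;
  return Math.min(raw, XGA_CEILING * baseXga);
}

// Usage with illustrative inputs: a weaker opposing goalie, one injured forward,
// and a road back-to-back.
const exampleXgf = finalXgf(3.0, 1.05, 0.94, 0.96);
```

Because the factors multiply, stacked penalties compound rather than sum, which is why the hard bounds matter when several adjustments fire on the same game.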
Skater injury (top-6 F confirmed OUT):
player_xGF_share = player_xGF / team_xGF (from MoneyPuck, 5v5)
xGF_reduction = player_xGF_share * 0.40 (40% lost, 60% redistributed)
injury_factor = 1 - xGF_reduction
Cap: single player max 15%, cumulative max 25%
Defenseman OUT: Same formula but applied to xGA.
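The forward-injury formula and its caps can be sketched as below. Two points are assumptions rather than something the text specifies: the caps are applied to the xGF reduction (after the 40% multiplier), and reductions for multiple injured players are summed before the cumulative cap. The defenseman case is not covered. Function and constant names are hypothetical.

```typescript
const LOST_SHARE = 0.4; // 40% of the player's xGF share is lost, 60% redistributed
const SINGLE_PLAYER_CAP = 0.15; // max reduction attributable to one player
const CUMULATIVE_CAP = 0.25; // max total reduction across all injured players

// playerXgfShares: each entry is player_xGF / team_xGF for a forward confirmed OUT.
function computeInjuryFactor(playerXgfShares: number[]): number {
  let totalReduction = 0;
  for (const share of playerXgfShares) {
    // xGF_reduction = player_xGF_share * 0.40, capped per player.
    totalReduction += Math.min(share * LOST_SHARE, SINGLE_PLAYER_CAP);
  }
  // injury_factor = 1 - xGF_reduction, with the cumulative cap applied.
  return 1 - Math.min(totalReduction, CUMULATIVE_CAP);
}

// Usage: a forward carrying 20% of team xGF gives a reduction of 0.20 * 0.40.
const oneForwardOut = computeInjuryFactor([0.2]);
```

With no injuries the factor is 1, so it can always be passed into the multiplicative chain unchanged.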
Fatigue penalties:
| Scenario | xGF multiplier | xGA multiplier |
|---|---|---|
| Home B2B (no travel) | 0.97 | 1.02 |
| Road B2B (same city) | 0.96 | 1.03 |
| Road B2B (cross-timezone) | 0.94 | 1.05 |
| 3rd game in 4 nights | 0.95 | 1.04 |
| Travel disruption (late arrival) | additional 0.98 | additional 1.03 |
| Cumulative fatigue cap | min 0.92 | max 1.08 |
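The fatigue table can be expressed as a lookup plus a clamp, as in this sketch. It assumes that when more than one scenario applies (for example a road back-to-back that is also the third game in four nights) the multipliers are multiplied together before the cumulative cap; the text does not state how scenarios stack. Type and function names are hypothetical.

```typescript
type FatigueScenario =
  | "HOME_B2B"
  | "ROAD_B2B_SAME_CITY"
  | "ROAD_B2B_CROSS_TZ"
  | "THIRD_IN_FOUR";

interface FatigueMultipliers {
  xgf: number;
  xga: number;
}

// Values copied from the fatigue penalties table.
const FATIGUE_TABLE: Record<FatigueScenario, FatigueMultipliers> = {
  HOME_B2B: { xgf: 0.97, xga: 1.02 },
  ROAD_B2B_SAME_CITY: { xgf: 0.96, xga: 1.03 },
  ROAD_B2B_CROSS_TZ: { xgf: 0.94, xga: 1.05 },
  THIRD_IN_FOUR: { xgf: 0.95, xga: 1.04 },
};

const TRAVEL_DISRUPTION: FatigueMultipliers = { xgf: 0.98, xga: 1.03 };
const XGF_MIN = 0.92; // cumulative fatigue cap on the xGF multiplier
const XGA_MAX = 1.08; // cumulative fatigue cap on the xGA multiplier

function fatigueMultipliers(
  scenarios: FatigueScenario[],
  travelDisruption: boolean
): FatigueMultipliers {
  let xgf = 1;
  let xga = 1;
  // Assumption: applicable scenarios stack multiplicatively.
  for (const s of scenarios) {
    xgf *= FATIGUE_TABLE[s].xgf;
    xga *= FATIGUE_TABLE[s].xga;
  }
  // Travel disruption is an additional multiplier on top of the schedule scenario.
  if (travelDisruption) {
    xgf *= TRAVEL_DISRUPTION.xgf;
    xga *= TRAVEL_DISRUPTION.xga;
  }
  return { xgf: Math.max(xgf, XGF_MIN), xga: Math.min(xga, XGA_MAX) };
}
```

Note that a cross-timezone road back-to-back with a late arrival already reaches the xGA cap, since 1.05 × 1.03 exceeds 1.08.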
Referee adjustment:
ref_penalty_factor = ref_avg_penalties / league_avg_penalties
adjusted_PP_opps = base_PP_opps * ref_penalty_factor
Bounds:
interface NHLIntelAdjustment {
  id: string; // UUID
  game_id: string; // NHL API game ID
  team: string; // 3-letter code
  pass: "morning" | "afternoon" | "pregame" | "westcoast" | "emergency";
  timestamp: string; // ISO 8601
  finding_type: "goalie_confirmed" | "goalie_change" | "player_out" | "player_in" |
    "line_change" | "travel_disruption" | "callup" | "trade" |
    "referee_assignment" | "motivation_context";
  severity: "CRITICAL" | "MODERATE" | "CONTEXT";
  status: "CONFIRMED" | "EXPECTED" | "UNCONFIRMED" | "CONFLICT";
  player_name: string | null;
  player_position: "G" | "F" | "D" | null;
  player_xgf_share: number | null;
  adjustment_type: "xGF" | "xGA" | "SV%" | "PP%" | "full_recompute" | "sigma" | null;
  adjustment_magnitude: number | null; // multiplier (e.g., 0.94 = 6% reduction)
  adjustment_confidence: "HIGH" | "MEDIUM" | "LOW";
  source: string;
  source_tier: number; // 1-9 from hierarchy
  source_url: string | null;
  raw_text: string;
  auto_apply: boolean;
  supersedes: string | null; // ID of finding this replaces
  invalidated: boolean;
  invalidated_by: string | null;
  // Audit fields
  pre_adjustment_value: number | null;
  post_adjustment_value: number | null;
  applied_to_model: boolean;
  applied_timestamp: string | null;
}
Commands:
- /freeze NHL: stops all auto-adjustments. Findings stored as QUEUED_FROZEN.
- /unfreeze NHL: resumes. Queued findings require explicit /apply queue NHL.
- /rollback [id]: reverts specific adjustment, re-applies remaining chain.

Late-breaking news after analyst submission:
Post-mortem integration: Weekly review tracks which adjustments were correct, which sources were reliable. Source tiers updated from empirical results.
Cost budget: No fixed budget. Build it right, measure actual costs after running, then adjust from there.
Market as intelligence: Yellow light only. Log unexplained line movement as a CONFLICT flag on the matchup card. Alert boss on Telegram. Do NOT auto-adjust model numbers — let boss decide whether to bet or skip that game.
Non-contact jersey rule: Yes. Add explicit extraction instruction to look for "non-contact jersey" mentions in morning skate reports.
In-play research: Pre-game only for now. No live/in-game research layer.
Beat reporter list: Full 32-team list from day one.
Kalshi liquidity check: Deferred. Focus is on accurate prediction-making (data collection + research). Execution-layer concerns like Kalshi liquidity come later.
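The "market as intelligence" decision above (flag, alert, never auto-adjust) could look roughly like this. Everything named here is hypothetical: the `MatchupCard` and `LineMove` shapes, the function name, and the 8-cent default threshold, which simply reuses the example figure from the blind-spot discussion. The Telegram alert is represented only as a boolean for the caller to act on.

```typescript
// Hypothetical shapes for this sketch.
interface LineMove {
  gameId: string;
  centsMoved: number; // signed moneyline move in cents
}

interface MatchupCard {
  gameId: string;
  flags: string[];
  modelXgf: number; // stands in for the model numbers on the card
}

interface FlagResult {
  card: MatchupCard;
  alertBoss: boolean; // caller sends the Telegram alert when true
}

function flagUnexplainedMove(
  card: MatchupCard,
  move: LineMove,
  explainedByFinding: boolean,
  thresholdCents: number = 8 // assumed threshold, not specified by the decision
): FlagResult {
  const unexplained =
    move.gameId === card.gameId &&
    !explainedByFinding &&
    Math.abs(move.centsMoved) >= thresholdCents;
  if (!unexplained) {
    return { card: card, alertBoss: false };
  }
  // Yellow light only: add the CONFLICT flag and leave model numbers untouched.
  const flagged: MatchupCard = {
    gameId: card.gameId,
    flags: card.flags.concat(["CONFLICT"]),
    modelXgf: card.modelXgf,
  };
  return { card: flagged, alertBoss: true };
}
```

Returning a copy of the card rather than mutating it keeps the "do not auto-adjust" rule easy to audit: the model fields are carried over verbatim.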
| Detail | Value |
|---|---|
| Council date | 2026-04-01 |
| Advisory responses | 5 (all completed) |
| Peer reviews | 5 (all completed) |
| Strongest advisor | Opus (4/5 votes) |
| Biggest blind spots | Grok (coach hierarchy backwards), gpt-oss (assumes human staff) |
| Full council data | /home/ubuntu/edgeclaw/data/councils/2026-04-01/nhl-research/ |