Gaming & Esports Data Audit — Council Ruling

Date: 2026-04-01
Process: Full 5-phase council (Advisory → Anonymization → Peer Review → Chairman Synthesis → Boss Ruling)
Advisors: Opus, Sonnet, Gemini 3.1 Pro, Grok 4.20 Reasoning, gpt-oss-120b
Winner: Opus (tied with Gemini at 1 genuine vote each; Sonnet endorsement of Opus breaks the tie; deepest analysis at 38K chars)
Status: PENDING BOSS RULING on open questions


COUNCIL SUMMARY

Where Advisors Agreed

  1. Map veto history is #1 missing data — feeds Map Veto MC but currently lacks structured training data
  2. Player stats need normalization — raw HLTV/VLR/OE data must map to unified schema per game
  3. Patch version tracking with meta scores is critical — PSS model needs structured patch DB
  4. Roster change history with dates and RCIS — currently no structured timeline
  5. LAN vs online split must be tracked per team — OLAF needs structured data source
  6. Cross-game Elo normalization is low-value — markets don't cross, player pools barely overlap
  7. FIFA and CoD should be deferred — limited data, thin markets, scanner-only approach
  8. CS2 and LoL have best data quality — HLTV and Oracle's Elixir are comprehensive
  9. Scrim intelligence needs confidence scoring — unreliable by nature, must weight by source
  10. Non-competitive markets (Game Awards, Steam) need completely different approach — no sharp book anchor

Where Advisors Disagreed

  1. Schema complexity: gpt-oss proposed enterprise star schema (Dim/Fact tables). Opus proposed lean, game-specific extensions. Council verdict: Lean schema with game-specific JSON columns where needed.
  2. Data acquisition philosophy: Gemini pushed for commercial feeds (GRID, Bayes). Others relied on public scraping. Council verdict: Start with public scraping, evaluate commercial feeds if latency proves to be a bottleneck.
  3. Demo/replay parsing: Gemini proposed automated demo parsing as primary data moat. Others didn't prioritize. Council verdict: Investigate as Phase 2 — high potential but significant engineering overhead.
  4. Scrim intelligence: Varying confidence in scrim data value. Council verdict: Track but with composite confidence score (reliability × recency × corroboration); ignore below 0.3 threshold.
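
The scrim verdict above can be sketched as a small scoring function. This is a minimal illustration, not the council's implementation: it assumes each factor is already scaled to the range 0 to 1, and the `ScrimReport` class and its field semantics are hypothetical. Only the product form and the 0.3 cutoff come from the verdict.

```python
from dataclasses import dataclass

MIN_CONFIDENCE = 0.3  # cutoff from the council verdict


@dataclass
class ScrimReport:
    # All three fields are assumed to be pre-scaled to [0, 1] (hypothetical).
    reliability: float    # track record of the source
    recency: float        # 1.0 = reported today, decaying toward 0
    corroboration: float  # share of independent sources that agree


def scrim_confidence(report: ScrimReport) -> float:
    """Composite score: reliability x recency x corroboration."""
    for name in ("reliability", "recency", "corroboration"):
        value = getattr(report, name)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return report.reliability * report.recency * report.corroboration


def is_usable(report: ScrimReport) -> bool:
    """Reports scoring below the threshold are ignored."""
    return scrim_confidence(report) >= MIN_CONFIDENCE
```

Because the score is a product, one weak factor sinks the whole report: a fully reliable, fresh source with corroboration of 0.2 still falls below the cutoff.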

Strongest Arguments (from peer review)

Opus wins with the most operationally actionable analysis:

Biggest Blind Spot

Pipeline fragility on patch days — When publishers update games, data telemetry, replay file structures, and APIs often break. Patch Day 1 is when market inefficiencies peak AND when data pipelines are most likely to be down. No advisor addressed failover strategies or "skeleton models" for the 48-hour period when primary data sources break after a patch.

What Everyone Missed (from peer reviews)

  1. Player props / SGP covariance — In-game resources (kills, gold) are finite and negatively correlated between teammates. Books price player props as independent. Covariance matrix for intra-team resource distribution = high-yield alpha.
  2. Patch-aware temporal data versioning — Hero matchup matrix from patch 7.33 is poisonous on 7.34. Need "as-of" date queries, patch-boundary training splits, no future data leakage in backtests.
  3. Match-fixing detection as upstream data product — Need suspicion_score on every match, not just as a kill switch. Especially in lower-tier CS2, FIFA, CoD.
  4. Dead rubber/strat-hiding — Teams in locked positions intentionally lose or test strategies. Match Importance Multiplier needed.
  5. Real-time streaming ingestion — Batch ETL (nightly scrapes) means always behind books with real-time feeds. Need streaming layer for post-match and live data.
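
Item 2 above (patch-aware versioning) can be illustrated with two helpers: an "as-of" filter and a patch-boundary train/test split. This is a sketch under stated assumptions: matches are plain dicts with `date` and `patch_version` keys (names borrowed from the esports_matches columns), and the boundary is taken as the first match date seen on the test patch. Neither helper is from the council material.

```python
from datetime import date


def as_of(matches, cutoff):
    """Return only matches known on or before the cutoff date."""
    return [m for m in matches if m["date"] <= cutoff]


def patch_boundary_split(matches, test_patch):
    """Split so the test set is one patch and training data strictly precedes it.

    Dates are compared rather than version strings, because version strings
    do not sort reliably across games.
    """
    test = [m for m in matches if m["patch_version"] == test_patch]
    if not test:
        raise ValueError(f"no matches found on patch {test_patch}")
    boundary = min(m["date"] for m in test)
    train = [
        m for m in matches
        if m["date"] < boundary and m["patch_version"] != test_patch
    ]
    return train, test


# Made-up sample rows for illustration only.
sample = [
    {"match_id": 1, "date": date(2026, 1, 5), "patch_version": "7.33"},
    {"match_id": 2, "date": date(2026, 1, 20), "patch_version": "7.33"},
    {"match_id": 3, "date": date(2026, 2, 2), "patch_version": "7.34"},
    {"match_id": 4, "date": date(2026, 2, 3), "patch_version": "7.33"},  # late match on old patch
    {"match_id": 5, "date": date(2026, 2, 10), "patch_version": "7.34"},
]
train, test = patch_boundary_split(sample, "7.34")
```

Match 4 is excluded from training even though it was played on the old patch, since it falls after the boundary and would leak future information into a backtest.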

BUILD PLAN

Phase 1: Core Data Tables

esports_teams: team_id, name, game, region, tier (1-3), roster_stability_days, lan_wr, online_wr, olaf_factor, active, updated_at
esports_players: player_id, name, game, team_id, role, nationality, contract_status, faceit_elo, active
esports_matches: match_id, tournament_id, game, team_a_id, team_b_id, format (BO1/BO3/BO5), lan_online, server_region, date, patch_version, suspicion_score
esports_match_results: result_id, match_id, winner_id, map_score_a, map_score_b, total_rounds, duration_min, match_importance
esports_map_results: map_id, match_id, map_number, map_name, team_a_score, team_b_score, winner_id, first_blood_team, first_tower_team, side_scores (JSON), game_specific (JSON)
esports_map_vetoes: veto_id, match_id, veto_order, team_id, action (ban/pick/decider), map_name
esports_rosters: roster_id, team_id, player_id, role, joined_date, left_date, change_type (permanent/stand-in/loan), rcis_score
esports_patches: patch_id, game, version, release_date, depot_detected_date, severity, meta_fluidity_index, chaos_window_end, notes_parsed (JSON)
esports_patch_team_impact: impact_id, patch_id, team_id, pss_score, pre_patch_wr, post_patch_wr, affected_entities (JSON)
esports_player_stats: stat_id, match_id, map_id, player_id, game, kills, deaths, assists, rating, game_specific (JSON: ADR/KAST for CS2, ACS for Val, CS@15/GD@15 for LoL, GPM/XPM for Dota)
esports_tournaments: tournament_id, name, game, tier (S/A/B/C), format, prize_pool, start_date, end_date, lan_online, location
esports_brackets: bracket_id, tournament_id, round, match_id, seed_a, seed_b, bo_format
esports_odds: odds_id, match_id, market_type, book, selection, odds, timestamp
esports_scrim_intel: intel_id, team_a, team_b, source, confidence_score, date_reported, result_claimed, map, notes
esports_hero_agent_meta: meta_id, game, patch_version, entity_name, pick_rate, ban_rate, win_rate, tier, updated_at
esports_steam_noncomp: record_id, market_type (game_awards/steam_ranking/twitch), entity, metric_value, timestamp, notes
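
As a rough illustration of how two of these tables might look in practice, the sketch below creates esports_matches and esports_map_vetoes in an in-memory SQLite database. The column names come from the list above. The column types, constraints, choice of SQLite, and the `ban_counts` helper are all assumptions made for the example.

```python
import sqlite3

# Column names follow the Phase 1 list; types and constraints are assumed.
SCHEMA = """
CREATE TABLE esports_matches (
    match_id        INTEGER PRIMARY KEY,
    tournament_id   INTEGER,
    game            TEXT NOT NULL,
    team_a_id       INTEGER NOT NULL,
    team_b_id       INTEGER NOT NULL,
    format          TEXT CHECK (format IN ('BO1', 'BO3', 'BO5')),
    lan_online      TEXT,
    server_region   TEXT,
    date            TEXT,
    patch_version   TEXT,
    suspicion_score REAL
);
CREATE TABLE esports_map_vetoes (
    veto_id    INTEGER PRIMARY KEY,
    match_id   INTEGER NOT NULL REFERENCES esports_matches(match_id),
    veto_order INTEGER NOT NULL,
    team_id    INTEGER,  -- may be NULL for a decider map
    action     TEXT CHECK (action IN ('ban', 'pick', 'decider')),
    map_name   TEXT NOT NULL
);
"""


def create_db():
    """Create an in-memory database with the two sketch tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def ban_counts(conn, team_id):
    """Hypothetical helper: how often a team has banned each map."""
    rows = conn.execute(
        "SELECT map_name, COUNT(*) FROM esports_map_vetoes "
        "WHERE team_id = ? AND action = 'ban' "
        "GROUP BY map_name ORDER BY COUNT(*) DESC, map_name",
        (team_id,),
    ).fetchall()
    return dict(rows)
```

Per-team ban and pick frequencies of this kind are the sort of structured input the Map Veto MC currently lacks.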

Phase 2: Custom Metrics

Metric | Formula | Notes
Map Pool Depth | Count of maps with >45% WR in last 20 maps played | Broader = more flexible in veto
Forced Map Disadvantage Prob | From Map Veto MC: P(team plays worst map) | Key input to map winner scanner
Adaptation Speed | (Post-patch WR - Pre-patch WR) / field_avg_change | Needs 3-4 patch cycles to calibrate
Scrim Confidence Composite | reliability × recency × context × corroboration | Ignore below 0.3
Tournament Fatigue Index | matches_in_last_14d × travel_factor × timezone_shifts | Caps at empirical ceiling
Roster Synergy Score | Days_together × shared_maps × (1 - role_overlap_penalty) | Decays for stand-ins
Match Importance Multiplier | f(elimination_risk, seeding_impact, prize_differential) | Dead rubber = reduce sizing
Hero/Agent Meta Shift | Δ(pick_rate × win_rate) pre vs post patch per entity | Feeds PSS
Resource Distribution Covariance | Negative correlation matrix of kills/gold within team | Player prop pricing edge
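
Two of the metrics above are simple enough to sketch directly. The example below computes Map Pool Depth and Adaptation Speed as defined in the table. The input shapes (a list of `(map_name, won)` tuples ordered oldest to newest, and win rates as fractions) are assumptions, and no minimum sample size per map is applied because the table does not specify one.

```python
from collections import defaultdict


def map_pool_depth(recent_maps, window=20, wr_threshold=0.45):
    """Count maps with a win rate above the threshold in the last `window` maps.

    recent_maps: list of (map_name, won) tuples, oldest first (assumed shape).
    """
    played = defaultdict(int)
    won = defaultdict(int)
    for map_name, did_win in recent_maps[-window:]:
        played[map_name] += 1
        if did_win:
            won[map_name] += 1
    return sum(1 for name in played if won[name] / played[name] > wr_threshold)


def adaptation_speed(pre_patch_wr, post_patch_wr, field_avg_change):
    """(Post-patch WR - Pre-patch WR) / field_avg_change."""
    if field_avg_change == 0:
        raise ValueError("field_avg_change must be non-zero")
    return (post_patch_wr - pre_patch_wr) / field_avg_change
```

Note that a map played once and won counts toward depth under this literal reading, so a minimum-games filter may be worth adding once real data is available.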

Phase 3: 8 Edge Scanners

Scanner | Min Edge | Unique Logic
Match Winner | 4% | Elo × map pool × RCIS × PSS × OLAF; Kalshi favorite bias fade
Map Winner | 3% | Map-specific Elo × veto MC; highest edge concentration (3-8%)
Tournament Outright | 5% | Bracket MC (20K sims) × form × fatigue × patch timing
Handicap | 3% | Map spread from match prob; check book internal consistency
Total Maps/Rounds | 4% | Format-adjusted simulation; over/under calibration by BO format
First Blood/Tower | 5% | Game-specific aggression (pistol WR for CS2, first tower for LoL)
Game Awards | 6% | Nomination history + sentiment + voting pattern analysis; no SCL anchor
Steam Rankings | 6% | Player count trends + release calendar + marketing spend; Kalshi-only
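
The minimum-edge column can be read as a per-scanner gate. The sketch below shows one possible check. The thresholds are taken from the table. The scanner keys, the definition of edge as model probability minus the probability implied by decimal odds, and the absence of any vig removal are assumptions, since the ruling does not define how edge is measured.

```python
# Thresholds from the scanner table; the dictionary keys are hypothetical names.
MIN_EDGE = {
    "match_winner": 0.04,
    "map_winner": 0.03,
    "tournament_outright": 0.05,
    "handicap": 0.03,
    "total_maps_rounds": 0.04,
    "first_blood_tower": 0.05,
    "game_awards": 0.06,
    "steam_rankings": 0.06,
}


def implied_prob(decimal_odds):
    """Probability implied by decimal odds, with no vig adjustment (assumed)."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    return 1.0 / decimal_odds


def edge(model_prob, decimal_odds):
    """Edge in probability points: model estimate minus market-implied."""
    return model_prob - implied_prob(decimal_odds)


def passes_min_edge(scanner, model_prob, decimal_odds):
    """True when the edge clears the scanner's minimum threshold."""
    return edge(model_prob, decimal_odds) >= MIN_EDGE[scanner]
```

For example, a model probability of 0.60 against odds of 2.00 gives an edge of about 10 points and clears the Match Winner gate, while 0.55 against the same odds gives about 5 points and does not clear the Game Awards gate.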

Phase 4: Dashboard


OPEN QUESTIONS FOR BOSS RULING

  1. Demo parsing investment: Build automated CS2/Valorant/Dota demo parser for proprietary metrics? High value but significant engineering.
  2. Commercial data feeds: Start with GRID/Bayes Esports ($$$) or prove concept with public scraping first?
  3. Player props modeling: Build covariance matrix for intra-team resource distribution?
  4. Match-fixing detection: Build suspicion_score as upstream data product?
  5. FIFA and CoD scope: Scanner-only (market odds comparison) or skip entirely?
  6. Scrim intelligence: Worth tracking given inherent unreliability?
  7. Patch-day failover: Build skeleton models for when primary data pipelines break?

COUNCIL METADATA

Detail | Value
Council date | 2026-04-01
Advisory responses | 5 (all completed)
Peer reviews | 5 (all completed)
Strongest advisor | Opus (tied at 1/5 genuine; Sonnet endorsement breaks tie)
Runner-up | Gemini (1/5 genuine from Grok; strongest on architecture/commercial feeds)
Biggest blind spot | Pipeline fragility on patch days
Full council data | /home/ubuntu/edgeclaw/data/councils/2026-04-01/gaming-esports-data-audit/
Source: ~/edgeclaw/results/panel-results/gaming-esports-data-audit-ruling.md