Gaming & Esports Data Audit — Council Ruling

Date: 2026-04-01
Process: Full 5-phase council (Advisory → Anonymization → Peer Review → Chairman Synthesis → Boss Ruling)
Advisors: Opus, Sonnet, Gemini 3.1 Pro, Grok 4.20 Reasoning, gpt-oss-120b
Winner: Opus (tied with Gemini at 1 genuine vote each; Sonnet endorsement of Opus breaks tie; deepest analysis at 38K chars)
Status: PENDING BOSS RULING on open questions

COUNCIL SUMMARY

Where Advisors Agreed

Map veto history is #1 missing data — feeds Map Veto MC but currently lacks structured training data
Player stats need normalization — raw HLTV/VLR/OE data must map to unified schema per game
Patch version tracking with meta scores is critical — PSS model needs structured patch DB
Roster change history with dates and RCIS — currently no structured timeline
LAN vs online split must be tracked per team — OLAF needs structured data source
Cross-game Elo normalization is low-value — markets don't cross, player pools barely overlap
FIFA and CoD should be deferred — limited data, thin markets, scanner-only approach
CS2 and LoL have best data quality — HLTV and Oracle's Elixir are comprehensive
Scrim intelligence needs confidence scoring — unreliable by nature, must weight by source
Non-competitive markets (Game Awards, Steam) need completely different approach — no sharp book anchor

Where Advisors Disagreed

Schema complexity: gpt-oss proposed enterprise star schema (Dim/Fact tables). Opus proposed lean, game-specific extensions. Council verdict: Lean schema with game-specific JSON columns where needed.
Data acquisition philosophy: Gemini pushed for commercial feeds (GRID, Bayes). Others relied on public scraping. Council verdict: Start with public scraping, evaluate commercial feeds if latency proves to be a bottleneck.
Demo/replay parsing: Gemini proposed automated demo parsing as primary data moat. Others didn't prioritize. Council verdict: Investigate as Phase 2 — high potential but significant engineering overhead.
Scrim intelligence: Varying confidence in scrim data value. Council verdict: Track but with composite confidence score (reliability × recency × corroboration); ignore below 0.3 threshold.
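The scrim verdict can be sketched as a small scoring helper. This is a minimal illustration, not the council's implementation: the class and function names are hypothetical, and the 0 to 1 scale of each factor is an assumption. The Phase 2 metrics table lists an additional context factor, so it is included here as an optional multiplier that defaults to 1.0.

```python
from dataclasses import dataclass

MIN_CONFIDENCE = 0.3  # threshold stated in the council verdict


@dataclass
class ScrimReport:
    """One scrim rumor; every factor is assumed to lie in [0, 1]."""
    reliability: float    # track record of the source
    recency: float        # 1.0 = fresh, decaying toward 0 with age
    corroboration: float  # agreement among independent sources
    context: float = 1.0  # optional factor from the Phase 2 metrics table


def scrim_confidence(report: ScrimReport) -> float:
    """Multiply the factors into a single composite score."""
    return (report.reliability * report.recency
            * report.context * report.corroboration)


def is_usable(report: ScrimReport) -> bool:
    """Reports scoring below the threshold are ignored."""
    return scrim_confidence(report) >= MIN_CONFIDENCE
```

Because the factors multiply, a single weak factor (for example an uncorroborated claim) pulls the whole score down, which matches the intent of weighting scrim data by source quality.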

Strongest Arguments (from peer review)

Opus wins with the most operationally actionable analysis:

Explicitly deprioritized low-value ideas (cross-game Elo) with reasoning
Game-specific analysis identifies exact missing data per source (HLTV shows vetoes on match pages but not structured feeds)
Custom metric formulas with actual thresholds (scrim confidence composite, adaptation speed with sample-size caveat)
Match card design with exact field layouts and example values
Tiered approach: full cards for CS2/LoL/Valorant/Dota, simplified for FIFA/CoD
Priority ranking by expected ROI with conviction

Biggest Blind Spot

Pipeline fragility on patch days — When publishers update games, data telemetry, replay file structures, and APIs often break. Patch Day 1 is when market inefficiencies peak AND when data pipelines are most likely to be down. No advisor addressed failover strategies or "skeleton models" for the 48-hour period when primary data sources break after a patch.

What Everyone Missed (from peer reviews)

Player props / SGP covariance — In-game resources (kills, gold) are finite and negatively correlated between teammates. Books price player props as independent. Covariance matrix for intra-team resource distribution = high-yield alpha.
Patch-aware temporal data versioning — Hero matchup matrix from patch 7.33 is poisonous on 7.34. Need "as-of" date queries, patch-boundary training splits, no future data leakage in backtests.
Match-fixing detection as upstream data product — Need suspicion_score on every match, not just as a kill switch. Especially in lower-tier CS2, FIFA, CoD.
Dead rubber/strat-hiding — Teams in locked positions intentionally lose or test strategies. Match Importance Multiplier needed.
Real-time streaming ingestion — Batch ETL (nightly scrapes) means always behind books with real-time feeds. Need streaming layer for post-match and live data.
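The patch-aware versioning point can be illustrated with an "as-of" filter. This is a rough sketch under stated assumptions: the release dates below are made up for illustration, the row layout (dicts with `date` and `patch_version` keys) is hypothetical, and the function names are not from any existing codebase.

```python
from datetime import date
from typing import Optional

# Hypothetical release dates, for illustration only (not real patch dates).
PATCHES = [
    ("7.33", date(2026, 1, 10)),
    ("7.34", date(2026, 3, 1)),
]


def patch_as_of(day: date, patches=PATCHES) -> Optional[str]:
    """Return the latest patch released on or before `day`, else None."""
    current = None
    for version, released in sorted(patches, key=lambda p: p[1]):
        if released <= day:
            current = version
    return current


def as_of(rows, cutoff: date, patches=PATCHES):
    """Keep rows observed on or before `cutoff` that belong to the patch
    live at `cutoff`. Later rows are dropped (no future leakage) and
    rows from older patches are dropped (no stale meta)."""
    live_patch = patch_as_of(cutoff, patches)
    return [
        r for r in rows
        if r["date"] <= cutoff and r["patch_version"] == live_patch
    ]
```

A backtest that builds its training set through `as_of(rows, match_date)` would never see data recorded after the match, and would not mix 7.33 matchup data into a 7.34 prediction.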

BUILD PLAN

Phase 1: Core Data Tables

esports_teams: team_id, name, game, region, tier (1-3), roster_stability_days, lan_wr, online_wr, olaf_factor, active, updated_at
esports_players: player_id, name, game, team_id, role, nationality, contract_status, faceit_elo, active
esports_matches: match_id, tournament_id, game, team_a_id, team_b_id, format (BO1/BO3/BO5), lan_online, server_region, date, patch_version, suspicion_score
esports_match_results: result_id, match_id, winner_id, map_score_a, map_score_b, total_rounds, duration_min, match_importance
esports_map_results: map_id, match_id, map_number, map_name, team_a_score, team_b_score, winner_id, first_blood_team, first_tower_team, side_scores (JSON), game_specific (JSON)
esports_map_vetoes: veto_id, match_id, veto_order, team_id, action (ban/pick/decider), map_name
esports_rosters: roster_id, team_id, player_id, role, joined_date, left_date, change_type (permanent/stand-in/loan), rcis_score
esports_patches: patch_id, game, version, release_date, depot_detected_date, severity, meta_fluidity_index, chaos_window_end, notes_parsed (JSON)
esports_patch_team_impact: impact_id, patch_id, team_id, pss_score, pre_patch_wr, post_patch_wr, affected_entities (JSON)
esports_player_stats: stat_id, match_id, map_id, player_id, game, kills, deaths, assists, rating, game_specific (JSON: ADR/KAST for CS2, ACS for Val, CS@15/GD@15 for LoL, GPM/XPM for Dota)
esports_tournaments: tournament_id, name, game, tier (S/A/B/C), format, prize_pool, start_date, end_date, lan_online, location
esports_brackets: bracket_id, tournament_id, round, match_id, seed_a, seed_b, bo_format
esports_odds: odds_id, match_id, market_type, book, selection, odds, timestamp
esports_scrim_intel: intel_id, team_a, team_b, source, confidence_score, date_reported, result_claimed, map, notes
esports_hero_agent_meta: meta_id, game, patch_version, entity_name, pick_rate, ban_rate, win_rate, tier, updated_at
esports_steam_noncomp: record_id, market_type (game_awards/steam_ranking/twitch), entity, metric_value, timestamp, notes
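As a concrete starting point, two of the tables above could be declared as follows. This is a sketch only: the column names come from the list, but the column types, the nullable team_id on vetoes, the uniqueness rule on (match_id, veto_order), and the choice of SQLite are all assumptions made for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE esports_matches (
    match_id        INTEGER PRIMARY KEY,
    tournament_id   INTEGER,
    game            TEXT NOT NULL,
    team_a_id       INTEGER NOT NULL,
    team_b_id       INTEGER NOT NULL,
    format          TEXT CHECK (format IN ('BO1', 'BO3', 'BO5')),
    lan_online      TEXT,
    server_region   TEXT,
    date            TEXT,   -- ISO 8601 string (assumed storage format)
    patch_version   TEXT,
    suspicion_score REAL
);
CREATE TABLE esports_map_vetoes (
    veto_id    INTEGER PRIMARY KEY,
    match_id   INTEGER NOT NULL REFERENCES esports_matches(match_id),
    veto_order INTEGER NOT NULL,
    team_id    INTEGER,     -- assumed NULL when no team chose the decider
    "action"   TEXT NOT NULL CHECK ("action" IN ('ban', 'pick', 'decider')),
    map_name   TEXT NOT NULL,
    UNIQUE (match_id, veto_order)
);
"""


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the two illustrative tables on an open connection."""
    conn.executescript(SCHEMA)


def veto_sequence(conn: sqlite3.Connection, match_id: int):
    """Return a match's veto steps in the order they happened."""
    cur = conn.execute(
        """SELECT veto_order, team_id, "action", map_name
           FROM esports_map_vetoes
           WHERE match_id = ?
           ORDER BY veto_order""",
        (match_id,),
    )
    return cur.fetchall()
```

Storing each veto step as its own ordered row keeps the sequence queryable, which is the structured form the Map Veto MC is described as lacking.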

Phase 2: Custom Metrics

Metric	Formula	Notes
Map Pool Depth	Count of maps with >45% WR in last 20 maps played	Broader = more flexible in veto
Forced Map Disadvantage Prob	From Map Veto MC: P(team plays worst map)	Key input to map winner scanner
Adaptation Speed	(Post-patch WR - Pre-patch WR) / field_avg_change	Needs 3-4 patch cycles to calibrate
Scrim Confidence Composite	reliability × recency × context × corroboration	Ignore below 0.3
Tournament Fatigue Index	matches_in_last_14d × travel_factor × timezone_shifts	Caps at empirical ceiling
Roster Synergy Score	Days_together × shared_maps × (1 - role_overlap_penalty)	Decays for stand-ins
Match Importance Multiplier	f(elimination_risk, seeding_impact, prize_differential)	Dead rubber = reduce sizing
Hero/Agent Meta Shift	Δ(pick_rate × win_rate) pre vs post patch per entity	Feeds PSS
Resource Distribution Covariance	Negative correlation matrix of kills/gold within team	Player prop pricing edge
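Two of the simpler formulas in the table, Map Pool Depth and Adaptation Speed, can be sketched directly. The function names and input shapes are hypothetical, and this version applies no minimum sample size per map, so a map played once and won counts toward depth; a production version would likely add such a floor.

```python
from collections import defaultdict
from typing import Optional


def map_pool_depth(history, window: int = 20, min_wr: float = 0.45) -> int:
    """Count maps with win rate above `min_wr` over the last `window` maps.

    `history` is a list of (map_name, won) tuples, oldest first.
    """
    played = defaultdict(int)
    wins = defaultdict(int)
    for map_name, won in history[-window:]:
        played[map_name] += 1
        wins[map_name] += int(won)
    return sum(1 for m in played if wins[m] / played[m] > min_wr)


def adaptation_speed(pre_wr: float, post_wr: float,
                     field_avg_change: float) -> Optional[float]:
    """(Post-patch WR - Pre-patch WR) / field_avg_change.

    Returns None when the field average change is zero, since the
    ratio is undefined in that case (handling chosen for this sketch).
    """
    if field_avg_change == 0:
        return None
    return (post_wr - pre_wr) / field_avg_change
```

For example, a team that moved from a 50% to a 56% win rate across a patch where the field average moved by 3 points would score roughly 2.0, meaning it adapted about twice as fast as the field.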

Phase 3: 8 Edge Scanners

Scanner	Min Edge	Unique Logic
Match Winner	4%	Elo × map pool × RCIS × PSS × OLAF; Kalshi favorite bias fade
Map Winner	3%	Map-specific Elo × veto MC; highest edge concentration (3-8%)
Tournament Outright	5%	Bracket MC (20K sims) × form × fatigue × patch timing
Handicap	3%	Map spread from match prob; check book internal consistency
Total Maps/Rounds	4%	Format-adjusted simulation; over/under calibration by BO format
First Blood/Tower	5%	Game-specific aggression (pistol WR for CS2, first tower for LoL)
Game Awards	6%	Nomination history + sentiment + voting pattern analysis; no SCL anchor
Steam Rankings	6%	Player count trends + release calendar + marketing spend; Kalshi-only
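The Min Edge column can be applied as a simple gate on each candidate bet. The thresholds below are taken from the table; everything else is an assumption for illustration: the scanner keys, the function names, and the definition of edge as model probability minus the raw implied probability of decimal odds (with no vig removal).

```python
# Minimum edge per scanner, copied from the table above.
MIN_EDGE = {
    "match_winner": 0.04,
    "map_winner": 0.03,
    "tournament_outright": 0.05,
    "handicap": 0.03,
    "total_maps_rounds": 0.04,
    "first_blood_tower": 0.05,
    "game_awards": 0.06,
    "steam_rankings": 0.06,
}


def implied_prob(decimal_odds: float) -> float:
    """Raw implied probability of decimal odds (vig not removed)."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    return 1.0 / decimal_odds


def edge(model_prob: float, decimal_odds: float) -> float:
    """Assumed edge definition: model probability minus implied probability."""
    return model_prob - implied_prob(decimal_odds)


def passes_min_edge(scanner: str, model_prob: float,
                    decimal_odds: float) -> bool:
    """True when the edge meets the scanner's minimum threshold."""
    return edge(model_prob, decimal_odds) >= MIN_EDGE[scanner]
```

With a model probability of 0.55 against decimal odds of 2.0, the edge is about 5 points: enough for the Map Winner scanner (3%) but not for Game Awards (6%).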

Phase 4: Dashboard

Match board with roster status, patch context, edge alerts
Team drill-down with map pool heatmap, roster timeline, RCIS history
Map veto simulator with real-time probability
Patch tracker with chaos window countdown and team impact rankings
Meta monitor showing hero/agent pick/ban/win rate shifts
Scrim intelligence feed with confidence scoring
Tournament bracket with live advancement probabilities
Non-competitive market tracker (Game Awards, Steam)
P&L by game, market type, edge bucket

OPEN QUESTIONS FOR BOSS RULING

Demo parsing investment: Build automated CS2/Valorant/Dota demo parser for proprietary metrics? High value but significant engineering.
Commercial data feeds: Start with GRID/Bayes Esports ($$$) or prove concept with public scraping first?
Player props modeling: Build covariance matrix for intra-team resource distribution?
Match-fixing detection: Build suspicion_score as upstream data product?
FIFA and CoD scope: Scanner-only (market odds comparison) or skip entirely?
Scrim intelligence: Worth tracking given inherent unreliability?
Patch-day failover: Build skeleton models for when primary data pipelines break?

COUNCIL METADATA

Detail	Value
Council date	2026-04-01
Advisory responses	5 (all completed)
Peer reviews	5 (all completed)
Strongest advisor	Opus (tied at 1/5 genuine — Sonnet endorsement breaks tie)
Runner-up	Gemini (1/5 genuine from Grok — strongest on architecture/commercial feeds)
Biggest blind spot	Pipeline fragility on patch days
Full council data	`/home/ubuntu/edgeclaw/data/councils/2026-04-01/gaming-esports-data-audit/`

Source: ~/edgeclaw/results/panel-results/gaming-esports-data-audit-ruling.md