Human Intel Desk — Data Collection Spec (Mar 28, 2026)

What This Desk Does

Monitors human crowd behavior (social media sentiment, expert predictions, and public betting patterns) to detect when crowd wisdom or crowd mania is creating mispricings on Kalshi. When experts unanimously agree on an outcome but the market is still pricing in uncertainty, that is a potential edge. When Reddit is in a frenzy over an outcome, the market may be overpricing it.


DATA SOURCES

Reddit Scraper (Social Signals)

Table: social_signals (551 rows)
Columns: source, subreddit, post_id, title, content_preview, score, comments, author
Schedule: Every 12 hours
What: Comment volume, upvotes, and sentiment from r/wallstreetbets, r/sportsbook, and other relevant subreddits. Crowd wisdom signal when calm, mania detection when volume spikes.
Status: COLLECTING
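The mania check described above could be sketched as a z-score on comment counts per 12-hour window. This is a minimal illustration, not the desk's actual detector: the function name is_volume_spike, the 3.0 threshold, and the sample counts are all assumptions. Only the idea of flagging volume spikes comes from this spec.

```python
from statistics import mean, stdev

def is_volume_spike(history, latest, z_threshold=3.0):
    """Return True when `latest` is more than `z_threshold` sample standard
    deviations above the mean of `history` (hypothetical rule)."""
    if len(history) < 2:
        return False  # not enough windows to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest > mu  # flat baseline: treat any increase as a spike
    return (latest - mu) / sigma > z_threshold

# Made-up comment counts for one subreddit over past 12-hour windows
baseline = [120, 135, 110, 128, 142, 117]
print(is_volume_spike(baseline, 131))  # ordinary window
print(is_volume_spike(baseline, 900))  # frenzy window
```

A real detector would likely need a longer baseline and per-subreddit thresholds, since r/wallstreetbets and r/sportsbook have very different normal volumes.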

ESPN Expert Picks

Table: expert_picks (0 rows)
Columns: source, sport, expert_name, game_date, home_team, away_team, pick, confidence
Schedule: 11 AM daily
What: Analyst game predictions with confidence levels and historical accuracy. When experts unanimously agree but Kalshi is pricing 50/50, the market is likely wrong.
Status: NOT COLLECTING (table empty, scraper may be broken)
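The "unanimous experts, 50/50 market" condition could be checked roughly as follows. This is a sketch under assumptions: the function name, the 0.10 band around 50%, and the three-expert minimum are hypothetical, and it assumes the pick column holds a team name and that the Kalshi price has already been converted to an implied probability for the home team.

```python
def unanimous_but_uncertain(picks, home_win_prob, band=0.10, min_experts=3):
    """Return the unanimous pick when the market is still near 50/50, else None.

    picks: list of team names, one per expert, for a single game.
    home_win_prob: market-implied probability (0..1) that the home team wins.
    """
    if len(picks) < min_experts:
        return None  # too few experts to call it a consensus
    if len(set(picks)) != 1:
        return None  # experts disagree
    if abs(home_win_prob - 0.5) > band:
        return None  # market has already taken a side
    return picks[0]

# Made-up example: three experts all pick BOS, market prices the game at 52%
print(unanimous_but_uncertain(["BOS", "BOS", "BOS"], 0.52))
print(unanimous_but_uncertain(["BOS", "NYK", "BOS"], 0.52))
```

The confidence column is ignored here. Weighting picks by confidence or by each expert's track record would be a natural extension once the table has data.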


ANALYST PANEL

| Role | Model | Notes |
|---|---|---|
| Data Collector | Llama 4 Maverick | Winner (56/60, 0 false positives). Gathers raw human signals |
| Signal Analyst | Grok 4.1 Fast | Winner (37/45, 0 dangerous). Interprets crowd signals |
| Contrarian | Sonnet 4.6 | Reused. Challenges the crowd thesis |
| Data Validator | DeepSeek R1 | Reused. Verifies data quality |
| Resolution Auditor | Gemini Flash | Reused from Settlement AI. Checks outcomes |

DATA GAPS

ESPN Expert Picks

Table: expert_picks (0 rows)
What: Scraper exists (scrape-espn-picks) but table is empty. Either the scraper is failing silently or ESPN changed their page structure.
Impact: Missing a key "expert consensus" signal. This was supposed to be the desk's strongest edge for sports markets.
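A silent scraper failure like this one could be surfaced with a row-count check after each scheduled run. The sketch below is illustrative only: it uses an in-memory SQLite database because this spec does not say which database the desk uses, the column types are guesses, and the helper names are hypothetical. Table and column names are taken from the spec.

```python
import sqlite3

KNOWN_TABLES = {"social_signals", "expert_picks"}

def check_not_empty(conn, table):
    """Report the row count for `table` and whether it holds any data."""
    # The table name is interpolated into SQL, so only allow known names.
    if table not in KNOWN_TABLES:
        raise ValueError(f"unexpected table: {table}")
    rows = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    return {"table": table, "rows": rows, "ok": rows > 0}

# Stand-in database (assumption: the real store may not be SQLite)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE expert_picks (source TEXT, sport TEXT, expert_name TEXT, "
    "game_date TEXT, home_team TEXT, away_team TEXT, pick TEXT, confidence REAL)"
)
conn.execute(
    "CREATE TABLE social_signals (source TEXT, subreddit TEXT, post_id TEXT, "
    "title TEXT, content_preview TEXT, score INTEGER, comments INTEGER, author TEXT)"
)
conn.execute(
    "INSERT INTO social_signals (source, subreddit, post_id) "
    "VALUES ('reddit', 'sportsbook', 'abc123')"
)

for name in ("social_signals", "expert_picks"):
    print(check_not_empty(conn, name))
```

A row count only catches the "never collected" case. Detecting a scraper that stops partway would need a timestamp column, which the listed schema does not include.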

Twitter/X Sentiment

What: Real-time tweet volume and sentiment around specific events, players, and teams. Volume spikes often precede prediction market moves.
Impact: Would complement Reddit data with faster, closer-to-real-time signals.
Status: Not currently collected.

Betting Public Percentages

What: Public betting percentages from Action Network, Covers, or similar. Shows what percentage of bets (not dollars) are on each side.
Impact: When the public is heavily on one side but sharp money disagrees, that's a classic fade opportunity.
Status: Not collected.
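The fade condition could be expressed as a gap between the share of bets and the share of dollars on the same side. This is a sketch under assumptions: using dollar share as a proxy for sharp money is not stated in this spec, and the function name and both thresholds (70% public share, 15-point gap) are hypothetical.

```python
def fade_candidate(bet_pct, money_pct, public_min=0.70, gap_min=0.15):
    """Return True when the public is heavy on one side but dollars lag behind.

    bet_pct: share of bets (tickets) on the side, 0..1.
    money_pct: share of dollars on the same side, 0..1 (assumed sharp-money proxy).
    """
    public_heavy = bet_pct >= public_min
    money_lags = (bet_pct - money_pct) >= gap_min
    return public_heavy and money_lags

# Made-up numbers: 78% of bets but only 55% of dollars on the favorite
print(fade_candidate(0.78, 0.55))
# 78% of bets and 75% of dollars: public and money agree, no fade
print(fade_candidate(0.78, 0.75))
```

Sources that publish only bet percentages would not support this check, so the collector would need a feed that reports both figures.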

Podcast/Media Mentions

What: Frequency analysis of topic mentions across sports media, financial media, and podcasts.
Impact: Early detection of narrative shifts that move prediction markets. Low priority.


COLLECTION SCHEDULE

| Data Type | Frequency | Source |
|---|---|---|
| Reddit sentiment | Every 12h | Reddit API (free) |
| ESPN Expert Picks | 11 AM daily | ESPN scrape (NOT COLLECTING) |
| Twitter/X sentiment | TBD | NOT BUILT |
| Betting public % | TBD | NOT BUILT |

Source: ~/.claude/projects/-home-ubuntu-edgeclaw/memory/human-intel-data-inventory.md