Research Pipeline — Build Plan (Mar 14, 2026)

For the Coding Agent

You are building a multi-AI research pipeline inside EdgeClaw (TypeScript, Hono server, tsx runtime). The pipeline collects data, sends it to AI researchers, routes the results through analysts and judges, and produces trading predictions scored by Brier score.

Read these docs before writing any line of code:

Existing codebase to reuse:

Tech stack: TypeScript, Node.js, better-sqlite3, Hono (HTTP server), tsx runtime.
Server: Oracle ARM/aarch64, 4 OCPUs, 23GB RAM, 194GB disk (152GB free). Vultr is closing and Phoenix has been merged; EdgeClaw is now the single server.
LLM routing: direct xAI API for Grok; OpenRouter for DeepSeek, Qwen, Flash, Flash Lite, Gemini Pro, and Sonar Pro; Cloudflare proxy for Gemini (geo-block bypass); Anthropic key for Claude Sonnet/Opus.


PHASE 0: Shared Infrastructure (Build First — Everything Depends on This)

These are shared modules that multiple desks need. Build once, use everywhere.

0.1 — Central Database Schema

Create a single SQLite database for the research pipeline: data/db/research-pipeline.db

Tables needed:

Each data collection desk will have its own tables (defined in their spec docs). But the above tables are universal.

0.2 — Prediction Market Adapter

One module for Kalshi + Polymarket API access. All desks share this.

Kalshi:

Polymarket:

Output: Unified market snapshot format regardless of source platform.

Collect: prices, volume, order book depth, open interest. Filtering is per desk; see each desk's spec.
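
The unified snapshot could look like the sketch below. Field names and the raw Kalshi payload shape are illustrative assumptions, not the adapter spec:

```typescript
// Sketch of a unified snapshot type; field names are illustrative,
// not taken from the adapter spec.
export interface MarketSnapshot {
  platform: "kalshi" | "polymarket";
  marketId: string;
  question: string;
  yesPrice: number;            // implied probability, 0..1
  volume24h: number;
  openInterest: number | null; // not all platforms expose this
  bookDepth: { bidSize: number; askSize: number } | null;
  capturedAt: string;          // ISO 8601
}

// Hypothetical normalizer for a Kalshi-style raw payload (cents-priced).
export function fromKalshi(raw: {
  ticker: string;
  title: string;
  yes_bid: number; // cents
  volume_24h: number;
  open_interest: number;
}): MarketSnapshot {
  return {
    platform: "kalshi",
    marketId: raw.ticker,
    question: raw.title,
    yesPrice: raw.yes_bid / 100, // Kalshi quotes prices in cents
    volume24h: raw.volume_24h,
    openInterest: raw.open_interest,
    bookDepth: null,
    capturedAt: new Date().toISOString(),
  };
}
```

A matching `fromPolymarket` would map that platform's fields onto the same type, so every desk downstream consumes one shape.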

0.3 — FRED Data Connector

One module that pulls ALL FRED series used by any desk (Options, Stocks, Futures, Forex, Crypto).

Key series: SOFR, Treasury yields (DGS2, DGS10, DGS30), WALCL (Fed balance sheet), RRPONTSYD (Reverse Repo), BAMLH0A0HYM2 (HY spread), BAMLC0A0CM (IG spread), MOVE index, DXY. Note: MOVE and DXY are not distributed by FRED; plan a proxy (e.g. FRED's DTWEXBGS broad dollar index) or a separate source for those two.

Frequency: Daily pull at 6:30 AM ET. Store in shared table fred_data(series_id, date, value).

API key: Free registration at https://fred.stlouisfed.org/docs/api/api_key.html
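
A minimal pull sketch, using the public FRED `series/observations` endpoint; the `store` callback stands in for a prepared better-sqlite3 insert into `fred_data(series_id, date, value)`:

```typescript
// Public FRED observations endpoint.
const FRED_BASE = "https://api.stlouisfed.org/fred/series/observations";

export function fredUrl(seriesId: string, apiKey: string): string {
  const params = new URLSearchParams({
    series_id: seriesId,
    api_key: apiKey,
    file_type: "json",
  });
  return `${FRED_BASE}?${params}`;
}

// One series per call; run over the full series list in the 6:30 AM ET cron.
export async function pullSeries(
  seriesId: string,
  apiKey: string,
  store: (seriesId: string, date: string, value: number) => void,
): Promise<void> {
  const res = await fetch(fredUrl(seriesId, apiKey));
  const body = (await res.json()) as {
    observations: { date: string; value: string }[];
  };
  for (const obs of body.observations) {
    if (obs.value === ".") continue; // FRED encodes missing values as "."
    store(seriesId, obs.date, Number(obs.value));
  }
}
```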

0.4 — VIX/Volatility Suite

One module: VIX, VVIX, SKEW, VIX9D, VIX3M, VIX6M, VIX futures term structure.

Source: CBOE (free delayed data).
Frequency: Every 15 minutes for VIX/futures during market hours; daily for the rest.
Store in: volatility_data(metric, timestamp, value)

0.5 — SEC EDGAR Scraper

One module for all SEC filing types used by Options + Stocks desks.

Filing types: Form 4, Form 144, 13F, 13D/13G, N-PORT, 10-K/10-Q (XBRL), CORRESP, 424B2.
Method: EDGAR full-text search + RSS feeds. Free, no auth.
Frequency: Daily scan at 6:30 PM ET (filings drop after market close).
Store in: Per-filing-type tables.
Note: 424B2 parsing requires NLP; defer to Phase 3.
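
A query-builder sketch for the full-text search scan. The efts.sec.gov endpoint and the `forms` filter reflect EDGAR's public full-text search API, but treat the exact parameter names as an assumption to verify; the descriptive User-Agent is required by SEC on all requests:

```typescript
// EDGAR full-text search endpoint (assumed parameter names; verify
// against the live API before relying on them).
const EDGAR_FTS = "https://efts.sec.gov/LATEST/search-index";

export function edgarSearchUrl(query: string, formType: string): string {
  const params = new URLSearchParams({ q: query, forms: formType });
  return `${EDGAR_FTS}?${params}`;
}

export async function searchFilings(query: string, formType: string) {
  const res = await fetch(edgarSearchUrl(query, formType), {
    // SEC requires a descriptive User-Agent identifying the requester.
    headers: { "User-Agent": "EdgeClaw research-pipeline admin@example.com" },
  });
  return res.json();
}
```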

0.6 — Odds/Sportsbook Scraper

One module for Pinnacle + FanDuel + DraftKings odds.

Primary: Firecrawl (self-hosted) or direct scraping.
Fallback: The Odds API (key: f63a46439d104a3a78dee17580c96279). Rate limit: 500 calls/month.
Quota manager: Track usage across all sports desks. Priority order: closing odds > early odds > mid-day refresh.
FanDuel prop lines: Separate collection for player props anchor (see sports-desk-data-inventory.md Player Props section).
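
The quota manager could enforce the priority order by reserving headroom for higher-priority tiers. Priority names mirror the plan's ordering; the reserve thresholds are illustrative assumptions:

```typescript
// Monthly quota shared across all sports desks, with priority-aware
// admission: lower-priority pulls must leave calls for closing odds.
type OddsPriority = "closing" | "early" | "midday";

const RESERVED: Record<OddsPriority, number> = {
  closing: 0,   // closing odds may spend down to zero remaining
  early: 100,   // early odds must leave 100 calls in reserve (assumed)
  midday: 250,  // mid-day refreshes must leave 250 calls (assumed)
};

export class OddsQuota {
  private used = 0;
  constructor(private readonly monthlyLimit = 500) {}

  /** Records the call and returns true if the priority's reserve allows it. */
  tryConsume(priority: OddsPriority, calls = 1): boolean {
    const remainingAfter = this.monthlyLimit - this.used - calls;
    if (remainingAfter < RESERVED[priority]) return false;
    this.used += calls;
    return true;
  }

  get remaining(): number {
    return this.monthlyLimit - this.used;
  }
}
```

In production the `used` counter would persist in SQLite so restarts don't reset the month's spend.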

0.7 — LLM Router for Pipeline

Extend src/llm/client.ts to support all pipeline models via OpenRouter:

Add JSON schema enforcement (Zod validation) for all analyst outputs. Add retry logic for Sonnet (known JSON reliability issues — see auditions doc). Add cost tracking per call (already exists in audit logger, wire it up).
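
A sketch of the schema-enforced call wrapper. `parse` is any validator that throws on bad input; Zod's `schema.parse` has exactly this shape, so desks can pass `AnalystOutputSchema.parse`. The retry count and the model-call signature are assumptions, not the existing client.ts API:

```typescript
// Call a model, JSON-parse and validate its output, and retry on failure.
export async function callWithSchema<T>(
  callModel: (prompt: string) => Promise<string>,
  prompt: string,
  parse: (raw: unknown) => T, // e.g. a Zod schema's .parse
  maxRetries = 2, // Sonnet's known JSON reliability issues motivate retries
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const raw = await callModel(prompt);
    try {
      return parse(JSON.parse(raw));
    } catch (err) {
      lastError = err; // malformed JSON or schema violation: retry
    }
  }
  throw new Error(`model output failed validation after retries: ${lastError}`);
}
```

Cost tracking would hook into `callModel` itself, since the audit logger already records per-call cost.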

0.8 — Pipeline File System

Create the folder structure defined in memory/research-pipeline-filesystem.md. Per-desk folders, researcher outputs separated (grok/sonar), analyst outputs separated (analyst-1/2/3).


PHASE 1: Data Collection — Sports (Prove the Architecture)

Build ONE desk end-to-end first. Sports (NHL/NBA/NCAAB) is the best candidate — it has existing data on Vultr, the most mature spec, and the most frequent runs (daily + live).

1.1 — Sports Data Collectors

Implement collectors from sports-desk-data-inventory.md:

1.2 — Research Module

The core pipeline runner. For a given desk:

  1. Call Grok (Researcher #1) with desk search strategy queries
  2. Call Sonar Pro (Researcher #2) with desk search strategy queries
  3. Save raw research to filesystem per research-pipeline-filesystem.md
  4. Call Data Validator (Flash Lite) to pre-filter
  5. Save validated research
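
The five steps above can be sketched as one runner. The researcher/validator call signatures and file paths are placeholders standing in for the real client.ts and research-pipeline-filesystem.md conventions:

```typescript
export interface ResearchRun {
  desk: string;
  grokRaw: string;
  sonarRaw: string;
  validated: string;
}

export async function runResearch(
  desk: string,
  queries: string[],
  callGrok: (q: string[]) => Promise<string>,   // Researcher #1
  callSonar: (q: string[]) => Promise<string>,  // Researcher #2
  validate: (raw: string) => Promise<string>,   // Flash Lite pre-filter
  save: (path: string, body: string) => Promise<void>,
): Promise<ResearchRun> {
  // The two researchers are independent searches, so run them in parallel.
  const [grokRaw, sonarRaw] = await Promise.all([
    callGrok(queries),
    callSonar(queries),
  ]);
  await save(`${desk}/grok/raw.md`, grokRaw);   // placeholder paths
  await save(`${desk}/sonar/raw.md`, sonarRaw);

  const validated = await validate(grokRaw + "\n\n" + sonarRaw);
  await save(`${desk}/validated.md`, validated);
  return { desk, grokRaw, sonarRaw, validated };
}
```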

1.3 — Analyst Module

Route validated research to 3 desk analysts:

  1. Look up desk's analyst assignments from config (from pipeline-jobs.md roster)
  2. Send research + desk prompt to each analyst model
  3. Enforce JSON schema output (Zod)
  4. Collect structured predictions with conviction scores
  5. Save analyst outputs
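
An illustrative shape for a structured prediction with a conviction score. Field names are assumptions; the real schema lives with the desk prompts. In the pipeline this would be a Zod schema (step 3); a hand-rolled guard is shown here to keep the sketch dependency-free:

```typescript
export interface AnalystPrediction {
  marketId: string;
  direction: "yes" | "no";
  probability: number; // analyst's probability estimate, 0..1
  conviction: number;  // conviction score, assumed 1..10 scale
  rationale: string;
}

export function isAnalystPrediction(v: unknown): v is AnalystPrediction {
  const o = v as Partial<AnalystPrediction>;
  return (
    typeof o?.marketId === "string" &&
    (o.direction === "yes" || o.direction === "no") &&
    typeof o.probability === "number" &&
    o.probability >= 0 && o.probability <= 1 &&
    typeof o.conviction === "number" &&
    o.conviction >= 1 && o.conviction <= 10 &&
    typeof o.rationale === "string"
  );
}
```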

1.4 — Judge Module (Gemini)

Send analyst outputs to both Gemini judges:

  1. Format Opus briefing spec (from research-pipeline.md)
  2. Send to Memory Gemini (@GGAnalystBot) — via Telegram relay OR direct API
  3. Send to Wiped Gemini (@GGWipedBot) — via Telegram relay OR direct API
  4. Collect grades, evidence folders, long shot folders
  5. Track Gemini disagreements

Decision needed: Are Gemini judges called via Telegram relay (Boss forwards prompts to bots in EDGE TEAM group) or direct Gemini API? The pipeline design says Boss relays. If direct API, use Cloudflare proxy.

1.5 — Judge Module (Opus)

Same pattern as Gemini judges:

  1. Send evidence folders to Wiped Opus (@CCWipedBot) and Memory Opus (@ClaudeAnalystBot)
  2. If they agree: done. Save verdict.
  3. If they disagree: send each other's reasoning for deliberation round
  4. If still disagree: alert Boss via Telegram
  5. Track Opus disagreements
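
The agree/deliberate/escalate loop in steps 2-4 can be sketched as follows; the judge call signatures are placeholders for the relay (or direct API), and the verdict vocabulary is an assumption:

```typescript
export type Verdict = "approve" | "reject";

export async function resolveOpusVerdict(
  askWiped: (extraContext?: string) => Promise<{ verdict: Verdict; reasoning: string }>,
  askMemory: (extraContext?: string) => Promise<{ verdict: Verdict; reasoning: string }>,
  alertBoss: (msg: string) => Promise<void>,
): Promise<Verdict | "escalated"> {
  // Round 1: independent verdicts.
  const [wiped, memory] = await Promise.all([askWiped(), askMemory()]);
  if (wiped.verdict === memory.verdict) return wiped.verdict;

  // Deliberation round: each judge sees the other's reasoning.
  const [wiped2, memory2] = await Promise.all([
    askWiped(memory.reasoning),
    askMemory(wiped.reasoning),
  ]);
  if (wiped2.verdict === memory2.verdict) return wiped2.verdict;

  // Still split: hand it to the Boss.
  await alertBoss("Opus judges still disagree after deliberation");
  return "escalated";
}
```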

Same relay question as Gemini judges.

1.6 — Settlement Module

  1. Cron: hourly for intraday, daily for swing, weekly for long-term
  2. Check outcomes via Kalshi API, Polymarket API, sports score APIs
  3. Calculate Brier scores
  4. Update analyst/Gemini/Opus leaderboards
  5. Archive settled predictions with full chain
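
Step 3 above is the standard Brier score for binary outcomes: the mean squared error between the forecast probability and the realized outcome (0 or 1). Lower is better; an always-0.5 forecaster scores 0.25:

```typescript
// Brier score over a batch of settled predictions.
export function brierScore(
  predictions: { forecast: number; outcome: 0 | 1 }[],
): number {
  if (predictions.length === 0) throw new Error("no settled predictions");
  const sum = predictions.reduce(
    (acc, p) => acc + (p.forecast - p.outcome) ** 2,
    0,
  );
  return sum / predictions.length;
}
```

Each leaderboard (analyst, Gemini, Opus) can be ranked by its rolling batch score.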

1.7 — Boss Notifications

Send morning briefing to EDGE TEAM Telegram group:


PHASE 2: Data Collection — Finance Desks

Once Sports is working end-to-end, clone the architecture to finance desks. These share a lot of infrastructure from Phase 0.

2.1 — Options Desk Collectors

From options-desk-data-inventory.md:

2.2 — Stocks Desk Collectors

From stocks-desk-data-inventory.md:

2.3 — Futures Desk Collectors

From futures-desk-data-inventory.md:

2.4 — Crypto Desk Collectors

From crypto-desk-data-inventory.md:

2.5 — Forex Desk Collectors

From forex-data-collection.md:

2.6 — Connect All Finance Desks to Pipeline

Wire each desk into the Research Module → Analyst → Judge → Settlement flow built in Phase 1.
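
One way to keep the Phase 1 flow generic is a small desk registry: each desk supplies its own collectors and prompts, and the shared pipeline looks desks up by name. The interface names are illustrative, not the real module API:

```typescript
export interface DeskConfig {
  name: string;                            // e.g. "options", "crypto"
  searchQueries: string[];                 // from the desk's search-strategies doc
  analystModels: [string, string, string]; // 3 analysts per desk
  collect: () => Promise<void>;            // desk-specific collectors
}

const desks = new Map<string, DeskConfig>();

export function registerDesk(cfg: DeskConfig): void {
  if (desks.has(cfg.name)) throw new Error(`desk already registered: ${cfg.name}`);
  desks.set(cfg.name, cfg);
}

export function getDesk(name: string): DeskConfig {
  const cfg = desks.get(name);
  if (!cfg) throw new Error(`unknown desk: ${name}`);
  return cfg;
}
```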


PHASE 3: Remaining Desks + Advanced Features

3.1 — Weather Desk

From weather-lock-in-analysis.md:

3.2 — Sports Sub-Desks

3.3 — Research-Only Desks

These don't need data collectors — they use research pipeline searches + other desks' data:

3.4 — Advanced Data Sources

3.5 — Code Pipeline Update

Update src/core/code-pipeline.ts to the new audition-winner flow:

3.6 — Self-Improving Prompts

Per pipeline design:


PHASE 4: Dashboard, Monitoring, Optimization

4.1 — Performance Dashboard

4.2 — Automated Alerts

4.3 — Weekly OpusGodBot Review

Wire up @OpusGodBot for weekly Sunday review:


DEPENDENCIES MAP

Phase 0 (Infrastructure)
  ├── 0.1 Database ──────────────────┐
  ├── 0.2 Prediction Market Adapter ─┤
  ├── 0.3 FRED Connector ────────────┤
  ├── 0.4 VIX Suite ─────────────────┤
  ├── 0.5 SEC EDGAR ─────────────────┤── All needed before any desk
  ├── 0.6 Odds Scraper ──────────────┤
  ├── 0.7 LLM Router ────────────────┤
  └── 0.8 File System ───────────────┘
                │
Phase 1 (Sports — prove it works)
  ├── 1.1 Sports Collectors ────────┐
  ├── 1.2 Research Module ──────────┤
  ├── 1.3 Analyst Module ───────────┤── Full end-to-end for 1 desk
  ├── 1.4 Gemini Judge Module ──────┤
  ├── 1.5 Opus Judge Module ────────┤
  ├── 1.6 Settlement Module ────────┤
  └── 1.7 Boss Notifications ───────┘
                │
Phase 2 (Finance desks — clone + customize)
  ├── 2.1-2.5 Desk-specific collectors
  └── 2.6 Wire into Phase 1 pipeline
                │
Phase 3 (Everything else)
  ├── 3.1-3.2 Weather + Sports sub-desks
  ├── 3.3 Research-only desks
  ├── 3.4 Advanced data sources
  ├── 3.5 Code pipeline update
  └── 3.6 Self-improving prompts
                │
Phase 4 (Polish)
  ├── 4.1 Dashboard
  ├── 4.2 Alerts
  └── 4.3 Weekly review

DEFERRED ITEMS (Come Back Later)

REFERENCE DOCS INDEX

| Doc | Location | What It Covers |
| --- | --- | --- |
| Pipeline design | memory/research-pipeline.md | Master blueprint |
| Team roster | memory/research-pipeline-jobs.md | Model assignments + budget |
| Auditions | memory/pipeline-auditions.md | Why each model was chosen |
| File system | memory/research-pipeline-filesystem.md | Folder structure |
| Research briefs | memory/research-briefs.md | Opus briefing format |
| Desk categories | memory/desk-categories.md | Evidence categories |
| Search strategies (main) | memory/research-search-strategies.md | Sports + 7 other desks |
| Search strategies (forex) | memory/research-search-strategies-forex.md | Forex queries |
| Search strategies (options) | memory/research-search-strategies-options.md | Options queries |
| Search strategies (stocks) | memory/research-search-strategies-stocks.md | Stocks queries |
| Search strategies (futures) | memory/research-search-strategies-futures.md | Futures queries |
| Search strategies (weather) | memory/research-search-strategies-weather.md | Weather queries |
| Search strategies (arbitrage) | memory/research-search-strategies-arbitrage.md | Arbitrage queries |
| Search strategies (AI/tools) | memory/research-search-strategies-ai-tools.md | AI tools queries |
| Search strategies (player props) | memory/research-search-strategies-player-props.md | Player props |
| Search strategies (DFS) | memory/research-search-strategies-dfs.md | DFS queries |
| Sports data spec | memory/sports-desk-data-inventory.md | NHL/NBA/NCAAB collection |
| Soccer data spec | memory/soccer-desk-data-inventory.md | Soccer collection |
| MLB data spec | memory/mlb-desk-data-inventory.md | Baseball collection |
| UFC data spec | memory/ufc-desk-data-inventory.md | MMA collection |
| Options data spec | memory/options-desk-data-inventory.md | Options collection |
| Stocks data spec | memory/stocks-desk-data-inventory.md | Stocks collection |
| Futures data spec | memory/futures-desk-data-inventory.md | Futures collection |
| Crypto data spec | memory/crypto-desk-data-inventory.md | Crypto collection |
| Forex data spec | memory/forex-data-collection.md | Forex monitoring |
| Weather data spec | memory/weather-lock-in-analysis.md | Weather strategy |
| Code pipeline | memory/pipeline-auditions.md (Code Pipeline Flow) | Flash→Flash→Opus |
| Chat brain restructure | memory/edgeclaw-chat-brain.md | Chat cost reduction |
| System upgrades | memory/research-upgrades.md | Post-build improvements |
Source: ~/.claude/projects/-home-ubuntu-edgeclaw/memory/build-plan.md