Crypto Desk Data Inventory — Merged Final Spec

Status: MERGED FINAL — Panel synthesis complete Base: Claude Opus 4.6 review (complete, unabridged) Panel additions: DeepSeek V3.1, Gemini 3.1 Pro, GPT-4.1, Grok 4-Fast, Grok 4.1, Sonar Reasoning Pro (cherry-picked unique contributions, each tagged) Date: 2026-03-15 Note: FTX references removed (collapsed 2022, no longer operational). All sources are free-tier or self-computed unless noted.


Purpose

This is a data collection spec for the Crypto Desk pipeline. It defines every data source to collect, at what frequency, via which API, stored in which SQLite table, and why it matters for pricing Kalshi and Polymarket crypto contracts. This document is the ground truth for the coding bot building the collector.


Overall Assessment (Opus Panel Lead)

Grade: A- → A+ with these fixes applied

"Exceptionally thorough spec that covers 95% of what matters for crypto prediction markets. The main gaps are in options-implied probability methodology, halving cycle awareness, and some fragile data source assumptions. The biggest risk is operational: 2,150+ API calls/day across 20+ free-tier sources requires robust retry/fallback logic that will consume more engineering time than the data collection itself."


SQLite Database Setup

Database File

/home/ubuntu/edgeclaw/data/db/crypto-desk.db

CRITICAL: Enable WAL Mode on First Run

[PANEL: Gemini 3.1 Pro] SQLite in default journal mode will produce SQLITE_BUSY crashes when multiple async workers write every 2–5 minutes. WAL mode is mandatory.

PRAGMA journal_mode=WAL;
PRAGMA synchronous=NORMAL;
PRAGMA cache_size=-64000;   -- 64MB cache
PRAGMA temp_store=MEMORY;
PRAGMA mmap_size=268435456; -- 256MB mmap

Run once at startup and verify journal_mode returns wal.


Section 1: Spot Price Data (Top 50 Coins)

Cadence: Top 10 coins every 5 min; coins 11–50 every 30 min Source: CoinGecko /coins/markets (single batch call per_page=50) Coins: BTC, ETH, SOL, XRP, BNB, DOGE, ADA, AVAX, DOT, LINK (top 10 always); remainder by market cap Fallback: CryptoCompare https://min-api.cryptocompare.com/data/pricemultifull?fsyms=BTC,ETH,SOL&tsyms=USD

Rationale: Core price feed for all composite signals and prediction market comparisons.

CREATE TABLE IF NOT EXISTS spot_prices (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  coin_id TEXT NOT NULL,
  symbol TEXT NOT NULL,
  price_usd REAL NOT NULL,
  market_cap_usd REAL,
  volume_24h_usd REAL,
  price_change_pct_1h REAL,
  price_change_pct_24h REAL,
  price_change_pct_7d REAL,
  circulating_supply REAL,
  total_supply REAL,
  ath_usd REAL,
  ath_change_pct REAL,
  source TEXT NOT NULL DEFAULT 'coingecko',
  is_stale INTEGER NOT NULL DEFAULT 0
);
CREATE INDEX IF NOT EXISTS idx_spot_prices_coin_ts ON spot_prices(coin_id, timestamp DESC);

CoinGecko Rate Management (Opus aplus fix):


Section 2: OHLCV Historical Data

Cadence: Hourly for BTC, ETH, SOL; daily for remaining top 10 Source: CoinGecko /coins/{id}/ohlc

Rationale: Used to calculate realized volatility (RV) windows. Raw close prices are the input to Composite Signal 11 (Vol Risk Premium).

CREATE TABLE IF NOT EXISTS ohlcv (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  coin_id TEXT NOT NULL,
  open REAL NOT NULL,
  high REAL NOT NULL,
  low REAL NOT NULL,
  close REAL NOT NULL,
  volume REAL
);
CREATE INDEX IF NOT EXISTS idx_ohlcv_coin_ts ON ohlcv(coin_id, timestamp DESC);

Section 3: Global Market Data

Cadence: Every 15 min Source: CoinGecko /global

Fields: total_market_cap_usd, total_volume_usd, btc_dominance_pct, eth_dominance_pct, defi_volume_24h, defi_market_cap, stablecoin_volume_24h, active_cryptocurrencies

CREATE TABLE IF NOT EXISTS global_market (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  total_market_cap_usd REAL,
  total_volume_24h_usd REAL,
  btc_dominance_pct REAL,
  eth_dominance_pct REAL,
  defi_volume_24h_usd REAL,
  defi_market_cap_usd REAL,
  stablecoin_volume_24h_usd REAL,
  active_cryptocurrencies INTEGER
);

Section 4: Trending Coins

Cadence: Every 2 hours Source: CoinGecko /search/trending

CREATE TABLE IF NOT EXISTS trending_coins (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  rank INTEGER NOT NULL,
  coin_id TEXT NOT NULL,
  symbol TEXT NOT NULL,
  price_btc REAL,
  score REAL
);

Section 5: Fear & Greed Index

Cadence: Every 6 hours Source: https://api.alternative.me/fng/

CREATE TABLE IF NOT EXISTS fear_greed (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  value INTEGER NOT NULL,
  classification TEXT NOT NULL,
  next_update INTEGER
);

Classifications: Extreme Fear (0–24), Fear (25–44), Neutral (45–55), Greed (56–74), Extreme Greed (75–100).


Section 6: Deribit Options Chain

Cadence: Every 15 min Source: Deribit public API (no auth required)

Endpoints:

Fields per instrument: instrument_name, strike, expiry, type (call/put), mark_price, bid, ask, mark_iv, underlying_price, oi (open_interest), volume, delta, gamma, vega, theta, rho

CREATE TABLE IF NOT EXISTS deribit_options (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  currency TEXT NOT NULL,
  instrument_name TEXT NOT NULL,
  strike REAL NOT NULL,
  expiry INTEGER NOT NULL,
  option_type TEXT NOT NULL,
  mark_price REAL,
  bid REAL,
  ask REAL,
  mark_iv REAL,
  underlying_price REAL,
  open_interest REAL,
  volume REAL,
  delta REAL,
  gamma REAL,
  vega REAL,
  theta REAL,
  rho REAL
);
CREATE INDEX IF NOT EXISTS idx_deribit_opts_ts ON deribit_options(currency, timestamp DESC);
CREATE INDEX IF NOT EXISTS idx_deribit_opts_strike ON deribit_options(currency, strike, expiry);

Delta-OI Change Tracking (Opus aplus: MUST-HAVE):

CREATE TABLE IF NOT EXISTS deribit_oi_changes (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  currency TEXT NOT NULL,
  instrument_name TEXT NOT NULL,
  strike REAL NOT NULL,
  expiry INTEGER NOT NULL,
  option_type TEXT NOT NULL,
  oi_prev REAL,
  oi_now REAL,
  delta_oi REAL,
  volume_prev REAL,
  volume_now REAL,
  delta_volume REAL,
  volume_oi_ratio REAL,
  fresh_positioning INTEGER DEFAULT 0
);
CREATE INDEX IF NOT EXISTS idx_deribit_oi_changes_ts ON deribit_oi_changes(currency, timestamp DESC);

On each pull: compare current OI to previous snapshot. If delta_oi > 500 contracts in a 15-min window, set fresh_positioning=1. Feed fresh-only OI into GEX calculations.

Storage: ~58 MB/day raw. Aggregation schedule:


Section 7: Deribit DVOL Index

Cadence: Every 15 min Source: https://www.deribit.com/api/v2/public/get_volatility_index_data?currency=BTC&start_timestamp={ts}&end_timestamp={ts}

CREATE TABLE IF NOT EXISTS dvol (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  currency TEXT NOT NULL,
  dvol REAL NOT NULL,
  open REAL,
  high REAL,
  low REAL
);

Section 8: Deribit Futures (Basis and Term Structure)

Cadence: Every 15 min Source: Deribit get_instruments?kind=future + get_ticker

CREATE TABLE IF NOT EXISTS deribit_futures (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  currency TEXT NOT NULL,
  instrument_name TEXT NOT NULL,
  expiry INTEGER,
  is_perpetual INTEGER NOT NULL DEFAULT 0,
  mark_price REAL,
  index_price REAL,
  basis_usd REAL,
  basis_pct REAL,
  open_interest REAL,
  volume_24h REAL,
  funding_rate REAL,
  days_to_expiry REAL
);
CREATE INDEX IF NOT EXISTS idx_deribit_futures_ts ON deribit_futures(currency, timestamp DESC);

Perpetual Futures Basis Term Structure [PANEL: Sonar Reasoning Pro]: Calculate and store the term structure — 3-day, 7-day, 30-day futures vs spot. Persistent positive term structure (contango) is bearish signal (traders paying premium to stay long). Formula: annualized_basis = (futures_price - spot) / spot * 365 / days_to_expiry.


Section 9: Multi-Exchange Perpetual Funding Rates

Cadence: Every 1 hour (or at settlement for each exchange) Sources:

Settlement cadence: Binance 8h, OKX 8h, Bybit 1h, dYdX variable.

CREATE TABLE IF NOT EXISTS funding_rates (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  exchange TEXT NOT NULL,
  symbol TEXT NOT NULL,
  funding_rate REAL NOT NULL,
  next_funding_time INTEGER,
  mark_price REAL,
  index_price REAL
);
CREATE INDEX IF NOT EXISTS idx_funding_ts ON funding_rates(exchange, symbol, timestamp DESC);

Key signal [PANEL: Sonar Reasoning Pro]: Do NOT average funding rates across exchanges — the divergence is the signal. Track: which exchange is most overlevered (highest funding)? Is the gap persisting or closing? Calculate funding arbitrage spread: cost to carry long on low-funding exchange, short on high-funding exchange.


Section 10: Liquidation Data

Cadence: Every 5 min Primary method (Opus aplus fix — Coinglass free API is unreliable):

Derive liquidation estimates from OI drops across exchanges:

  1. Poll OI from Binance fapi/v1/openInterest, Bybit v5/market/tickers, OKX v5/public/open-interest every 5 min
  2. If OI drops >2% in a 5-min window with no corresponding spot volume increase >5% → classify as liquidation event
  3. Use long/short ratio (Binance fapi/v1/globalLongShortAccountRatio) to determine which side
CREATE TABLE IF NOT EXISTS liquidation_estimates (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  exchange TEXT NOT NULL,
  symbol TEXT NOT NULL,
  oi_before REAL,
  oi_after REAL,
  oi_drop_pct REAL,
  estimated_liq_usd REAL,
  side TEXT,
  confidence TEXT,
  source TEXT DEFAULT 'derived'
);

Alert logic [PANEL: Sonar Reasoning Pro]: Track direction, not just magnitude. $300M long liquidations = bearish (forced selling). $300M short liquidations = BULLISH (forced buying, bounce setup). Alert thresholds:

Secondary: scrape Coinglass web dashboard (coinglass.com/LiquidationData) every 15 min as validation only.


Section 11: DIY Gamma Exposure (GEX)

Cadence: Calculated every 15 min from Section 6 data Note [PANEL: Gemini 3.1 Pro]: Deribit options are inverse contracts (settled in BTC/ETH, not USD). Standard equity GEX math needs adjustment for non-linear inverse payoffs.

Corrected GEX formula (Opus aplus): For each Deribit instrument with Greeks:

GEX_per_instrument = OI * gamma * spot_price^2 * contract_multiplier * 0.01

Dealer assumption heuristic:

DTE weighting: multiply GEX by (30/DTE) for DTE < 30 (near-expiry gamma dominates).

Net GEX = sum of call GEX (positive) minus put GEX (negative), adjusted for dealer split.

CREATE TABLE IF NOT EXISTS crypto_gex (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  currency TEXT NOT NULL,
  total_gex REAL,
  gamma_flip_price REAL,
  max_gamma_strike REAL,
  dealer_regime TEXT,
  top_5_gamma_walls TEXT,
  fresh_oi_gex REAL,
  total_oi_gex REAL
);

Section 12: Exchange Flows / Whale Alert / DIY Nansen

Cadence: Every 10 min via Whale Alert; continuous via Etherscan/Tronscan for known wallets Source: Whale Alert https://api.whale-alert.io/v1/transactions?api_key={key}&min_value=1000000&start={since_ts}

Whale Alert fix (Opus aplus): Undocumented daily cap ~200 calls. At 10-min intervals = 144 calls/day (safe). Use start timestamp to only fetch new transactions since last poll. Deduplicate by tx_hash.

CREATE TABLE IF NOT EXISTS whale_transactions (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  tx_hash TEXT UNIQUE NOT NULL,
  blockchain TEXT,
  symbol TEXT,
  amount REAL,
  amount_usd REAL,
  from_address TEXT,
  from_label TEXT,
  to_address TEXT,
  to_label TEXT,
  transaction_type TEXT,
  source TEXT DEFAULT 'whale_alert'
);
CREATE INDEX IF NOT EXISTS idx_whale_ts ON whale_transactions(timestamp DESC);

Exchange Reserve Outflow Patterns [PANEL: Sonar Reasoning Pro]: Calculate 7-day rolling net flows per exchange — the pattern matters more than snapshots. Net inflows 7 days = bullish. Net outflows 7 days = bearish structural drain.

CREATE TABLE IF NOT EXISTS exchange_flow_summary (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  exchange TEXT NOT NULL,
  symbol TEXT NOT NULL,
  net_flow_7d_usd REAL,
  net_flow_30d_usd REAL,
  inflow_24h_usd REAL,
  outflow_24h_usd REAL,
  trend TEXT
);

Retail vs Institutional Volume Breakdown [PANEL: Grok 4.1 (aplus)]: Use Binance recent trades endpoint to categorize:

Track daily retail/institutional volume ratio. Retail spikes often mark local tops.


Section 13: Dormant Wallet / Known Address Monitoring

Cadence: Every 30 min via Etherscan/Blockchain.com (not BitcoinTreasuries.net scraping)

Critical fix [PANEL: Sonar Reasoning Pro]: BitcoinTreasuries.net has no API and has gone down for days at a time. Monitor Mt. Gox and government wallets directly via blockchain APIs — those wallet addresses are public and known.

CREATE TABLE IF NOT EXISTS watched_wallets (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  address TEXT NOT NULL,
  label TEXT NOT NULL,
  blockchain TEXT NOT NULL,
  category TEXT,
  notes TEXT,
  last_active_ts INTEGER,
  balance_btc REAL,
  balance_eth REAL
);

CREATE TABLE IF NOT EXISTS wallet_movements (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  address TEXT NOT NULL,
  label TEXT NOT NULL,
  tx_hash TEXT NOT NULL,
  direction TEXT NOT NULL,
  amount_native REAL,
  amount_usd REAL,
  counterparty_label TEXT,
  alert_level TEXT
);

Priority watched wallets:


Section 14: Bitcoin Network Statistics

Cadence: Every 30 min Source: https://api.blockchain.info/stats and https://mempool.space/api/v1/mining/hashrate/3d

CREATE TABLE IF NOT EXISTS btc_network_stats (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  hash_rate_eh REAL,
  difficulty REAL,
  block_height INTEGER,
  mempool_size_mb REAL,
  mempool_tx_count INTEGER,
  avg_fee_sat_per_vbyte REAL,
  blocks_since_last_halving INTEGER,
  days_since_last_halving REAL,
  pct_through_cycle REAL,
  cycle_phase TEXT,
  next_halving_est_date TEXT,
  current_block_reward REAL,
  miners_revenue_usd REAL,
  cumulative_miner_revenue_usd REAL
);

Bitcoin Halving Cycle Position Tracker (Opus aplus — MUST-HAVE):

Calculate from block height. Last halving block = 840,000 (April 2024). Next halving block = 1,050,000. Blocks per cycle = 210,000.

blocks_since_halving = current_block - 840000
pct_through_cycle = blocks_since_halving / 210000 * 100

Cycle phases:

This phase should modulate every composite signal. Bullish signals in EARLY_BULL: higher confidence. Bearish signals in DISTRIBUTION: higher confidence.


Section 15: Ethereum Network Statistics

Cadence: Every 30 min Source: Etherscan API https://api.etherscan.io/api?module=gastracker&action=gasoracle

CREATE TABLE IF NOT EXISTS eth_network_stats (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  gas_price_gwei_slow REAL,
  gas_price_gwei_standard REAL,
  gas_price_gwei_fast REAL,
  eth_price_usd REAL,
  pending_tx_count INTEGER,
  suggested_base_fee REAL
);

[PANEL: Gemini 3.1 Pro] Note: Ethereum L1 gas is no longer the primary proxy for network demand — retail and DEX volume have migrated to Solana and L2s. Track L1 gas for DeFi protocol cost signals, not as a retail activity proxy. See Section 44b for Solana/L2 metrics.


Section 16: Stablecoin Mint/Burn Events (Real-Time Detection)

Cadence: Every 10 min via Whale Alert filter; Etherscan/Tronscan event monitoring for known treasury addresses

Critical addition (Opus aplus — MUST-HAVE): Supply snapshots (Section 18) are collected every 2 hours. Discrete mint/burn events are detectable 30–60 minutes earlier. These events have documented 2–6 hour lead time on BTC price moves.

Filter Whale Alert for: from_label OR to_label containing "Tether Treasury", "Circle", or known USDC/USDT treasury addresses.

Etherscan: monitor USDC proxy contract 0xa0b86991c6218b36c1d19d4a2e9eb0ce3606eb48 for Transfer events from/to address(0) (mints/burns).

[PANEL: DeepSeek V3.1] Tronscan: monitor USDT contract TR7NHqjeKQxGTCi8q8ZY4pL8otSzgjLj6t for similar mint/burn events. Redemption activity (USDT → treasury) precedes supply changes by 6–24 hours.

CREATE TABLE IF NOT EXISTS stablecoin_events (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  stablecoin TEXT NOT NULL,
  event_type TEXT NOT NULL,
  amount_usd REAL NOT NULL,
  chain TEXT NOT NULL,
  tx_hash TEXT UNIQUE,
  source_label TEXT,
  dest_label TEXT,
  alert_level TEXT
);

Alert thresholds:

Stablecoin Mint/Burn Velocity [PANEL: Sonar Reasoning Pro]: Track not just events but velocity — mints/burns per hour. Supply can swing $5B/day; the SPEED determines whether it's institutional redemption (bearish) or normal activity (neutral).

CREATE TABLE IF NOT EXISTS stablecoin_velocity (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  stablecoin TEXT NOT NULL,
  chain TEXT NOT NULL,
  mints_1h_usd REAL DEFAULT 0,
  burns_1h_usd REAL DEFAULT 0,
  net_flow_1h_usd REAL DEFAULT 0,
  mints_6h_usd REAL DEFAULT 0,
  burns_6h_usd REAL DEFAULT 0,
  velocity_trend TEXT
);

Section 17: Stablecoin Peg Monitoring

Cadence: Every 30 min default; escalate to 2-min intervals if deviation >0.3% Sources: CoinGecko, Binance, Kraken, DefiLlama

[PANEL: Grok 4.1 (aplus)]: Dynamic escalation logic — normal periods: every 15 min. Depeg detected (>0.3%): switch to 30-second intervals. Store only significant changes >0.1%.

[PANEL: Sonar Reasoning Pro]: In the March 2023 USDC depeg, 5-minute HTTP polling detected it 60 minutes in. Kraken's order book showed it 10 minutes earlier via large USDC sells at discount. The real fix is DEX pool monitoring + websocket listeners, not faster polling.

Alert thresholds [PANEL: Grok 4-Fast]: Use rolling 7-day standard deviation per stablecoin/chain. Alert at 2–3 sigma deviations. Chain-specific baselines: Polygon USDC tolerance ~0.3%, Ethereum USDC ~0.1%.

CREATE TABLE IF NOT EXISTS stablecoin_pegs (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  stablecoin TEXT NOT NULL,
  chain TEXT NOT NULL,
  price_usd REAL NOT NULL,
  deviation_pct REAL NOT NULL,
  rolling_7d_std REAL,
  sigma_deviation REAL,
  alert_level TEXT,
  source TEXT
);
CREATE INDEX IF NOT EXISTS idx_pegs_stablecoin_ts ON stablecoin_pegs(stablecoin, chain, timestamp DESC);

Section 18: Stablecoin Supply Tracking

Cadence: Every 2 hours Source: DefiLlama https://stablecoins.llama.fi/stablecoins and CoinGecko per-coin data

CREATE TABLE IF NOT EXISTS stablecoin_supply (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  stablecoin TEXT NOT NULL,
  chain TEXT,
  supply_usd REAL NOT NULL,
  supply_change_24h_usd REAL,
  supply_change_7d_usd REAL
);

Track USDT-to-USDC rotation: if USDT share of total stablecoin supply drops >2% in 7 days while USDC rises, signal risk-off rotation.


Section 19: USDT-USDC Supply Rotation Signal

Cadence: Calculated daily from Section 18 data

CREATE TABLE IF NOT EXISTS stablecoin_rotation (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  usdt_supply_usd REAL,
  usdc_supply_usd REAL,
  usdt_share_pct REAL,
  usdc_share_pct REAL,
  usdt_share_7d_change REAL,
  rotation_signal TEXT,
  stablecoin_ratio REAL
);

Section 20: Curve Pool Imbalance Monitoring

Cadence: Every 2 hours Critical fix (Opus aplus): Monitor top 5 Curve stablecoin pools by current TVL, NOT just the legacy 3pool which has lost 80%+ of TVL since 2023.

Source: https://api.llama.fi/protocol/curve-dex — filter pools containing USDT, USDC, DAI, FRAX, crvUSD. Sort by TVL. Monitor top 5.

Alert threshold fix [PANEL: Sonar Reasoning Pro]: By the time a pool reaches 60% imbalance (the old threshold), arbitrage is over and price has depegged. The signal is ACCELERATION of imbalance:

CREATE TABLE IF NOT EXISTS curve_pool_balance (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  pool_name TEXT NOT NULL,
  pool_address TEXT,
  tvl_usd REAL,
  token_shares_json TEXT,
  max_imbalance_ratio REAL,
  imbalanced_token TEXT,
  share_change_1h REAL,
  alert_level TEXT
);

Section 21: Stablecoin Market Share Summary

Cadence: Daily Calculated from: Sections 18 and 19 data


Section 22: DeFi TVL by Protocol and Chain

Cadence: Every 4 hours Source: DefiLlama https://api.llama.fi/protocols and https://api.llama.fi/v2/chains

CREATE TABLE IF NOT EXISTS defi_tvl (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  protocol TEXT,
  chain TEXT,
  tvl_usd REAL NOT NULL,
  tvl_change_1d_pct REAL,
  tvl_change_7d_pct REAL,
  category TEXT,
  data_scope TEXT
);

Solana and L2 Network Metrics [PANEL: Gemini 3.1 Pro — MUST-HAVE]:

ETH L1 gas is no longer the primary proxy for retail activity. Retail and DEX volume have migrated to Solana and L2s.

CREATE TABLE IF NOT EXISTS l2_network_metrics (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  chain TEXT NOT NULL,
  tvl_usd REAL,
  tps REAL,
  active_addresses_24h INTEGER,
  dex_volume_24h_usd REAL,
  fees_24h_usd REAL,
  data_source TEXT DEFAULT 'defillama'
);

Track: Base (Coinbase L2), Arbitrum, Optimism, Solana via https://api.llama.fi/v2/chains + L2Beat API https://l2beat.com/api/ for granular activity data.


Section 23: DEX Volume and Activity

Cadence: Every 4 hours Source: DefiLlama https://api.llama.fi/overview/dexs

CREATE TABLE IF NOT EXISTS dex_volumes (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  protocol TEXT NOT NULL,
  chain TEXT,
  volume_24h_usd REAL,
  volume_7d_usd REAL,
  volume_change_1d_pct REAL,
  fees_24h_usd REAL
);

Memecoin / Pump.fun Revenue [PANEL: Gemini 3.1 Pro — NICE-TO-HAVE]: Source: https://api.llama.fi/summary/fees/pump-fun (daily). When Pump.fun daily revenue spikes, retail euphoria is at extreme — cycle top warning signal.

CREATE TABLE IF NOT EXISTS memecoin_activity (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  protocol TEXT NOT NULL,
  daily_revenue_usd REAL,
  daily_fees_usd REAL,
  alert_level TEXT
);

DEX vs CEX Price Divergence [PANEL: Sonar Reasoning Pro]: Track BTC/ETH prices from Uniswap v3/v4 weighted averages vs CEX prices. CEX-DEX divergence reveals whether retail (DEX) or institutions (CEX) are driving moves. When retail is ahead of institutions: contrarian signal.

Source: Uniswap subgraph (free, 100K queries/month).


Section 24: DeFi Lending Rates

Cadence: Every 4 hours Source: DefiLlama https://yields.llama.fi/pools

Simplification [PANEL: Sonar Reasoning Pro]: The full endpoint returns 14,038 pools across 522 protocols. Collecting all of them is scope creep. Monitor top 5 by TVL: Aave, Curve, Lido, Compound, Balancer. These cover 70%+ of DeFi lending.

CREATE TABLE IF NOT EXISTS defi_lending_rates (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  protocol TEXT NOT NULL,
  chain TEXT,
  asset TEXT NOT NULL,
  apy REAL,
  tvl_usd REAL,
  pool_id TEXT
);

Section 25: Staking Yields

Cadence: Daily Source: DefiLlama yields API for ETH staking; Lido, Rocket Pool APIs

CREATE TABLE IF NOT EXISTS staking_yields (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  asset TEXT NOT NULL,
  protocol TEXT NOT NULL,
  apy REAL,
  total_staked_usd REAL
);

Section 26: BTC ETF Flow Tracker

Cadence: Daily at 6:00 PM ET (Farside posts after US market close) Source: Farside Investors https://farside.co.uk/btc.html

Farside scraping fix [PANEL: Gemini 3.1 Pro]: Farside sits behind Cloudflare. Standard fetch/axios gets blocked. Use tls-client library or Puppeteer to bypass bot protection.

CREATE TABLE IF NOT EXISTS etf_flows_btc (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  date TEXT NOT NULL,
  collected_ts INTEGER NOT NULL,
  ibit_flow_usd REAL,
  fbtc_flow_usd REAL,
  arkb_flow_usd REAL,
  bitb_flow_usd REAL,
  brrr_flow_usd REAL,
  gbtc_flow_usd REAL,
  ezbc_flow_usd REAL,
  hodl_flow_usd REAL,
  btcw_flow_usd REAL,
  btco_flow_usd REAL,
  total_flow_usd REAL,
  consecutive_positive_days INTEGER,
  consecutive_negative_days INTEGER
);

GBTC/ETHE Premium/Discount to NAV (Opus aplus): Source: Yahoo Finance (cold backup only) or calculate directly. Formula: gbtc_premium_pct = (gbtc_share_price / (btc_price * btc_per_share)) - 1 BTC per GBTC share = 0.00089011 (verify quarterly at Grayscale website).

CREATE TABLE IF NOT EXISTS etf_premium_discount (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  fund TEXT NOT NULL,
  share_price REAL,
  nav_per_share REAL,
  premium_pct REAL,
  alert_level TEXT
);

Alert thresholds: premium >+2% = MEDIUM (retail FOMO). Discount <-5% = HIGH (forced selling or capitulation).


Section 27: ETH ETF Flow Tracker

Cadence: Daily at 6:00 PM ET Source: Farside Investors https://farside.co.uk/eth.html

CREATE TABLE IF NOT EXISTS etf_flows_eth (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  date TEXT NOT NULL,
  collected_ts INTEGER NOT NULL,
  etha_flow_usd REAL,
  feth_flow_usd REAL,
  ethw_flow_usd REAL,
  ceth_flow_usd REAL,
  ezet_flow_usd REAL,
  ethe_flow_usd REAL,
  eth_flow_usd REAL,
  qeth_flow_usd REAL,
  total_flow_usd REAL
);

Section 28: Kalshi Crypto Contracts

Cadence: Every 5 min Source: Kalshi REST API https://api.elections.kalshi.com/trade-api/v2/markets?category=crypto

CREATE TABLE IF NOT EXISTS kalshi_markets (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  ticker TEXT NOT NULL,
  title TEXT,
  category TEXT,
  status TEXT,
  yes_bid REAL,
  yes_ask REAL,
  no_bid REAL,
  no_ask REAL,
  last_price REAL,
  volume_usd REAL,
  open_interest_usd REAL,
  close_time INTEGER,
  hours_to_expiry REAL,
  result TEXT
);
CREATE INDEX IF NOT EXISTS idx_kalshi_ticker_ts ON kalshi_markets(ticker, timestamp DESC);

Kalshi vs Polymarket Divergence Tracking [PANEL: Sonar Reasoning Pro]: When Kalshi prices the same event at 35% and Polymarket at 42%, one is wrong. Calculate |kalshi_price - polymarket_price| for matching contracts. Divergence >5% = HIGH (cross-venue arbitrage opportunity).


Section 29: Polymarket Crypto Contracts

Cadence: Every 5 min Source: Polymarket CLOB API https://clob.polymarket.com/markets + The Graph for on-chain data

CREATE TABLE IF NOT EXISTS polymarket_markets (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  condition_id TEXT NOT NULL,
  question TEXT,
  category TEXT,
  yes_price REAL,
  no_price REAL,
  volume_usd REAL,
  liquidity_usd REAL,
  close_time INTEGER,
  hours_to_expiry REAL,
  resolved INTEGER DEFAULT 0,
  resolution TEXT
);

Polymarket Order Book and Depth [PANEL: Sonar Reasoning Pro — CRITICAL GAP]: The CLOB API exposes /orderbook endpoint (free). Without order book data, you cannot detect manipulation, execution slippage, or real arbitrage. This is 40%+ of available predictive signal from Polymarket.

CREATE TABLE IF NOT EXISTS polymarket_orderbooks (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  condition_id TEXT NOT NULL,
  bid_levels_json TEXT,
  ask_levels_json TEXT,
  best_bid REAL,
  best_ask REAL,
  spread REAL,
  bid_depth_usd REAL,
  ask_depth_usd REAL,
  order_imbalance REAL
);

Section 30: Polymarket On-Chain Intelligence

Cadence: Real-time event monitoring Source: Polygon event logs for UMA oracle contract; CTF Exchange contract

UMA Oracle Dispute Monitoring [PANEL: Grok 4-Fast — MUST-HAVE; PANEL: Sonar Reasoning Pro — unconventional insight]:

When a Polymarket market is proposed for resolution, there is a 2-hour challenge window during which the outcome is KNOWN (unless disputed) but the market is still tradeable. Monitor UMA oracle proposals in real-time: https://thegraph.com/hosted-service/subgraph/uma/optimistic-oracle or PolygonScan event logs filtered for Polymarket contract addresses.

This captures 5–15% of potential Polymarket profit from oracle resolution arbitrage. Elevate to MUST-HAVE priority.

CREATE TABLE IF NOT EXISTS uma_oracle_events (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  event_type TEXT NOT NULL,
  condition_id TEXT,
  proposed_price REAL,
  dispute_deadline INTEGER,
  minutes_to_deadline REAL,
  is_disputed INTEGER DEFAULT 0,
  resolution TEXT
);

Polygon Network Health (part of Section 30):

CREATE TABLE IF NOT EXISTS polygon_health (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  gas_price_gwei REAL,
  pending_tx_count INTEGER,
  block_time_seconds REAL,
  tps REAL,
  usdc_on_polygon_usd REAL,
  polymarket_liquidity_score REAL,
  alert_level TEXT
);

Section 31: Binary Probability Calculator

Cadence: Calculated every 15 min from Sections 6, 7, 28, 29 data Critical fix (Opus aplus): Do not use flat ATM IV via N(d2). This systematically misprices OTM contracts because crypto has extreme fat tails.

Corrected methodology:

  1. For each prediction market strike K and expiry T, find the two Deribit options bracketing K
  2. Interpolate IV at strike K from the vol smile: IV_K = IV_K1 + (IV_K2 - IV_K1) * (K - K1) / (K2 - K1)
  3. Calculate N(d2) using IV_K (not ATM IV)
  4. Apply historical tail adjustment: compute empirical frequency that BTC moved more than (K/spot - 1)% over the same number of days from 4 years of OHLCV data
  5. Final probability = 0.6 * skew_adjusted_N(d2) + 0.4 * historical_frequency
  6. Compare to prediction market price. Edge = final_probability - market_price

[PANEL: Sonar Reasoning Pro]: Alternatively, use Deribit's implied distribution directly (CDF backed out from the vol smile via put spread slopes) instead of Black-Scholes. More accurate for tail risk.

Kalshi Spot-to-Future Spread [PANEL: Sonar Reasoning Pro — MUST-HAVE]: This is the core arbitrage. Calculate the explicit spread between what Kalshi/Polymarket prices and what traditional crypto derivatives imply. ~50 lines of code on top of existing data.

CREATE TABLE IF NOT EXISTS probability_bridge (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  venue TEXT NOT NULL,
  contract_ticker TEXT NOT NULL,
  strike REAL,
  expiry INTEGER,
  spot_price REAL,
  iv_at_strike REAL,
  bs_probability REAL,
  historical_frequency REAL,
  blended_probability REAL,
  market_price REAL,
  edge_pct REAL,
  vol_risk_premium REAL,
  alert_level TEXT
);
CREATE INDEX IF NOT EXISTS idx_prob_bridge_ts ON probability_bridge(venue, timestamp DESC);

Prediction Market Contract Expiry Calendar (Opus aplus):

CREATE TABLE IF NOT EXISTS contract_calendar (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  updated_ts INTEGER NOT NULL,
  venue TEXT NOT NULL,
  ticker TEXT NOT NULL,
  question TEXT,
  close_time INTEGER NOT NULL,
  hours_to_expiry REAL NOT NULL,
  strike_value REAL,
  current_price REAL,
  distance_to_strike_pct REAL,
  implied_probability REAL,
  oi_usd REAL,
  liquidity_score REAL,
  priority_rank REAL
);

Priority rank = (1/hours_to_expiry) * oi_usd * (1 - ABS(distance_to_strike_pct)). Contracts within 48 hours of expiry AND within 10% of strike: 3x priority boost. Generate daily "focus list" of top 10 contracts.


Section 32: Macro Correlations (BTC vs Equities, Gold, DXY)

Cadence: Daily Sources: FRED API (primary), Alpha Vantage free tier (secondary), Yahoo Finance (cold backup only)

Source fix (Opus aplus): Yahoo Finance uses an unofficial endpoint that breaks without notice. Replace with:

FRED API: https://fred.stlouisfed.org/graph/fredgraph.csv?id={series}

CREATE TABLE IF NOT EXISTS macro_correlations (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  btc_price REAL,
  nasdaq_price REAL,
  gold_price REAL,
  dxy_index REAL,
  vix REAL,
  treasury_10y REAL,
  treasury_2y REAL,
  yield_spread_10y_2y REAL,
  btc_nasdaq_corr_30d REAL,
  btc_gold_corr_30d REAL,
  btc_dxy_corr_30d REAL,
  btc_vix_corr_30d REAL,
  macro_regime TEXT
);

CFTC Commitments of Traders (COT) [PANEL: Grok 4-Fast — MUST-HAVE]: COT reports show institutional positioning in regulated BTC futures (CME). Leads spot moves by 1–3 days. Source: CFTC free API https://www.cftc.gov/dea/futures/deacbtcbk.htm or weekly CSV download.

CREATE TABLE IF NOT EXISTS cftc_cot (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  report_date TEXT NOT NULL,
  collected_ts INTEGER NOT NULL,
  noncommercial_long INTEGER,
  noncommercial_short INTEGER,
  noncommercial_net INTEGER,
  commercial_long INTEGER,
  commercial_short INTEGER,
  commercial_net INTEGER,
  open_interest_total INTEGER,
  net_change_noncommercial INTEGER
);

Section 33: Exchange Health Monitoring

Cadence: Every 15 min Sources: Exchange status pages; CoinGecko exchange tokens (BNB, OKB); exchange /ping endpoints

CREATE TABLE IF NOT EXISTS exchange_health (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  exchange TEXT NOT NULL,
  status TEXT,
  response_time_ms INTEGER,
  exchange_token_symbol TEXT,
  exchange_token_price_usd REAL,
  exchange_token_change_pct REAL,
  alert_level TEXT
);

Exchange token crash signal: BNB/OKB dropping >5% relative to BTC in a single hour = canary signal (exchange-specific stress or systemic event). Historical precedent: this pattern appeared before major exchange events.


Section 34: Proof of Reserves Monitoring

Cadence: Weekly Simplified [PANEL: GPT-4.1]: Full scraping of exchange PoR attestation pages is maintenance-heavy. Collect summary metrics only from public announcements and Nansen/Glassnode public dashboards where available.

CREATE TABLE IF NOT EXISTS proof_of_reserves (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  date TEXT NOT NULL,
  exchange TEXT NOT NULL,
  btc_reserve REAL,
  eth_reserve REAL,
  usdt_reserve REAL,
  usdc_reserve REAL,
  source TEXT,
  attestation_date TEXT
);

Section 35: Cross-Exchange Price Spreads

Cadence: Every 10 min default; escalate to 2 min if any spread exceeds 0.3% Sources: Binance, Coinbase, Kraken, Bybit, OKX, Upbit, Huobi tickers

Frequency fix [PANEL: Grok 4.1 (main)]: 2-minute frequency creates false spread signals due to clock synchronization issues (750ms response lag = 0.1–0.3% false spread during volatile periods). Reduce to 10-min with escalation.

Timestamp synchronization validation (Opus aplus): Record request_sent_ts and response_received_ts per exchange. Only compute spreads where all responses arrived within a 3-second window. Discard any comparison where max timestamp spread exceeds 3 seconds.

KRW/USD rate fix: Fetch KRW/USD exchange rate from exchangerate.host every 30 min (not hardcoded) for accurate Kimchi premium calculation.

CREATE TABLE IF NOT EXISTS exchange_spreads (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  max_response_lag_ms INTEGER,
  exchanges_included TEXT,
  btc_prices_json TEXT,
  max_spread_bps REAL,
  coinbase_premium_bps REAL,
  kimchi_premium_bps REAL,
  data_quality_flag TEXT
);

Composite Signal 11: Coinbase Premium Index (Promoted to Named Signal)

(Opus aplus): Previously buried as a parenthetical in Section 35. This deserves first-class status.

Formula: coinbase_premium_bps = (coinbase_btc_usd_mid - binance_btc_usdt_mid) / binance_btc_usdt_mid * 10000

CREATE TABLE IF NOT EXISTS coinbase_premium (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  coinbase_mid REAL,
  binance_mid REAL,
  premium_bps REAL,
  premium_5min_ma REAL,
  premium_1h_ma REAL,
  premium_24h_ma REAL,
  alert_level TEXT
);

Alert thresholds:


Section 36: X / Twitter / Social Sentiment

Cadence: Every 8 hours Source: Grok X search (via xAI API); fallback: Nitter instances

Simplification [PANEL: Grok 4.1 (main)]: Monitor top 5–10 accounts per category + semantic keyword search. Avoid maintaining 30+ individual account feeds.

[PANEL: Gemini 3.1 Pro]: There is no free programmatic X API. Free scraping via standard accounts results in rapid bans. Budget $20–30/month for Apify Twitter scraper actor if Grok X search is unavailable.

CREATE TABLE IF NOT EXISTS social_sentiment (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  source TEXT NOT NULL,
  query TEXT,
  post_count INTEGER,
  bullish_count INTEGER,
  bearish_count INTEGER,
  neutral_count INTEGER,
  sentiment_score REAL,
  top_posts_json TEXT
);

Section 37: Reddit Sentiment

Cadence: Every 4 hours Source: Reddit PRAW (pushshift as fallback); subreddits: r/Bitcoin, r/CryptoCurrency, r/CryptoMarkets, r/ethereum

CREATE TABLE IF NOT EXISTS reddit_sentiment (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  subreddit TEXT NOT NULL,
  post_count INTEGER,
  avg_score REAL,
  bullish_mentions INTEGER,
  bearish_mentions INTEGER,
  top_keywords_json TEXT,
  sentiment_score REAL
);

Section 38: Google Trends

Cadence: Weekly (simplified from daily) Source: pytrends library Term: "buy bitcoin" only (single retail euphoria/capitulation gauge)

Simplification (Opus aplus): Fear & Greed Index already incorporates Google Trends. Collecting daily raw data is double-counting. Weekly "buy bitcoin" at weekly frequency captures retail sentiment peaks without redundancy. Implement 3 retries with 30-second backoff (pytrends is notoriously flaky).

CREATE TABLE IF NOT EXISTS google_trends (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  week_start TEXT NOT NULL,
  collected_ts INTEGER NOT NULL,
  term TEXT NOT NULL,
  interest_value INTEGER
);

Section 39: GitHub Developer Activity

Status: DEFERRED to Month 2+ Multi-month leading indicator. For prediction markets with days-to-weeks horizons, adds near-zero predictive value. Build when longer-horizon contracts appear.


Section 40: NFT Blue Chip Floor Prices

Status: DEFERRED to Month 2+ NFT market has contracted 90%+ from peak. Claimed 1–3 day lead on altcoin sentiment is anecdotal; no robust backtesting. Validate historically before building.


Section 41: Crypto News RSS

Cadence: Every 30 min Sources: CoinDesk RSS, The Block RSS, Decrypt RSS, Cointelegraph RSS, Bloomberg Crypto RSS

CREATE TABLE IF NOT EXISTS news_items (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  source TEXT NOT NULL,
  title TEXT NOT NULL,
  link TEXT UNIQUE,
  summary TEXT,
  sentiment_score REAL,
  keywords_json TEXT,
  alert_triggered INTEGER DEFAULT 0
);

Section 42: SEC / CFTC Enforcement Monitoring

Cadence: Daily (not every 6 hours — regulatory developments don't happen in 6-hour cycles) Source inversion [PANEL: Sonar Reasoning Pro]: EDGAR is the CONFIRMATION, not the discovery. Major enforcement actions break on social media and SEC press releases 12–24 hours before EDGAR filings.

Corrected source priority:

  1. Primary: SEC RSS feeds (stable, officially supported):
  2. Secondary: EDGAR full-text search https://efts.sec.gov/LATEST/search-index?q=%22cryptocurrency%22%20OR%20%22digital%20asset%22%20OR%20%22bitcoin%22&dateRange=custom&startdt={7d_ago}&enddt={today} (Opus aplus fix — corrected URL)
  3. Real-time discovery: Grok X search for (@SECGov OR @CFTC) AND (crypto OR bitcoin OR enforcement OR digital asset) once daily at 9 AM ET

SEC keyword expansion [PANEL: Sonar Reasoning Pro]: Add obscure enforcement language beyond "cryptocurrency": "unlicensed commodities trading platform", "unregistered securities exchange", "derivatives activity without CFTC registration"

CREATE TABLE IF NOT EXISTS regulatory_events (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  source TEXT NOT NULL,
  event_type TEXT,
  title TEXT,
  link TEXT UNIQUE,
  summary TEXT,
  entity_targeted TEXT,
  jurisdiction TEXT,
  market_impact TEXT,
  alert_level TEXT
);

Section 43: Global Regulatory Tracker

Cadence: Daily (simplified from every 6 hours per region) Simplified scope (Opus aplus): Monitor US + China + "breaking international" keyword alerts only. EU MiCA is already implemented. Japan/HK/Singapore regulatory changes matter for long-term ecosystem but don't move prediction market prices within days-to-weeks.

Single Grok query: (crypto OR bitcoin OR stablecoin) AND (ban OR regulation OR enforcement) AND (China OR PBOC OR EU OR MiCA OR G7)

CREATE TABLE IF NOT EXISTS regulatory_calendar (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  date TEXT,
  collected_ts INTEGER NOT NULL,
  region TEXT NOT NULL,
  event_type TEXT,
  description TEXT,
  expected_market_impact TEXT,
  source TEXT,
  alert_level TEXT
);

Section 44: Polygon Network Health and USDC Supply

Cadence: Every 15 min Source: Polygonscan API (key required for >5 calls/sec, free key sufficient)

CREATE TABLE IF NOT EXISTS polygon_network (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  gas_price_gwei REAL,
  pending_tx_count INTEGER,
  avg_block_time_seconds REAL,
  tps REAL
);

CREATE TABLE IF NOT EXISTS polygon_usdc (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  usdc_total_supply_polygon REAL,
  usdc_in_polymarket_contracts REAL,
  usdc_polymarket_pct REAL
);

Oracle Health Monitoring [PANEL: GPT-4.1 — MUST-HAVE for prediction market desks]: DeFi and prediction market protocols depend on oracle health. Oracle outages or manipulation precede major market disruptions.

CREATE TABLE IF NOT EXISTS oracle_health (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  oracle TEXT NOT NULL,
  asset TEXT NOT NULL,
  price_usd REAL,
  deviation_from_spot_pct REAL,
  last_update_ts INTEGER,
  seconds_since_update INTEGER,
  alert_level TEXT
);

Sources:


Section 44b: Solana and L2 Metrics (New Section)

[PANEL: Gemini 3.1 Pro — MUST-HAVE] and [PANEL: GPT-4.1 — NICE-TO-HAVE]

Cadence: Every 4 hours Source: DefiLlama Chains API https://api.llama.fi/v2/chains; L2Beat https://l2beat.com/api/

Track: TVL, TPS, active addresses, DEX volume for: Solana, Base, Arbitrum, Optimism, zkSync

L2 activity is a leading indicator of ETH demand (L2s pay fees to L1 in ETH). Base (Coinbase's L2) activity specifically correlates with US retail engagement.

(Uses l2_network_metrics table defined in Section 22.)


Section 45: Token Unlocks and Emission Schedules

[PANEL: Gemini 3.1 Pro — MUST-HAVE]

Cadence: Daily Source: DefiLlama Unlocks API https://api.llama.fi/unlocks

Massive supply shocks (cliff unlocks >1% of circulating supply) are guaranteed binary events that suppress price. Knowing when a protocol unlocks 10% of its supply is direct, predictable alpha for Kalshi/Polymarket price-drop contracts.

CREATE TABLE IF NOT EXISTS token_unlocks (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  collected_ts INTEGER NOT NULL,
  unlock_date TEXT NOT NULL,
  protocol TEXT NOT NULL,
  token TEXT NOT NULL,
  amount_tokens REAL,
  amount_usd REAL,
  unlock_pct_of_supply REAL,
  category TEXT,
  cliff_unlock INTEGER DEFAULT 0,
  alert_level TEXT
);
CREATE INDEX IF NOT EXISTS idx_token_unlocks_date ON token_unlocks(unlock_date);

Alert: cliff unlock >1% of circulating supply within 7 days = HIGH (guaranteed selling pressure event).


Section 46: Bitcoin Hash Rate and Mining Metrics

Cadence: Every 4 hours Source: Blockchain.com https://api.blockchain.info/charts/hash-rate?timespan=30days&format=json

Hash Ribbon fix [PANEL: Grok 4-Fast]: Use difficulty-adjusted hashrate (not raw) for accurate capitulation signals. Blockchain.com already provides difficulty-adjusted data. Calculate MAs on adjusted values.

CREATE TABLE IF NOT EXISTS mining_metrics (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  hash_rate_eh REAL,
  difficulty REAL,
  difficulty_adj_hashrate REAL,
  estimated_hash_price_usd REAL,
  miner_revenue_usd_30d REAL,
  hash_ribbon_30d_ma REAL,
  hash_ribbon_60d_ma REAL,
  hash_ribbon_signal TEXT,
  puell_multiple REAL
);

Hash Ribbon: when 30d MA crosses above 60d MA = miner capitulation ending → historical BTC buy signal. Puell Multiple: daily_miner_revenue / 365d_avg_daily_revenue. >4 = overvalued. <0.5 = capitulation.


Section 47: Mining Difficulty and Adjustments

Cadence: Every 2 hours Source: Mempool.space API https://mempool.space/api/v1/mining/difficulty-adjustments

CREATE TABLE IF NOT EXISTS difficulty_adjustments (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  current_difficulty REAL,
  next_estimated_difficulty REAL,
  estimated_change_pct REAL,
  blocks_until_adjustment INTEGER,
  days_until_adjustment REAL,
  epoch_start_block INTEGER
);

Section 48: Energy Cost Leading Indicator

Status: DEFERRED to Month 2+ The 2–4 week lag between energy costs and miner selling is too long for prediction markets with typical 1–14 day horizons. Hash Ribbon (Section 46) captures miner stress more directly.


CME Bitcoin Futures Basis and OI (New Section)

(Opus aplus — MUST-HAVE)

Cadence: Every 30 min (delayed quotes, CME provides 10-min delay free) Source: CME Group delayed quotes https://www.cmegroup.com/markets/cryptocurrencies/bitcoin/bitcoin.quotes.html (scrape) OR CryptoQuant free tier (includes CME OI data)

Key signal: cme_basis_pct - deribit_basis_pct = institutional vs crypto-native demand spread. When CME basis > Deribit basis by >1%: TradFi institutions more bullish than crypto natives — strong institutional signal. CME OI hitting ATH preceded every major institutional-driven rally in 2023–2024.

CREATE TABLE IF NOT EXISTS cme_futures (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  contract_month TEXT NOT NULL,
  last_price REAL,
  settlement_price REAL,
  volume INTEGER,
  open_interest INTEGER,
  basis_pct REAL,
  basis_vs_deribit_pct REAL
);

On-Chain DEX Perpetuals / Leverage (New Section)

[PANEL: GPT-4.1 — NICE-TO-HAVE]

Cadence: Every 4 hours Sources: DefiLlama DEX perps API; GMX API (free); dYdX v4 API

DEX perps on Arbitrum, Polygon, Base can now front-run CEXs during regulatory events. GMX, dYdX v4, Synthetix on-chain OI and funding rates add emerging on-chain leverage signals.

CREATE TABLE IF NOT EXISTS dex_perps (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  protocol TEXT NOT NULL,
  chain TEXT NOT NULL,
  symbol TEXT NOT NULL,
  open_interest_usd REAL,
  funding_rate REAL,
  volume_24h_usd REAL,
  long_ratio REAL,
  short_ratio REAL
);

Composite Signals

All composites are calculated from stored raw data. No additional API calls.

Composite Signal 1: Master Crypto Sentiment Score

Inputs: Fear & Greed, funding rates (weighted by volume), exchange inflows/outflows, stablecoin SSR, DVOL, Reddit/X sentiment scores.

CREATE TABLE IF NOT EXISTS composite_master_sentiment (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  fear_greed_score REAL,
  funding_weighted REAL,
  exchange_flow_score REAL,
  ssr_score REAL,
  dvol_score REAL,
  social_sentiment_score REAL,
  master_score REAL,
  regime TEXT,
  confidence REAL
);

Composite Signal 2: Smart Money Score

Inputs: Whale Alert large exchange inflows (bearish signal: whales moving to exchange to sell), exchange reserve depletion trends, Coinbase premium, CME-Deribit basis spread, ETF flow momentum.

Composite Signal 3: Hash Ribbon

Inputs: 30d and 60d MAs of difficulty-adjusted hash rate from Section 46. Signal: 30d MA crossing above 60d MA = miner capitulation ending.

Composite Signal 4: Stablecoin Supply Ratio (SSR)

Formula: SSR = BTC_market_cap / stablecoin_total_supply. High SSR = limited dry powder available to buy BTC. Low SSR = large buying reserve.

Composite Signal 5: Leverage Ratio

Formula: leverage = total_perp_OI_USD / BTC_spot_market_cap. Ratio >0.25 = overleveraged, cascade risk.

Composite Signal 6: Polymarket Liquidity Score

Inputs: Polygon gas price, USDC on Polygon, Polymarket total open interest, average bid-ask spread across top 10 markets. Critical for all desks trading Polymarket.

Composite Signal 7: Risk Regime Classifier

Inputs: BTC-Nasdaq 30d correlation, VIX, DXY direction, funding rate trend, macro_regime from Section 32. Output: RISK_ON / RISK_OFF / DECORRELATED

Cross-desk signal: Expose as /api/regime endpoint. Every desk needs this context.

Composite Signal 8: NUPL Approximation

Corrected formula (Opus aplus): Use median of 4 MA windows + thermocap as independent floor.

realized_cap_proxy = MEDIAN(50d_MA, 100d_MA, 200d_MA, 365d_MA) * circulating_supply
thermocap = cumulative_miner_revenue (Blockchain.com charts/miners-revenue)
NUPL_approx = (market_cap - realized_cap_proxy) / market_cap
NUPL_thermocap = (market_cap - thermocap) / market_cap
final_NUPL = AVG(NUPL_approx, NUPL_thermocap)

Label ALL outputs as "NUPL-approx" with confidence band ±0.15.

Composite Signal 9: ETF Flow Momentum

Inputs: 5-day rolling sum of BTC ETF flows, 20-day rolling sum, GBTC premium/discount.

Composite Signal 10: BTC Dominance Rate of Change

(Opus aplus — MUST-HAVE for altcoin contract pricing)

CREATE TABLE IF NOT EXISTS btc_dominance_roc (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  btc_dom_current REAL,
  btc_dom_7d_ago REAL,
  roc_7d REAL,
  roc_14d REAL,
  roc_30d REAL,
  alt_season_flag INTEGER DEFAULT 0
);

alt_season_flag = 1 when roc_7d < -2.0 AND roc_14d < -3.0. Alert: roc_7d < -3.0 = HIGH (aggressive alt rotation — altcoin prediction market contracts more likely to hit).

Composite Signal 11: Realized Volatility vs Implied Vol (Vol Risk Premium)

(Opus aplus — MUST-HAVE for binary probability calculator calibration)

CREATE TABLE IF NOT EXISTS realized_vol (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  coin TEXT NOT NULL,
  rv_7d REAL,
  rv_14d REAL,
  rv_30d REAL,
  rv_90d REAL,
  dvol_current REAL,
  vol_risk_premium_7d REAL,
  vol_risk_premium_30d REAL
);

Formula: RV_Nd = std(ln(close_t / close_{t-1})) * sqrt(365) over trailing N days. Vol risk premium = DVOL - RV_30d.

When vol_risk_premium_30d > 15: BS model overestimates tail risk — discount OTM contract probabilities 10–20%. When vol_risk_premium_30d < 0: Market underpricing risk — increase OTM contract probabilities.

Composite Signal 12: Halving Cycle Phase

(Opus aplus — MUST-HAVE; feeds into all composite confidence levels)

Calculated from Section 14 block height data. Current phase modulates analyst AI confidence on all signals.

Composite Signal 13: ETH/BTC Skew Signal

[PANEL: DeepSeek V3.1 (aplus)]: ETH/BTC ratio derivatives are the purest measure of altcoin risk appetite. Source: BitMEX composite index https://www.bitmex.com/api/v1/instrument/compositeIndex?symbol=.ETHBON8H. Skew >1 signals ETH outperformance expectations; historically leads spot ETH/BTC by 12–36 hours.


Critical Alert Thresholds

CREATE TABLE IF NOT EXISTS alerts_log (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  alert_type TEXT NOT NULL,
  severity TEXT NOT NULL,
  source_section TEXT,
  metric TEXT,
  value REAL,
  threshold REAL,
  description TEXT,
  acknowledged INTEGER DEFAULT 0
);
CREATE INDEX IF NOT EXISTS idx_alerts_ts ON alerts_log(timestamp DESC, severity);
Alert Threshold Severity Section
USDC depeg >0.5% from $1.00 CRITICAL 17
USDT depeg >0.5% from $1.00 CRITICAL 17
Liquidation cascade (directional) Long liq >$200M, short liq <$50M in 1h CRITICAL 10
Exchange token crash BNB/OKB >5% drop vs BTC in 1h HIGH 33
Hash rate drop >10% in 24h HIGH 46
Stablecoin mint Single >$500M CRITICAL 16
Stablecoin burn Single >$500M HIGH 16
Large whale inflow to exchange >$100M HIGH 12
Government wallet move Any size CRITICAL 13
Curve pool imbalance acceleration >20 pp in 1 hour HIGH 20
DVOL spike >10 points in 1h HIGH 7
Coinbase premium >+100 bps sustained CRITICAL CS11
UMA oracle proposal Within 2h window HIGH 30
Contract near expiry + strike <48h AND <10% distance HIGH 31
Kalshi-Polymarket divergence >5% same event HIGH 28/29
Token cliff unlock >1% of supply within 7d HIGH 45
Alt season trigger BTC dominance roc_7d < -3.0 MEDIUM CS10
Mempool congestion >200MB MEDIUM 14

Alert deduplication [PANEL: GPT-4.1]: With high-frequency monitoring, a cascade event could trigger hundreds of alerts in minutes. Implement alert suppression: same alert_type cannot re-alert for 15 minutes after first trigger. Escalation logic: if same alert fires 3x in 1 hour, promote to next severity level.


API Infrastructure Requirements

Central API Request Queue

[PANEL: Gemini 3.1 Pro — required for pipeline survival]

Implement a central rate limiter using p-queue or Bottleneck in Node.js. Each API provider gets its own queue with per-minute rate limits:

Source Limit Queue Config
CoinGecko (Demo key) 30/min 8/min conservative
Deribit 200/min 150/min
Whale Alert 10/min 8/min
Etherscan 5/sec 3/sec
FRED 120/min 60/min
Alpha Vantage 5/min 4/min
DefiLlama ~100/min 60/min
Binance 1200/min 600/min

Hot Failover Chain

All critical data sources need pre-authenticated hot standby:

Data Quality Auditing

[PANEL: GPT-4.1]:

All timestamps stored as UTC Unix integers. All responses tagged with source field.

[PANEL: GPT-4.1]: Many endpoints are in ET or local time (ETF flows, SEC filings). Standardize ALL timestamps to UTC at ingestion point. Never store timezone-local timestamps.


Storage Estimates

Category Raw Rate Aggregation Annual Estimate
Spot prices ~5 KB/5min None for top 10 ~600 MB
OHLCV ~2 KB/hour None ~120 MB
Deribit options ~43 MB/day raw 15min→hourly at 7d; 4h at 30d; daily at 90d ~5–6 GB
Deribit OI changes ~15 MB/day Same as options ~2 GB
Funding rates ~1 KB/hour None ~50 MB
Cross-exchange spreads ~5 KB/10min 10min→1h at 7d ~200 MB
Liquidation estimates ~2 KB/5min None ~200 MB
Composites ~10 KB/15min None ~350 MB
Whale transactions ~50 KB/day None ~20 MB
ETF flows ~1 KB/day None <1 MB
Kalshi/Polymarket ~100 KB/5min None ~10 GB
News/regulatory ~20 KB/30min None ~350 MB
Total estimated ~18–20 GB/year

Plan for 25 GB disk allocation. SQLite handles this well with WAL mode and VACUUM schedule.


API Endpoints Quick Reference

Section Source Endpoint
1 CoinGecko https://api.coingecko.com/api/v3/coins/markets?vs_currency=usd&per_page=50&page=1
3 CoinGecko https://api.coingecko.com/api/v3/global
4 CoinGecko https://api.coingecko.com/api/v3/search/trending
5 Alternative.me https://api.alternative.me/fng/
6 Deribit https://www.deribit.com/api/v2/public/get_instruments?currency=BTC&kind=option&expired=false
7 Deribit https://www.deribit.com/api/v2/public/get_volatility_index_data?currency=BTC
8 Deribit https://www.deribit.com/api/v2/public/get_instruments?currency=BTC&kind=future&expired=false
9 Binance https://fapi.binance.com/fapi/v1/fundingRate?symbol=BTCUSDT
9 Bybit https://api.bybit.com/v5/market/funding/history?category=linear&symbol=BTCUSDT
9 OKX https://www.okx.com/api/v5/public/funding-rate?instId=BTC-USD-SWAP
10 Binance https://fapi.binance.com/fapi/v1/openInterest?symbol=BTCUSDT
10 Binance https://fapi.binance.com/fapi/v1/globalLongShortAccountRatio?symbol=BTCUSDT&period=1h
12 Whale Alert https://api.whale-alert.io/v1/transactions?api_key={key}&min_value=1000000&start={ts}
14 Blockchain.info https://api.blockchain.info/stats
14 Mempool.space https://mempool.space/api/v1/mining/hashrate/3d
15 Etherscan https://api.etherscan.io/api?module=gastracker&action=gasoracle
17 CoinGecko https://api.coingecko.com/api/v3/simple/price?ids=usd-coin,tether&vs_currencies=usd
22 DefiLlama https://api.llama.fi/protocols
22 DefiLlama https://api.llama.fi/v2/chains
23 DefiLlama https://api.llama.fi/overview/dexs
24 DefiLlama https://yields.llama.fi/pools
26/27 Farside https://farside.co.uk/btc.html (Puppeteer/tls-client)
28 Kalshi https://api.elections.kalshi.com/trade-api/v2/markets?category=crypto
29 Polymarket https://clob.polymarket.com/markets
30 The Graph https://thegraph.com/hosted-service/subgraph/uma/optimistic-oracle
32 FRED https://fred.stlouisfed.org/graph/fredgraph.csv?id=VIXCLS
32 Alpha Vantage https://www.alphavantage.co/query?function=GLOBAL_QUOTE&symbol=IXIC&apikey={key}
32 CFTC https://www.cftc.gov/dea/futures/deacbtcbk.htm
41 CoinDesk https://www.coindesk.com/arc/outboundfeeds/rss/
42 SEC https://www.sec.gov/rss/litigation/litreleases.xml
45 DefiLlama https://api.llama.fi/unlocks
46 Blockchain.info https://api.blockchain.info/charts/hash-rate?timespan=30days&format=json
46 Blockchain.info https://api.blockchain.info/charts/miners-revenue?timespan=alltime&format=json
47 Mempool.space https://mempool.space/api/v1/mining/difficulty-adjustments
CME CME Group https://www.cmegroup.com/markets/cryptocurrencies/bitcoin/bitcoin.quotes.html

Implementation Checklist

Week 1 (Core Trading Infrastructure)

Week 2 (Derivatives + Probability Engine)

Week 3 (On-Chain Intelligence + Institutional Flows)

Week 4 (DeFi + Macro + Cross-Desk Services)

Month 2+ (Deferred)


Cross-Desk Reusability Map

Service Consumer Desks Build As
USDC/USDT peg monitoring ALL Polymarket-trading desks Shared microservice with /api/peg-status
Risk regime classifier (RISK_ON/OFF/DECORRELATED) Stocks, Options, Futures, Forex Shared /api/regime endpoint
FRED macro data (yields, VIX, DXY) Stocks, Options, Futures, Forex Shared data service
Polymarket liquidity score Sports, Weather, Politics, all Polymarket desks Shared service
Binary probability calculator Options, Stocks (earnings events) Shared library
ETF flow data Stocks, Futures Shared database table
DVOL + VIX unified dashboard Options, Stocks Shared composite
SEC/CFTC enforcement scraper Stocks, Options Shared RSS feed
API rate limiter utilities All desks Shared infrastructure module
Alert deduplication system All desks Shared alerting service
Sentiment aggregation patterns Sports, Stocks Shared NLP module

Key Decisions and Tradeoffs

Decision Choice Rationale
Options probability Skew-adjusted BS + 40% historical tail Flat ATM IV systematically wrong for crypto fat tails
Liquidation source Derived from OI drops (primary), Coinglass web (secondary) Coinglass free API unreliable for production
Macro data FRED primary, Alpha Vantage secondary Yahoo Finance unofficial endpoint breaks without notice
ETF flows tls-client/Puppeteer for Farside Standard fetch blocked by Cloudflare
Whale tracking Known-wallet Etherscan (primary), Whale Alert (secondary) Whale Alert daily cap ~200 calls; BitcoinTreasuries has no API
GEX formula Inverse-contract adjusted + dealer heuristic + DTE weighting Standard equity GEX wrong for Deribit inverse contracts
Curve pools Top 5 by TVL (not just 3pool) 3pool lost 80%+ TVL; monitoring it produces false signals
Coin monitoring Top 10 at 5-min, rest at 30-min 95% of contracts are on top 10; saves 40% API calls
DeFi lending Top 5 protocols only 14,038 pools = scope creep; top 5 covers 70%
Google Trends Weekly, "buy bitcoin" only Fear & Greed already incorporates it; avoid double-counting
X/Twitter Grok search + top 10 accounts No free programmatic X API exists
Regulatory scope US + China + breaking international only Only US/China moves short-term prediction market prices
NUPL Median of 4 MA windows + thermocap Single 200d MA off by 20–40% during rapid price moves
Storage Aggressive aggregation after 7/30/90 days Options data ~58 MB/day raw; aggregation keeps to 5–6 GB/year

What Makes This A+

Per Sonar Reasoning Pro (unconventional summary): The spec should be framed by "what moves prediction market prices" first, then "where to get that data" — not the reverse. The ranked answer:

  1. Deribit IV + spot prices → binary probability bridge to Kalshi/Polymarket — the entire derivatives pipeline exists for this one calculation
  2. Order book depth at Kalshi/Polymarket venues — cannot detect manipulation, slippage, or arbitrage without it
  3. UMA oracle dispute monitoring — 5–15% of potential Polymarket profit, free, one-time setup
  4. Deribit implied distribution (not Black-Scholes) — fat tails make BS systematically wrong
  5. Central alerting API for peg status and risk regime — shared infrastructure that prevents redundant builds

Everything else is signal enrichment.


Merged final complete. Panel models consulted: Claude Opus 4.6 (base + aplus), DeepSeek V3.1, Gemini 3.1 Pro, GPT-4.1, Grok 4-Fast, Grok 4.1, Sonar Reasoning Pro. All FTX references removed. No additions removed from Opus base. Panel-unique contributions tagged throughout.


TODO: Upgrade Kalshi REST to WebSocket Feed

Status: NOT BUILT — add when this desk goes live for execution

Current state: All Kalshi data (prices, order books, trades) is fetched via REST API polling on cron schedules. This is fine for edge detection and monitoring, but NOT sufficient for live trade execution.

Why WebSocket matters:

Note: This desk primarily trades on other exchanges, but Kalshi has event markets (rate decisions, index levels, crypto milestones) that overlap with this desk. WebSocket upgrade applies when trading any Kalshi markets.

Added 2026-03-29 — upgrade REST to WS when desk moves to live execution

Source: ~/edgeclaw/results/spec-panel/crypto-desk/spec-final.md