Speculative Fund: AI-Augmented Quant System

Date: 2026-03-11
Context: Standalone system for the speculative fund (~5% of portfolio). Separate from BTC/STRD holdings. Built from first principles in Haskell.

1. Design Philosophy

This is not an extension of Invest.hs. It’s a standalone system — Omni.Fund.Quant.

Principles:

All Haskell. No Python anywhere — not for data, not for math, not for evals.
Agent = eyes, Haskell = brain. LLMs gather/structure information; typed code does all math.
Eval-first. Nothing trades real money until evals demonstrate a positive information coefficient (IC: the correlation between a signal and subsequent returns) on held-out data.
Small and tight. Start with a narrow asset universe. Expand only when the pipeline works.

2. Asset Universe (v1)

For a speculative fund with signal-driven rebalancing, we want:

Liquid enough to trade without market impact at our scale (a ~$125K fund, so positions of a few thousand to a few tens of thousands of dollars)
Diverse enough to have cross-asset signals
Narrow enough to validate the pipeline before scaling

Proposed Universe: sector ETFs plus cross-asset instruments for v1; 20-30 liquid US equities follow in v2+

Sector ETFs (11 — one per GICS sector):

Ticker	Sector	Rationale
XLK	Technology	Largest sector, high signal-to-noise
XLV	Healthcare	Defensive, regulatory signals
XLF	Financials	Yield curve sensitive, macro signal
XLE	Energy	Commodity-driven, macro signal
XLI	Industrials	Cyclical, leading indicator
XLY	Cons. Discretionary	Consumer sentiment signal
XLP	Cons. Staples	Defensive rotation signal
XLC	Communication	Mixed growth/value
XLRE	Real Estate	Rate sensitive
XLB	Materials	Commodity/inflation signal
XLU	Utilities	Rate sensitive, flight-to-safety

Why ETFs first, not individual stocks:

EDGAR insider signals aggregate across the sector (more data)
Earnings surprise effects aggregate (post-earnings announcement drift, PEAD, is smoother at sector level)
Less idiosyncratic noise — easier to validate signals
No single-name risk blowing up the fund
Once the pipeline validates on ETFs, we can drill into individual names

Later expansion (v2+): Add 20-30 individual stocks — mega-caps where EDGAR filings are frequent and earnings signal is strong (AAPL, MSFT, AMZN, GOOGL, NVDA, etc.)

Cross-asset signals (always in universe):

Ticker/Series	What	Signal role
SPY	S&P 500	Market benchmark
TLT	20+ yr Treasuries	Duration/rate signal
GLD	Gold	Risk-off signal
BTC-USD	Bitcoin	Risk-on/speculative sentiment
VIX	Volatility index	Fear gauge (data only, not traded)

Total v1 universe: 15 tradeable instruments + VIX as a signal-only series (16 in all).

Position sizing

The speculative fund is ~5% of total portfolio. If total is ~$2.5M, spec fund is ~$125K. With 15 tradeable instruments, an equal-weight position is ~$8K, which is trivially liquid for these tickers. Kelly is expected to concentrate the book into fewer positions (roughly 5-8 with meaningful weight).

3. System Architecture

┌─────────────────────────────────────────────────────────────┐
│  Omni.Fund.Quant                                             │
│                                                               │
│  ┌─────────────────────────────────────────────────────────┐ │
│  │ Data Layer                                               │ │
│  │  Omni.Fund.Data.Market  — price/volume (REST API)       │ │
│  │  Omni.Fund.Data.Edgar   — insider filings (SEC API)     │ │
│  │  Omni.Fund.Data.Fred    — macro series (FRED API)       │ │
│  └──────────┬──────────────────────────────────────────────┘ │
│             │                                                 │
│  ┌──────────▼──────────────────────────────────────────────┐ │
│  │ Signal Layer (pure Haskell math + Op agents)            │ │
│  │                                                          │ │
│  │  Deterministic signals:                                  │ │
│  │    - Momentum (trailing returns, cross-sectional rank)   │ │
│  │    - Volatility (realized vs implied, regime)            │ │
│  │    - Mean reversion (z-score off moving average)         │ │
│  │    - Macro regime (yield curve, M2, unemployment)        │ │
│  │                                                          │ │
│  │  Agent-augmented signals:                                │ │
│  │    - Insider interpretation (EDGAR + LLM)                │ │
│  │    - Earnings/filing analysis (10-K/8-K + LLM)          │ │
│  │    - Cross-signal synthesis (LLM coherence check)        │ │
│  └──────────┬──────────────────────────────────────────────┘ │
│             │                                                 │
│  ┌──────────▼──────────────────────────────────────────────┐ │
│  │ Alpha Layer                                              │ │
│  │  - Signal combination: weighted composite alpha          │ │
│  │  - Black-Litterman Bayesian update of μ, Σ              │ │
│  │  - Correlated Kelly criterion: f* = Σ⁻¹μ               │ │
│  └──────────┬──────────────────────────────────────────────┘ │
│             │                                                 │
│  ┌──────────▼──────────────────────────────────────────────┐ │
│  │ Portfolio Layer                                          │ │
│  │  - Target weights from Kelly                             │ │
│  │  - Delta vs current positions → trade list               │ │
│  │  - Correlated MC simulation (Cholesky GBM)              │ │
│  │  - Constraints: max position, no leverage, turnover cap  │ │
│  └──────────┬──────────────────────────────────────────────┘ │
│             │                                                 │
│  ┌──────────▼──────────────────────────────────────────────┐ │
│  │ Eval Layer                                               │ │
│  │  - Walk-forward IC measurement per signal                │ │
│  │  - Signal decay curves                                   │ │
│  │  - Portfolio sim: A/B vs equal-weight benchmark          │ │
│  │  - Confidence calibration (for LLM signals)              │ │
│  │  - All in Haskell. Test.QuickCheck + custom harness.     │ │
│  └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘

Module structure

Omni/Fund/Quant.hs           -- Main entry point, ties layers together
Omni/Fund/Quant/Signal.hs    -- Signal types + deterministic signal computations
Omni/Fund/Quant/Alpha.hs     -- Signal combination, Black-Litterman, Kelly
Omni/Fund/Quant/Portfolio.hs  -- Position management, trade generation, MC sim
Omni/Fund/Quant/Eval.hs      -- Walk-forward IC, decay curves, portfolio sim
Omni/Fund/Quant/Universe.hs  -- Asset universe definition

Omni/Fund/Data/Market.hs     -- Price/volume REST API client
Omni/Fund/Data/Edgar.hs      -- SEC EDGAR API client
Omni/Fund/Data/Fred.hs       -- FRED API client

Omni/Fund/Quant/Agent.hs     -- Op programs for signal agents (Phase 3)

4. Core Types

-- Universe.hs
data Asset = Asset
  { assetTicker  :: Text
  , assetName    :: Text
  , assetSector  :: Sector
  , assetType    :: AssetType  -- ETF | Stock | Commodity | Index
  , assetTradeable :: Bool     -- VIX is signal-only, not traded
  }

data Sector = Technology | Healthcare | Financials | Energy | Industrials
            | ConsDisc | ConsStaples | Communication | RealEstate
            | Materials | Utilities | CrossAsset
  deriving (Eq, Ord, Show, Enum, Bounded)

-- Signal.hs
data Signal = Signal
  { sigAsset      :: Text       -- ticker
  , sigType       :: SignalType
  , sigValue      :: Double     -- normalized score (z-score or [-1, 1])
  , sigConfidence :: Double     -- [0, 1]
  , sigSource     :: Text       -- provenance tag
  , sigTimestamp  :: UTCTime
  }

data SignalType
  = Momentum          -- trailing return rank
  | MeanReversion     -- z-score off MA
  | Volatility        -- realized vol regime
  | InsiderSentiment  -- EDGAR Form 4
  | MacroRegime       -- yield curve, M2, etc.
  | EarningsSurprise  -- PEAD
  | CrossSignal       -- LLM cross-signal synthesis
  deriving (Eq, Ord, Show, Enum, Bounded)

data SignalBundle = SignalBundle
  { sbTimestamp  :: UTCTime
  , sbSignals    :: [Signal]
  , sbCorrelation :: Matrix Double  -- N×N realized correlation
  , sbAlphaScores :: Vector Double  -- composite alpha per asset
  }

-- Alpha.hs
data AlphaModel = AlphaModel
  { amPriorMu    :: Vector Double     -- prior expected returns
  , amPriorSigma :: Matrix Double     -- prior covariance
  , amTau        :: Double            -- BL confidence scalar
  , amViews      :: [(Matrix Double, Vector Double, Matrix Double)]
                                      -- (P, Q, Ω) view triples
  , amPosteriorMu :: Vector Double    -- posterior expected returns
  , amPosteriorSigma :: Matrix Double -- posterior covariance
  }

-- Portfolio.hs
data Position = Position
  { posAsset   :: Text
  , posShares  :: Double
  , posValue   :: Double
  , posWeight  :: Double  -- fraction of portfolio
  }

data TradeOrder = TradeOrder
  { toAsset    :: Text
  , toAction   :: TradeAction  -- Buy | Sell | Hold
  , toShares   :: Double
  , toValueUsd :: Double
  }

data TradeAction = Buy | Sell | Hold
  deriving (Eq, Show)  -- Eq is required by generateTrades (toAction t /= Hold)

5. Data Libraries

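All three are thin typed wrappers around REST/JSON APIs, built with http-conduit. EDGAR needs no API key, but the SEC asks clients to send a descriptive User-Agent header and to stay under 10 requests per second. FRED needs a free API key. Market data comes from Twelve Data (the choice locked in section 11); Alpha Vantage (free tier: 25 requests/day) was the alternative considered.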

5.1 Omni.Fund.Data.Market

-- Core operations
getDailyPrices :: Text -> Int -> IO (Either ApiError [DailyBar])
getIntradayPrices :: Text -> Interval -> Int -> IO (Either ApiError [Bar])

-- Derived (pure functions on [DailyBar])
trailingReturn :: Int -> [DailyBar] -> Double
realizedVol :: Int -> [DailyBar] -> Double
correlationMatrix :: [[DailyBar]] -> Matrix Double
meanReversionZ :: Int -> [DailyBar] -> Double
sharpeRatio :: Int -> Double -> [DailyBar] -> Double

data DailyBar = DailyBar
  { dbDate   :: Day
  , dbOpen   :: Double
  , dbHigh   :: Double
  , dbLow    :: Double
  , dbClose  :: Double
  , dbVolume :: Integer
  }
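The derived functions above are only given as signatures. A minimal sketch of how three of them might look, under some assumptions that are not in the design: the functions work on closing prices ordered oldest-first rather than on [DailyBar], they return Maybe when the series is too short, and volatility is annualized with 252 trading days. The names ending in C are hypothetical.

```haskell
-- Sketch only: closing prices, oldest first. Names ending in C are illustrative.

-- Last n elements of a list (the whole list if it is shorter than n).
lastN :: Int -> [a] -> [a]
lastN n xs = drop (length xs - n) xs

mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

-- Sample standard deviation (n - 1 denominator); needs at least two values.
stdDev :: [Double] -> Double
stdDev xs = sqrt (sum [(x - m) ^ (2 :: Int) | x <- xs] / fromIntegral (length xs - 1))
  where m = mean xs

-- Simple return over the trailing n days: close_t / close_(t-n) - 1.
trailingReturnC :: Int -> [Double] -> Maybe Double
trailingReturnC n closes
  | n < 1 || length closes <= n = Nothing
  | otherwise = Just (last closes / (closes !! (length closes - 1 - n)) - 1)

-- Annualized volatility of the last n daily log returns (252-day assumption).
realizedVolC :: Int -> [Double] -> Maybe Double
realizedVolC n closes
  | n < 2 || length closes <= n = Nothing
  | otherwise = Just (stdDev rets * sqrt 252)
  where
    window = lastN (n + 1) closes
    rets = zipWith (\p0 p1 -> log (p1 / p0)) window (drop 1 window)

-- Z-score of the latest close against the trailing n-day mean and std dev.
meanReversionZC :: Int -> [Double] -> Maybe Double
meanReversionZC n closes
  | n < 2 || length closes < n || sd == 0 = Nothing
  | otherwise = Just ((last closes - mean window) / sd)
  where
    window = lastN n closes
    sd = stdDev window
```

Returning Maybe keeps short or flat series from turning into NaN or an exception further down the pipeline; the signatures in the design return a bare Double, so the real functions would need a convention for those cases.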

5.2 Omni.Fund.Data.Edgar

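-- Insider trades (Form 4). The Text argument is a ticker; EDGAR's own
-- endpoints are keyed by CIK, so the client has to resolve ticker -> CIK first.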
getInsiderTransactions :: Text -> IO (Either ApiError [InsiderTx])
getRecentFilings :: Text -> FilingType -> IO (Either ApiError [Filing])

data InsiderTx = InsiderTx
  { itDate        :: Day
  , itFiler       :: Text
  , itRole        :: Text       -- CEO, CFO, Director, etc.
  , itTxType      :: TxType     -- Purchase | Sale | OptionExercise
  , itShares      :: Integer
  , itPricePerShare :: Double
  , itTotalValue  :: Double
  }

data FilingType = Form4 | Form10K | Form10Q | Form8K

5.3 Omni.Fund.Data.Fred

-- Macro series
getSeries :: Text -> IO (Either ApiError [FredObs])
getMultiple :: [Text] -> IO (Either ApiError (Map Text [FredObs]))

data FredObs = FredObs { foDate :: Day, foValue :: Double }

-- Key series IDs
yieldCurve10y2y, m2MoneySupply, unemploymentRate, cpi, fedFundsRate, vix :: Text
yieldCurve10y2y  = "T10Y2Y"   -- 10yr-2yr Treasury spread
m2MoneySupply    = "M2SL"
unemploymentRate = "UNRATE"
cpi              = "CPIAUCSL"
fedFundsRate     = "FEDFUNDS"
vix              = "VIXCLS"
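FredObs stores foValue as a Double, but FRED delivers observation values as strings and uses "." for a missing observation, so the client has to decide what to do with gaps. A small sketch of one option (drop them), assuming the JSON layer has already produced (date, value) string pairs; RawObs, parseFredValue and cleanObs are hypothetical names, not part of the planned API.

```haskell
import Data.Maybe (mapMaybe)
import Text.Read (readMaybe)

-- Hypothetical raw observation: date and value exactly as they appear in the JSON.
type RawObs = (String, String)

-- Parse one value; "." (missing) and anything else non-numeric become Nothing.
parseFredValue :: String -> Maybe Double
parseFredValue = readMaybe

-- Keep only observations with a numeric value, preserving their order.
cleanObs :: [RawObs] -> [(String, Double)]
cleanObs = mapMaybe (\(d, v) -> fmap ((,) d) (parseFredValue v))
```

Dropping gaps is the simplest choice; forward-filling the last value is the usual alternative when series with different release calendars have to be aligned on one date grid.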

6. Signal Computations

6.1 Deterministic Signals (pure Haskell, no LLM)

-- Momentum: cross-sectional z-score of trailing N-day returns.
-- The timestamp is passed in so the function stays pure.
momentumSignal :: UTCTime -> Int -> Map Text [DailyBar] -> Map Text Signal
momentumSignal asOf lookback priceMap =
  let returns = Map.map (trailingReturn lookback) priceMap
      zScores = crossSectionalZScore returns
  in Map.mapWithKey (\t z -> Signal t Momentum z 0.7 "momentum" asOf) zScores

-- Mean reversion: z-score of current price vs N-day SMA
meanReversionSignal :: Int -> Map Text [DailyBar] -> Map Text Signal

-- Volatility regime: ratio of short-term to long-term vol
volRegimeSignal :: Map Text [DailyBar] -> Map Text Signal

-- Macro regime: composite of FRED indicators
macroRegimeSignal :: Map Text [FredObs] -> Signal
-- yield curve inversion → bearish
-- M2 acceleration → bullish
-- unemployment rising → bearish
-- Outputs a single "macro regime" signal applied to all assets
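momentumSignal relies on a crossSectionalZScore helper that the document does not define. One plausible version is sketched below, assuming a population standard deviation and a convention of returning zeros when the cross-section has no dispersion; both choices are assumptions.

```haskell
import qualified Data.Map.Strict as Map

-- Standardize values across the cross-section: (x - mean) / population std dev.
-- Returns all zeros when there is no dispersion or fewer than two assets.
crossSectionalZScore :: Ord k => Map.Map k Double -> Map.Map k Double
crossSectionalZScore m
  | n < 2 || sd == 0 = Map.map (const 0) m
  | otherwise = Map.map (\x -> (x - mu) / sd) m
  where
    xs = Map.elems m
    n = length xs
    mu = sum xs / fromIntegral n
    sd = sqrt (sum [(x - mu) ^ (2 :: Int) | x <- xs] / fromIntegral n)
```

With only about 15 assets a single outlier moves the mean and standard deviation noticeably, so a rank-based transform may be worth comparing against this in the evals.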

6.2 Agent-Augmented Signals (Phase 3, Op programs)

-- In Quant/Agent.hs

signalScan :: [Asset] -> Op SignalState SignalBundle
signalScan universe = do
  -- Fan out: deterministic signals run in parallel with agent signals
  results <- Op.par
    [ deterministic universe   -- pure math, wrapped in Op.io
    , insiderAgent universe    -- EDGAR + LLM interpretation
    , macroAgent               -- FRED + LLM regime assessment
    ]
  -- Combine all signals
  let allSignals = concat results
  -- Optional: LLM cross-signal coherence check
  coherence <- crossSignalAssessment allSignals
  pure (buildBundle allSignals coherence)

insiderAgent :: [Asset] -> Op SignalState [Signal]
insiderAgent universe = do
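  results <- Op.io $ mapM (Edgar.getInsiderTransactions . assetTicker) universe
  -- Each call returns Either ApiError [InsiderTx]; keep the successes
  -- (rights, from Data.Either) and skip tickers whose request failed.
  -- Cluster detection: multiple insiders buying = stronger signal
  let clusters = detectClusters (concat (rights results))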
  -- LLM interprets clusters (e.g., "CEO bought $5M after earnings miss — conviction buy")
  interpretations <- Op.infer (Op.Model "claude-sonnet-4-20250514")
    defaultContextRequest
      { crObservation = formatClusters clusters
      , crGoal = Just "Interpret insider activity and assign signal strength [-1,1]"
      }
  pure (parseInsiderSignals interpretations)
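detectClusters is referenced above but not specified. A minimal sketch of one possible rule: count distinct insiders who bought the same ticker within a fixed number of days of each other. The Tx record is a simplified, hypothetical stand-in for InsiderTx (day numbers instead of Day, a Bool instead of TxType), and the window and threshold parameters are assumptions.

```haskell
import Data.List (nub)
import qualified Data.Map.Strict as Map

-- Hypothetical, simplified stand-in for InsiderTx.
data Tx = Tx
  { txTicker :: String
  , txFiler  :: String
  , txDay    :: Int   -- day number, e.g. days since some epoch
  , txIsBuy  :: Bool
  }

-- For each ticker, the largest number of distinct insiders whose purchases fall
-- between some purchase date d and d + window. Tickers whose best count is
-- below minInsiders are dropped.
detectBuyClusters :: Int -> Int -> [Tx] -> Map.Map String Int
detectBuyClusters window minInsiders txs =
  Map.filter (>= minInsiders) (Map.map best byTicker)
  where
    buys = filter txIsBuy txs
    byTicker = Map.fromListWith (++) [(txTicker t, [t]) | t <- buys]
    -- Try every purchase date as the start of a window and count distinct filers.
    best ts = maximum
      [ length (nub [txFiler u | u <- ts, txDay u >= d, txDay u <= d + window])
      | d <- map txDay ts ]
```

Keeping this step deterministic means the LLM only has to interpret clusters that typed code has already found, which matches the "Agent = eyes, Haskell = brain" principle.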

7. Alpha Combination

7.1 Signal → Alpha (weighted combination)

-- Simple weighted sum for v1. ML-based combination for v2+.
combineSignals :: Map SignalType Double -> [Signal] -> Map Text Double
combineSignals weights signals =
  -- Group by asset with Map.fromListWith, then take the weighted sum of
  -- signal values by type. Signal types without a weight contribute 0.
  let byAsset = Map.fromListWith (++) [(sigAsset s, [s]) | s <- signals]
  in Map.map (\sigs -> sum [sigValue s * Map.findWithDefault 0 (sigType s) weights | s <- sigs]) byAsset

-- Default weights (hand-tuned initially, then learned from IC data)
defaultSignalWeights :: Map SignalType Double
defaultSignalWeights = Map.fromList
  [ (Momentum, 0.30)
  , (MeanReversion, 0.20)
  , (Volatility, 0.15)
  , (InsiderSentiment, 0.15)
  , (MacroRegime, 0.10)
  , (EarningsSurprise, 0.10)
  ]

7.2 Black-Litterman Update

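-- Written against hmatrix (Numeric.LinearAlgebra). Its (<>) clashes with
-- Prelude's, so the module needs: import Prelude hiding ((<>))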

blUpdate
  :: Vector Double    -- prior μ (equilibrium returns from CAPM or trailing)
  -> Matrix Double    -- prior Σ (realized covariance)
  -> Double           -- τ (uncertainty on prior, ~0.05)
  -> Matrix Double    -- P (picking matrix: which assets views refer to)
  -> Vector Double    -- Q (view returns)
  -> Matrix Double    -- Ω (view uncertainty diagonal)
  -> (Vector Double, Matrix Double)  -- (posterior μ, posterior Σ)
blUpdate mu sigma tau p q omega =
  let tauSigma = scale tau sigma
      invTerm = inv (p <> tauSigma <> tr p + omega)
      -- (<>) is matrix-matrix and (#>) is matrix-vector, so the gain matrix
      -- is built first and then applied to the view residual.
      gain = tauSigma <> tr p <> invTerm
      mu' = mu + gain #> (q - p #> mu)
      sigma' = sigma + tauSigma - gain <> p <> tauSigma
  in (mu', sigma')
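With a single view the term being inverted is a scalar, so the update can be checked by hand without any matrix library. The sketch below specializes the formula above to one view, with P as a single row p, Q as a scalar q, and Ω as a scalar omega; the list-based Vec and Mat types and the name blSingleView are illustrative assumptions.

```haskell
type Vec = [Double]
type Mat = [[Double]]  -- row-major, square

dot :: Vec -> Vec -> Double
dot xs ys = sum (zipWith (*) xs ys)

matVec :: Mat -> Vec -> Vec
matVec m v = map (`dot` v) m

-- Black-Litterman update for one view (p . mu = q) with view variance omega.
blSingleView :: Vec -> Mat -> Double -> Vec -> Double -> Double -> (Vec, Mat)
blSingleView mu sigma tau p q omega = (mu', sigma')
  where
    tauSigma = map (map (tau *)) sigma
    v = matVec tauSigma p          -- tau * Sigma * p^T (Sigma is symmetric)
    s = dot p v + omega            -- p * tau * Sigma * p^T + omega, a scalar
    gain = (q - dot p mu) / s      -- view residual scaled by total uncertainty
    mu' = zipWith (\m vi -> m + vi * gain) mu v
    sigma' =
      [ [ sij + tsij - vi * vj / s | (sij, tsij, vj) <- zip3 row tsRow v ]
      | (row, tsRow, vi) <- zip3 sigma tauSigma v ]
```

For example, with prior mean 0.05 and variance 0.04 on the first asset, tau = 0.5, and a view of 0.09 with variance 0.02, the prior uncertainty tau * 0.04 = 0.02 equals the view uncertainty, so the posterior mean lands halfway at 0.07.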

7.3 Correlated Kelly

kellyOptimalCorrelated
  :: Double           -- risk-free rate
  -> Vector Double    -- expected returns μ (r_f is subtracted below)
  -> Matrix Double    -- covariance matrix Σ
  -> Vector Double    -- optimal fractions f* = Σ⁻¹(μ - r_f)
kellyOptimalCorrelated rf mu sigma =
  let excess = mu - scalar rf
      fStar = inv sigma #> excess
      -- Clamp: no shorts, no single position > 30%
      clamped = cmap (min 0.30 . max 0.0) fStar
      -- Scale if sum > 1 (no leverage)
      total = sumElements clamped
      scaled = if total > 1.0 then scale (1.0 / total) clamped else clamped
  in scaled
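Section 11 settles on pure Haskell linear algebra for v1, so the same computation can be sketched without hmatrix. The version below solves Σf = μ - r_f with a small Gauss-Jordan routine instead of forming the inverse. The list-based types, the solve helper, and the explicit cap parameter are assumptions for illustration.

```haskell
type Vec = [Double]
type Mat = [[Double]]  -- row-major, square

-- Solve A x = b by Gauss-Jordan elimination with partial pivoting.
-- Assumes A is square and non-singular (true for a positive-definite covariance).
solve :: Mat -> Vec -> Vec
solve a b = map last (go 0 (zipWith (\row bi -> row ++ [bi]) a b))
  where
    n = length a
    go k rows
      | k == n = rows
      | otherwise =
          let (top, rest) = splitAt k rows
              -- Pivot on the largest |entry| in column k among unprocessed rows.
              (_, i) = maximum [(abs (r !! k), idx) | (r, idx) <- zip rest [0 :: Int ..]]
              pivotRow = rest !! i
              others = take i rest ++ drop (i + 1) rest
              normed = map (/ (pivotRow !! k)) pivotRow
              -- Zero out column k in every other row.
              elim r = zipWith (\x y -> x - (r !! k) * y) r normed
          in go (k + 1) (map elim top ++ [normed] ++ map elim others)

-- Long-only Kelly weights: solve, clamp to [0, cap], rescale if the sum exceeds 1.
kellyLongOnly :: Double -> Double -> Vec -> Mat -> Vec
kellyLongOnly cap rf mu sigma =
  let excess = map (subtract rf) mu
      fStar = solve sigma excess
      clamped = map (min cap . max 0) fStar
      total = sum clamped
  in if total > 1 then map (/ total) clamped else clamped
```

Unconstrained Kelly fractions for equity-like assets are often well above 1 (an excess return of 4% on a variance of 0.04 already gives f = 1), so in practice the cap and the no-leverage rescale determine most of the final weights. That makes the cap level itself a parameter worth covering in the evals.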

8. Portfolio Management

-- Generate trade list from Kelly weights and current positions
generateTrades
  :: Map Text Position   -- current positions
  -> Vector Double       -- kelly weights
  -> [Asset]             -- universe (for ticker mapping)
  -> Double              -- total portfolio value
  -> Double              -- turnover cap (max % to trade per rebalance)
  -> [TradeOrder]
generateTrades current weights universe totalVal turnoverCap =
  let targets = zip (map assetTicker universe) (toList weights)
      trades = map (\(ticker, w) ->
        let targetVal = totalVal * w
            currentVal = maybe 0 posValue (Map.lookup ticker current)
            delta = targetVal - currentVal
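            action | abs delta < totalVal * 0.01 = Hold  -- ignore <1% moves
                   | delta > 0 = Buy
                   | otherwise = Sell
            -- Shares stay 0 here: no price is in scope, so share sizing happens
            -- at execution. Value is the signed delta; Hold orders carry 0.
        in TradeOrder ticker action 0 (if action == Hold then 0 else delta)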
        ) targets
      -- Enforce turnover cap
      totalTurnover = sum [abs (toValueUsd t) | t <- trades, toAction t /= Hold]
      scale' = if totalTurnover > totalVal * turnoverCap
               then totalVal * turnoverCap / totalTurnover
               else 1.0
  in map (\t -> t { toValueUsd = toValueUsd t * scale' }) trades

-- Correlated Monte Carlo (Cholesky decomposition)
simulateCorrelated
  :: Matrix Double     -- Cholesky factor L where Σ = LL'
  -> Vector Double     -- expected returns μ
  -> Vector Double     -- current portfolio values
  -> SimConfig
  -> StdGen
  -> SimResult
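simulateCorrelated takes the Cholesky factor as an input. A sketch of how that factor might be computed in pure Haskell, how it turns independent standard normal draws into correlated shocks, and what one GBM step looks like. Random number generation is left out, and the list-based types and function names are assumptions.

```haskell
import Data.List (zip4)

type Vec = [Double]
type Mat = [[Double]]  -- row-major, square

-- Cholesky-Banachiewicz: lower-triangular L with L * transpose L == sigma.
-- Assumes sigma is symmetric positive definite; no checks are performed.
cholesky :: Mat -> Mat
cholesky sigma = foldl addRow [] (zip [0 ..] sigma)
  where
    n = length sigma
    addRow ls (i, row) = ls ++ [newRow]
      where
        -- Entries refer lazily to earlier entries of the same row.
        newRow = [entry j | j <- [0 .. n - 1]]
        entry j
          | j > i = 0
          | j == i = sqrt (row !! i - sum [x * x | x <- take i newRow])
          | otherwise =
              let lj = ls !! j
              in (row !! j - sum (zipWith (*) (take j newRow) (take j lj))) / (lj !! j)

-- Turn independent standard normal draws z into correlated shocks L z.
correlate :: Mat -> Vec -> Vec
correlate l z = [sum (zipWith (*) row z) | row <- l]

-- One GBM step of length dt: s * exp ((mu - var / 2) * dt + sqrt dt * shock),
-- where var is the asset's diagonal covariance entry.
gbmStep :: Double -> Vec -> Mat -> Vec -> Vec -> Vec
gbmStep dt mu sigma shocks prices =
  [ s * exp ((m - 0.5 * var) * dt + sqrt dt * e)
  | (s, m, var, e) <- zip4 prices mu variances shocks ]
  where
    variances = [row !! i | (i, row) <- zip [0 ..] sigma]
```

For a covariance of [[4, 2], [2, 5]] the factor is [[2, 0], [1, 2]], so the first asset's shock depends only on the first draw while the second mixes both draws.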

9. Web Dashboard

New page at /fund/quant (or standalone /quant). Not integrated with the invest page.

Sections:

Signal Dashboard — current signal readings per asset, color-coded by strength
Alpha Scores — composite alpha per asset after BL update
Kelly Weights — target allocation from Kelly optimizer
Trade List — current positions vs targets, suggested trades
MC Fan Chart — correlated GBM simulation from current state
Signal History — trailing IC per signal type, decay curves
Eval Metrics — live Sharpe, max drawdown, IC, hit rate

10. Implementation Phases

Phase 1: Data + Deterministic Signals (2-3 weeks)

Omni.Fund.Data.Market — Alpha Vantage or Twelve Data REST client
Omni.Fund.Data.Edgar — SEC EDGAR REST client (Form 4)
Omni.Fund.Data.Fred — FRED REST client
Omni.Fund.Quant.Universe — asset universe definition
Omni.Fund.Quant.Signal — momentum, mean-reversion, vol regime, macro regime
Omni.Fund.Quant.Eval — walk-forward IC measurement framework
Eval: IC > 0.02 on momentum signal for sector ETFs, 2020-2025
Daily data pull on systemd timer, writes signal JSON

No LLM. No portfolio management. Just data access + signal computation + eval.
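The IC threshold in the Phase 1 eval can be made concrete. The sketch below treats IC as the Spearman rank correlation between a signal's cross-section on one date and the assets' subsequent returns, averaged over rebalance dates. The document does not say whether rank or linear correlation is intended, so that choice is an assumption, as are the function names.

```haskell
-- Ranks starting at 1 by ascending value; ties share the average of their positions.
ranks :: [Double] -> [Double]
ranks xs = map rankOf xs
  where
    rankOf x =
      let below = length (filter (< x) xs)
          equal = length (filter (== x) xs)
      in fromIntegral below + (fromIntegral equal + 1) / 2

-- Pearson correlation; NaN if either input is constant.
pearson :: [Double] -> [Double] -> Double
pearson xs ys = cov / sqrt (varX * varY)
  where
    n = fromIntegral (length xs)
    mx = sum xs / n
    my = sum ys / n
    cov = sum (zipWith (\x y -> (x - mx) * (y - my)) xs ys)
    varX = sum [(x - mx) ^ (2 :: Int) | x <- xs]
    varY = sum [(y - my) ^ (2 :: Int) | y <- ys]

-- IC for one date: rank correlation between signal values and the forward
-- returns of the same assets, given in the same order.
informationCoefficient :: [Double] -> [Double] -> Double
informationCoefficient signal fwdRet = pearson (ranks signal) (ranks fwdRet)

-- Mean IC over a list of (signal, forward return) cross-sections, one per date.
meanIC :: [([Double], [Double])] -> Double
meanIC dates = sum ics / fromIntegral (length ics)
  where ics = [informationCoefficient s r | (s, r) <- dates]
```

In a walk-forward setup each cross-section pairs signals computed with data up to date t with returns realized strictly after t. With 11 sector ETFs a single date's IC is very noisy, so the 0.02 threshold only means something as an average over many dates.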

Phase 2: Alpha + Portfolio (2-3 weeks)

Linear algebra for BL and Kelly: pure Haskell per section 11, with hmatrix as the option to revisit if the universe grows
Omni.Fund.Quant.Alpha — BL update, correlated Kelly
Omni.Fund.Quant.Portfolio — position management, trade generation, correlated MC
Eval: walk-forward portfolio sim on 2020-2025 data, compare to equal-weight benchmark
Basic web dashboard at /fund/quant

Phase 3: Agent Signals (2-3 weeks)

Omni.Fund.Quant.Agent — Op programs for insider + macro agents
Eval: IC with agent signals vs without (must show improvement)
Confidence calibration test
Cross-signal coherence agent
Live paper-trading period (1-3 months, no real money)

Phase 4: Live Trading (ongoing)

Connect to brokerage API (Interactive Brokers? Alpaca?)
Automated trade execution with human approval gate
Weekly signal quality monitoring
Quarterly full re-eval on expanding test set

11. Design Decisions (Locked)

Linear algebra: Pure Haskell for v1. 16×16 matrix — performance irrelevant. Revisit hmatrix if/when universe scales to 100+ assets.
Market data: Twelve Data (free tier: 800 requests/day). Most generous free tier, and far more than the 16 daily series we pull.
Signal storage: one append-only JSONL file per day for v1. SQLite when we need historical queries for decay curves.
Rebalance frequency: Weekly. ~52 rebalances/year.
Asset universe: 16 instruments (11 sector ETFs + SPY/TLT/GLD/BTC-USD + VIX signal-only). Confirmed.
Benchmark: Equal-weight portfolio of the universe, rebalanced monthly. Not SPY — we want to measure signal value, not market exposure.

References

AlphaAgent: https://arxiv.org/abs/2502.16789
SEC EDGAR APIs: https://www.sec.gov/search-filings/edgar-application-programming-interfaces
FRED API: https://fred.stlouisfed.org/docs/api/fred/
Black-Litterman: https://pyportfolioopt.readthedocs.io/en/latest/BlackLitterman.html
hmatrix: https://hackage.haskell.org/package/hmatrix
Walk-forward analysis: https://www.interactivebrokers.com/campus/ibkr-quant-news/the-future-of-backtesting-a-deep-dive-into-walk-forward-analysis/
Twelve Data API: https://twelvedata.com/docs