Swarm with STM shared memory experiment

t-369.20 · WorkTask · Omni/Agent.hs
Parent: t-369 · Created 1 month ago · Updated 1 month ago

Description

Test a code-only agent swarm with STM-based shared-state coordination.

Context

Previous spikes validated:

  • t-369.17: Single code-only agent works
  • t-369.19: Parallel agents work (no communication)

Now test: Can agents coordinate via shared memory to solve problems better?

Hypothesis

For certain tasks, agents sharing state via STM will:

  1. Find better solutions than independent agents
  2. Find solutions faster by avoiding redundant work
  3. Exhibit emergent coordination behavior

Experiment 1: Collaborative Optimization

Task: Find maximum of a complex function with multiple local maxima.

data OptState = OptState
  { bestSolution :: TVar (Maybe (Double, Double, Double, Double))  -- (x, y, z, score)
  , exploredRegions :: TVar (Set (Int, Int, Int))  -- discretized regions
  , iterationCount :: TVar Int
  }

-- The function to optimize (has multiple local maxima)
-- f(x,y,z) = sin(x)*cos(y)*sin(z) + exp(-(x-2)^2 - (y-3)^2 - (z-1)^2)
-- Domain: x,y,z ∈ [0, 5]
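
For reference, a direct Haskell transcription of this objective, which the harness can use to verify agent-reported scores (a sketch; the name `f` is not fixed by the task):

-- Sketch: the objective from the comment above, transcribed directly.
f :: Double -> Double -> Double -> Double
f x y z =
  sin x * cos y * sin z
    + exp (negate ((x - 2) ** 2 + (y - 3) ** 2 + (z - 1) ** 2))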

optimizationSwarm :: Int -> IO (Maybe (Double, Double, Double, Double))
optimizationSwarm numAgents = do
  shared <- initOptState
  
  -- Spawn agents
  agents <- replicateM numAgents $ async $ optimizerAgent shared
  
  -- Run to completion (a fixed-time or convergence cutoff could replace this)
  mapM_ wait agents
  
  readTVarIO (bestSolution shared)

optimizerAgent :: OptState -> IO ()
optimizerAgent shared = loop 20
  where
    loop 0 = pure ()
    loop n = do
      -- Read current state
      (best, explored) <- atomically $ (,)
        <$> readTVar (bestSolution shared)
        <*> readTVar (exploredRegions shared)
      
      -- Think: generate code to search unexplored region
      code <- think model $ mconcat
        [ "Optimize f(x,y,z) = sin(x)*cos(y)*sin(z) + exp(-(x-2)^2-(y-3)^2-(z-1)^2)"
        , "\nDomain: x,y,z in [0,5]"
        , "\nCurrent best: " <> show best
        , "\nExplored regions: " <> show (Set.size explored)
        , "\nPick an UNEXPLORED region and find the maximum there."
        , "\nOutput: x,y,z,score"
        ]
      
      -- Execute
      result <- execute sandbox code
      
      -- Update shared state
      case parseResult result of
        Just (x, y, z, score) -> atomically $ do
          -- Mark region explored
          let region = (floor x, floor y, floor z)
          modifyTVar (exploredRegions shared) (Set.insert region)
          
          -- Update best if better
          currentBest <- readTVar (bestSolution shared)
          when (maybe True ((< score) . (\(_,_,_,s) -> s)) currentBest) $
            writeTVar (bestSolution shared) (Just (x, y, z, score))
        
        Nothing -> pure ()
      
      loop (n - 1)
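
The helpers assumed above could look roughly like this (a sketch: `initOptState` and `parseResult` are placeholders, and the sandbox output is assumed to be a single `x,y,z,score` line of Text):

-- Assumed imports: Data.Maybe (mapMaybe), Text.Read (readMaybe),
-- qualified Data.Text as Text, qualified Data.Set as Set.
initOptState :: IO OptState
initOptState =
  OptState <$> newTVarIO Nothing <*> newTVarIO Set.empty <*> newTVarIO 0

-- Parse a single "x,y,z,score" line; anything else is rejected.
parseResult :: Text -> Maybe (Double, Double, Double, Double)
parseResult out =
  case mapMaybe (readMaybe . Text.unpack . Text.strip) (Text.splitOn "," out) of
    [x, y, z, s] -> Just (x, y, z, s)
    _            -> Nothing

With four agents, a whole run is then `optimizationSwarm 4 >>= print`.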

Experiment 2: Collaborative Sudoku

Task: Solve Sudoku with agents working on different constraints.

data SudokuState = SudokuState
  { grid :: TVar (Array (Int,Int) (Maybe Int))
  , candidates :: TVar (Array (Int,Int) (Set Int))
  , solvedCount :: TVar Int
  }

sudokuSwarm :: [[Maybe Int]] -> IO [[Int]]
sudokuSwarm initial = do
  shared <- initSudokuState initial
  
  -- Spawn agents for different strategies
  agents <- sequence
    [ async $ rowAgent shared
    , async $ colAgent shared
    , async $ boxAgent shared
    , async $ nakedPairsAgent shared
    ]
  
  -- Wait until every cell is solved
  atomically $ do
    count <- readTVar (solvedCount shared)
    when (count < 81) retry
  
  -- Stop the strategy agents, then read out the solution
  mapM_ cancel agents
  readFinalGrid shared
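
One strategy agent might look like the sketch below (0-based 9×9 indexing and the Data.Array operators `(!)` and `(//)` are assumptions; this `rowAgent` prunes row candidates and places any naked singles it uncovers, blocking on `retry` when it cannot make progress):

rowAgent :: SudokuState -> IO ()
rowAgent shared = forever $ atomically $ do
  g  <- readTVar (grid shared)
  cs <- readTVar (candidates shared)
  let placedIn r = Set.fromList [ v | c <- [0 .. 8], Just v <- [g ! (r, c)] ]
      pruned  = [ ((r, c), (cs ! (r, c)) `Set.difference` placedIn r)
                | r <- [0 .. 8], c <- [0 .. 8], g ! (r, c) == Nothing ]
      changed = [ p | p@(pos, s) <- pruned, s /= cs ! pos ]
  when (null changed) retry        -- no row progress possible; wait for other agents
  writeTVar (candidates shared) (cs // changed)
  -- Place every cell that is now down to a single candidate
  forM_ [ (pos, v) | (pos, s) <- changed, [v] <- [Set.toList s] ] $ \(pos, v) -> do
    modifyTVar' (grid shared) (// [(pos, Just v)])
    modifyTVar' (solvedCount shared) (+ 1)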

Experiment 3: Collaborative Research

Task: Research a topic, with agents sharing discovered facts.

data ResearchState = ResearchState
  { facts :: TVar (Map Text [Text])  -- topic -> facts
  , questions :: TVar [Text]          -- unanswered questions
  , sources :: TVar (Set Text)        -- visited sources
  }

researchSwarm :: Text -> Int -> IO Text
researchSwarm topic numAgents = do
  shared <- initResearchState topic
  
  -- Spawn researcher agents
  researchers <- replicateM numAgents $ async $ researcherAgent shared
  
  -- Spawn synthesizer (waits for enough facts)
  synthesizer <- async $ synthesizerAgent shared
  
  -- Wait for the synthesis, then stop any researchers still running
  report <- wait synthesizer
  mapM_ cancel researchers
  pure report

researcherAgent :: ResearchState -> IO ()
researcherAgent shared = loop 10
  where
    loop 0 = pure ()
    loop n = do
      (knownFacts, openQs, visited) <- atomically $ (,,)
        <$> readTVar (facts shared)
        <*> readTVar (questions shared)
        <*> readTVar (sources shared)
      
      -- Think: what to research next
      code <- think model $ mconcat
        [ "Known facts: " <> show knownFacts
        , "\nOpen questions: " <> show openQs
        , "\nAlready visited: " <> show (Set.size visited)
        , "\nFind NEW facts. Output JSON: {facts: [...], source: ...}"
        ]
      
      result <- execute sandbox code
      
      -- Update shared state
      case parseFacts result of
        Just (newFacts, source) -> atomically $ do
          modifyTVar (sources shared) (Set.insert source)
          forM_ newFacts $ \(topic, fact) ->
            modifyTVar (facts shared) (Map.insertWith (++) topic [fact])
        Nothing -> pure ()
      
      loop (n - 1)
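
The synthesizer could be sketched as follows (the 20-fact threshold is an arbitrary placeholder, and it assumes `think model` can return the report directly in the same loose String/Text style as the surrounding sketches):

synthesizerAgent :: ResearchState -> IO Text
synthesizerAgent shared = do
  -- Block until the researchers have accumulated enough material
  knownFacts <- atomically $ do
    fs <- readTVar (facts shared)
    when (sum (map length (Map.elems fs)) < 20) retry
    pure fs
  -- Ask the model for a synthesis of everything gathered so far
  think model $ mconcat
    [ "Synthesize a concise report from these facts: " <> show knownFacts ]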

Benchmark Comparisons

For each experiment, compare:

| Config   | Description                 |
|----------|-----------------------------|
| Single   | 1 agent, N iterations       |
| Parallel | N agents, no communication  |
| Swarm    | N agents, STM shared state  |
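
For the optimization task, the three configurations might be driven and timed like this (`singleRun` and `parallelRun` are hypothetical wrappers mirroring `optimizationSwarm` without shared state; timing uses Data.Time.Clock):

data Config = Single | Parallel | Swarm deriving (Show)

runConfig :: Config -> Int -> IO (Maybe (Double, Double, Double, Double))
runConfig Single   n = singleRun (n * 20)   -- 1 agent, same total iteration budget
runConfig Parallel n = parallelRun n        -- n agents, each with private state
runConfig Swarm    n = optimizationSwarm n  -- n agents sharing OptState

benchmark :: Config -> Int -> IO ()
benchmark cfg n = do
  t0 <- getCurrentTime
  result <- runConfig cfg n
  t1 <- getCurrentTime
  putStrLn $ show cfg <> ": " <> show result <> " in " <> show (diffUTCTime t1 t0)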

Metrics

  • Solution quality (optimization score, Sudoku cells solved, fact count)
  • Wall time to solution
  • Total cost (tokens)
  • Efficiency (quality per dollar)
  • Coordination overhead (STM retries, wasted work)
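
These could be collected into a per-run record along these lines (field names are placeholders; STM retry counts would need explicit instrumentation, since GHC does not expose them per transaction):

data RunMetrics = RunMetrics
  { quality     :: Double          -- optimization score, cells solved, or fact count
  , wallTime    :: NominalDiffTime
  , tokenCost   :: Double          -- dollars spent on model calls
  , retryCount  :: Int             -- observed transaction re-runs (instrumented)
  , wastedIters :: Int             -- iterations that changed no shared state
  }

-- Quality per dollar, guarding against a zero-cost run
efficiency :: RunMetrics -> Double
efficiency m = quality m / max 1e-6 (tokenCost m)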

Deliverables

  1. Omni/Agent/Experiments/SwarmSTM.hs - Core swarm infrastructure
  2. Omni/Agent/Experiments/SwarmOptimization.hs - Optimization experiment
  3. Omni/Agent/Experiments/SwarmSudoku.hs - Sudoku experiment
  4. Omni/Agent/Experiments/SwarmResearch.hs - Research experiment
  5. Omni/Agent/Experiments/SWARM_RESULTS.md - Findings

Success Criteria

Swarm is "worth it" if:

  • [ ] Solution quality ≥ 10% better than parallel (for at least one task)
  • [ ] OR time to solution ≥ 30% faster than parallel
  • [ ] Coordination overhead < 20% of total cost

What We Learn

  1. When does shared state help? - Which task types benefit?
  2. Optimal swarm size - Diminishing returns?
  3. Communication patterns - What should be shared?
  4. Failure modes - Deadlocks? Livelocks? Starvation?

Files

  • Omni/Agent/Experiments/SwarmSTM.hs
  • Omni/Agent/Experiments/SwarmOptimization.hs
  • Omni/Agent/Experiments/SwarmSudoku.hs
  • Omni/Agent/Experiments/SwarmResearch.hs
  • Omni/Agent/Experiments/SWARM_RESULTS.md

Timeline (2)

🔄 [human] Open → InProgress · 1 month ago
🔄 [human] InProgress → Done · 1 month ago