Integrate RLM (Recursive Language Models) for long-context workflow analysis

t-345 · WorkTask · Omni/Agent/Engine.hs
Created 1 month ago · Updated 4 weeks ago

Description


Overview

Integrate Recursive Language Models (RLMs) from the paper "Recursive Language Models" (https://arxiv.org/html/2512.24601v1) to enable detailed analysis and review of large workflow markdown files that exceed normal context windows.

Paper Summary

RLMs solve the long-context problem by treating large inputs as "external environment objects" and using Python code generation to recursively decompose and process them. Key findings:

  • Performance: Double-digit improvements on 4 benchmarks at 10M+ token scales
  • Efficiency: Outperforms both base models AND typical workarounds (summary agents, retrieval)
  • Cost: Maintains similar costs to standard approaches
  • Insight: Performance degradation depends on task complexity scaling (constant/linear/quadratic), not just prompt length
  • Capability: Handles complex multi-step reasoning that breaks even frontier models
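
A minimal Haskell sketch of the recursive idea (the paper drives decomposition with generated Python; everything here — llmCall, the 4-chars-per-token estimate, fixed-size chunking — is an illustrative stand-in, not the paper's method):

import Data.Text (Text)
import qualified Data.Text as Text

-- Hypothetical provider call; stands in for a real LLM request.
llmCall :: Text -> Text -> IO Text
llmCall _question _doc = undefined

-- Crude token estimate: roughly 4 characters per token.
estimateTokens :: Text -> Int
estimateTokens t = Text.length t `div` 4

-- If the input fits the budget, answer directly; otherwise analyze each
-- chunk and recurse over the concatenated partial answers (terminates
-- provided each partial answer is shorter than its chunk).
recursiveAnalyze :: Int -> Text -> Text -> IO Text
recursiveAnalyze budget question doc
  | estimateTokens doc <= budget = llmCall question doc
  | otherwise = do
      let chunks = Text.chunksOf (budget * 4) doc
      partials <- traverse (llmCall question) chunks
      recursiveAnalyze budget question (Text.intercalate "\n" partials)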

Target Use Case

Enable AI-powered workflow code review for large markdown files. Instead of just detecting automation opportunities, provide detailed line-by-line feedback on existing workflows:

  • Line-specific suggestions: "line 3: consider adding error handling for fetch_sales_data() failures"
  • Cross-workflow pattern analysis: "this looks similar to your quarterly report workflow - could you extract a shared template?"
  • Optimization opportunities: "you're fetching sales data twice across workflows X and Y - consider caching"
  • Best practice enforcement: "missing retry logic on external API calls"

Integration Point

Hook into Omni/Agent/Engine.hs around the Provider.chat call (line ~1419). Add prompt size estimation and route large prompts through RLM decomposition instead of direct LLM calls.
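
A sketch of what the hook could look like (rlmThreshold and the Provider.chat signature shown here are illustrative only; recursiveAnalyze and estimateTokens are the stand-ins from the sketch above):

rlmThreshold :: Int
rlmThreshold = 100000  -- illustrative cutoff, in tokens

-- Route small prompts straight to the provider; send oversized ones
-- through RLM-style decomposition instead of a direct LLM call.
chatOrDecompose :: Provider -> Text -> Text -> IO Text
chatOrDecompose provider instruction doc
  | estimateTokens doc <= rlmThreshold =
      Provider.chat provider (instruction <> "\n\n" <> doc)
  | otherwise = recursiveAnalyze rlmThreshold instruction doc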

Benefits

  • Handle arbitrarily large workflow libraries without context limits
  • Maintain reasoning quality across massive document corpora
  • Enable detailed, contextual feedback on specific workflow components
  • Support cross-file analysis for optimization suggestions

Priority

P3 - Eventually valuable but not blocking current development. The recursive decomposition approach could be transformative for workflow analysis at scale.

Timeline (7)

💬 [human] · 4 weeks ago

Hybrid Context Strategy for Free Monad Agent

Building on the RLM concept, apply adaptive context to the entire message history:

Core Idea

Instead of just relying on recency, extend memory search to all messages:

1. Recency window: last 10-20 actual messages (temporal locality)
2. Semantic window: query searchChatHistorySemantic with the current user message to pull relevant older messages

Fusion Approach

import qualified Data.Set as Set

getAdaptiveContext :: UserId -> ChatId -> Text -> Int -> IO [Message]
getAdaptiveContext uid chatId currentMessage maxTokens = do
  -- always include recent messages (temporal locality)
  recent <- getRecentMessages uid chatId 20

  -- semantic retrieval for older relevant context
  semanticHits <- searchChatHistorySemantic currentMessage 10

  -- filter out any semantic hits already in the recent window
  let recentIds = Set.fromList (map cmId recent)
      oldButRelevant = filter (not . (`Set.member` recentIds) . cheId . fst) semanticHits

  -- budget-aware merge: recent first, then fill with semantic hits
  pure (budgetedMerge recent oldButRelevant maxTokens)
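
budgetedMerge is left undefined above; one possible shape, assuming a hypothetical toMessage converter from chat-history entries and a per-message messageCost token estimate:

import Data.List (sortOn)
import Data.Ord (Down (..))

-- Recent messages are kept unconditionally; semantic hits are appended
-- best-score-first while the token budget allows. toMessage and
-- messageCost are assumed helpers, not existing API.
budgetedMerge :: [Message] -> [(ChatHistoryEntry, Double)] -> Int -> [Message]
budgetedMerge recent hits maxTokens = recent ++ go base ranked
  where
    base   = sum (map messageCost recent)
    ranked = map (toMessage . fst) (sortOn (Down . snd) hits)
    go _ [] = []
    go spent (m : ms)
      | spent + messageCost m <= maxTokens = m : go (spent + messageCost m) ms
      | otherwise                          = go spent ms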

Design Questions

1. Similarity threshold: discard low-scoring hits? (e.g., cosine < 0.7)
2. Recency penalty: score decay for distant messages? score * recencyDecay(age)
3. Topic coherence: multiple unrelated threads could pull confusing context
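
For question 2, one concrete decay shape is an exponential half-life (the 30-day half-life and all numbers are illustrative):

-- Score multiplier that halves every halfLifeDays; a 30-day half-life
-- means a 60-day-old hit keeps 25% of its raw similarity score.
recencyDecay :: Double -> Double -> Double
recencyDecay halfLifeDays ageDays = 0.5 ** (ageDays / halfLifeDays)

adjustedScore :: Double -> Double -> Double
adjustedScore cosine ageDays = cosine * recencyDecay 30 ageDays
-- e.g. adjustedScore 0.8 60 = 0.8 * 0.25 = 0.2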

Hybrid Strategy

  • Auto-inject semantic context baseline via getAdaptiveContext
  • Agent can dig deeper via existing searchChatHistorySemantic tool if needed
  • Best of both worlds: automatic context enrichment + explicit retrieval capability

Integration Point

In Engine.hs, modify context building before Provider.chat call to use getAdaptiveContext instead of just recent messages.

Files to Modify

  • Omni/Agent/Memory.hs - add getAdaptiveContext
  • Omni/Agent/Engine.hs - use new context builder
  • Omni/Agent/Types.hs - maybe add AdaptiveContextConfig type
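
A possible shape for that config type, pulling together the knobs mentioned in these comments (field names and defaults are illustrative):

data AdaptiveContextConfig = AdaptiveContextConfig
  { accRecentCount   :: Int     -- temporal window size, in messages
  , accSemanticCount :: Int     -- max semantic hits to consider
  , accMinSimilarity :: Double  -- drop hits below this cosine score
  , accHalfLifeDays  :: Double  -- recency-decay half-life
  , accMaxTokens     :: Int     -- overall context budget
  }

defaultAdaptiveContextConfig :: AdaptiveContextConfig
defaultAdaptiveContextConfig = AdaptiveContextConfig 20 10 0.7 30 8000
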
💬 [human] · 4 weeks ago

Three-window context model:

1. *Temporal window* (passive, auto-injected): recent N messages, captures conversation continuity and 'what we're talking about right now'

2. *Semantic window* (passive, auto-injected): vector similarity search across all history, captures 'what have we discussed before that's relevant'

3. *Agentic window* (active, tool-driven): agent can explicitly query for more context when passive windows aren't enough, using existing searchChatHistorySemantic tool

The first two are automatic context enrichment; the third gives the agent control when it needs to dig deeper. Each can be independently tuned (window size, similarity threshold, recency decay).

💬 [human] · 4 weeks ago

Context Window Matrix

|         | Semantic | Temporal |
|---------|----------|----------|
| Active  | A        | B        |
| Passive | C        | D        |

A (active-semantic): Agent queries for related context via search_chat_history tool — agent decides WHAT to search for. ✅ exists

B (active-temporal): Agent requests specific past messages by time range ("show me messages from last Tuesday", "get the 5 messages before X"). Would need get_messages_by_time(start, end) or similar — see the sketch below. ❌ to implement

C (passive-semantic): Auto-injected similar messages via embedding search — system retrieves relevant older context based on current message. ❌ to implement (main hybrid addition)

D (passive-temporal): Auto-injected recent N messages (sliding window). ✅ exists

Goal: implement B and C to complete all four quadrants of contextual retrieval.
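
C is the getAdaptiveContext proposal above. For B, one sketch of the missing tool (getAllMessages and msgCreatedAt are assumed accessors; a real version would push the time filter into the store rather than scan in memory):

import Data.Time (UTCTime)

-- Quadrant B: explicit temporal retrieval by time range.
getMessagesByTime :: UserId -> ChatId -> UTCTime -> UTCTime -> IO [Message]
getMessagesByTime uid chatId start end =
  filter inRange <$> getAllMessages uid chatId
  where
    inRange m = msgCreatedAt m >= start && msgCreatedAt m <= end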

🔄 [human] Open → Done · 4 weeks ago