Integrate Recursive Language Models (RLMs) from the paper "Recursive Language Models" (https://arxiv.org/html/2512.24601v1) to enable detailed analysis and review of large workflow markdown files that exceed normal context windows.
RLMs address the long-context problem by treating large inputs as "external environment objects" and using Python code generation to recursively decompose and process them.
Enable AI-powered workflow code review for large markdown files. Instead of just detecting automation opportunities, provide detailed line-by-line feedback on existing workflows.
Hook into Omni/Agent/Engine.hs around the Provider.chat call (line ~1419). Add prompt size estimation and route large prompts through RLM decomposition instead of direct LLM calls.
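A minimal routing sketch, in Python for illustration (the real hook would live in Engine.hs around the Provider.chat call). The 100k-token budget and the 4-chars-per-token heuristic are assumptions, not values from the paper:

```python
# Sketch: estimate prompt size and route oversized prompts through
# RLM-style recursive decomposition instead of a direct LLM call.
# TOKEN_LIMIT and CHARS_PER_TOKEN are illustrative assumptions.

TOKEN_LIMIT = 100_000
CHARS_PER_TOKEN = 4

def estimate_tokens(prompt: str) -> int:
    """Cheap token estimate: roughly 4 characters per token."""
    return len(prompt) // CHARS_PER_TOKEN

def chat_routed(prompt: str, call_llm, decompose):
    """Route small prompts directly; decompose large ones recursively."""
    if estimate_tokens(prompt) <= TOKEN_LIMIT:
        return call_llm(prompt)
    # RLM-style: treat the prompt as an external object, split it,
    # process each chunk recursively, then combine the partial results.
    chunks = decompose(prompt)
    partials = [chat_routed(c, call_llm, decompose) for c in chunks]
    return call_llm("Combine these partial analyses:\n" + "\n".join(partials))
```

The recursion bottoms out once every chunk fits the budget, so arbitrarily large workflow files reduce to a tree of in-budget LLM calls.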
P3 - Eventually valuable but not blocking current development. The recursive decomposition approach could be transformative for workflow analysis at scale.
Three-window context model:
1. *Temporal window* (passive, auto-injected): recent N messages, captures conversation continuity and 'what we're talking about right now'
2. *Semantic window* (passive, auto-injected): vector similarity search across all history, captures 'what have we discussed before that's relevant'
3. *Agentic window* (active, tool-driven): agent can explicitly query for more context when passive windows aren't enough, using existing searchChatHistorySemantic tool
The first two are automatic context enrichment; the third gives the agent control when it needs to dig deeper. Each can be independently tuned (window size, similarity threshold, recency decay).
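The two passive windows can be sketched as follows (Python for illustration; ContextConfig, build_context, and the default values are hypothetical names standing in for whatever Engine.hs ends up using — the agentic window is the existing searchChatHistorySemantic tool and needs no code here):

```python
# Sketch of the two passive windows. All names and defaults
# (window size, similarity threshold, top-k) are illustrative.
from dataclasses import dataclass

@dataclass
class ContextConfig:
    temporal_window: int = 15        # recent N messages
    similarity_threshold: float = 0.7
    semantic_top_k: int = 5

def build_context(history, query, search_semantic, cfg=ContextConfig()):
    """Assemble passive context: semantic hits first, then the recent window."""
    temporal = history[-cfg.temporal_window:]
    semantic = [
        msg for msg, score in search_semantic(query, cfg.semantic_top_k)
        if score >= cfg.similarity_threshold and msg not in temporal
    ]
    return semantic + temporal
```

Keeping the three tunables in one config record is what makes each window independently adjustable.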
|           | Semantic | Temporal |
|-----------|----------|----------|
| Active    | A        | B        |
| Passive   | C        | D        |
A (active-semantic): Agent queries for related context via search_chat_history tool — agent decides WHAT to search for. ✅ exists
B (active-temporal): Agent requests specific past messages by time range ("show me messages from last tuesday", "get the 5 messages before X"). Would need get_messages_by_time(start, end) or similar. ❌ to implement
C (passive-semantic): Auto-injected similar messages via embedding search — system retrieves relevant older context based on current message. ❌ to implement (main hybrid addition)
D (passive-temporal): Auto-injected recent N messages (sliding window). ✅ exists
Goal: implement B and C to complete all four quadrants of contextual retrieval.
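Quadrant B amounts to two small retrieval tools. A sketch (the name get_messages_by_time comes from the table above; the message shape and the companion get_messages_before helper are assumptions):

```python
# Sketch of quadrant B (active-temporal retrieval).
# Message shape {"id", "ts", "text"} is an assumption.
from datetime import datetime

def get_messages_by_time(history, start: datetime, end: datetime):
    """Return messages whose timestamp falls in [start, end]."""
    return [m for m in history if start <= m["ts"] <= end]

def get_messages_before(history, anchor_id, n: int):
    """Return the n messages immediately preceding the anchor message."""
    idx = next(i for i, m in enumerate(history) if m["id"] == anchor_id)
    return history[max(0, idx - n):idx]
```

Exposing both as tools covers "show me messages from last tuesday" (a range query) and "get the 5 messages before X" (an anchor query).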
Hybrid Context Strategy for Free Monad Agent
Building on the RLM concept, apply adaptive context to the entire message history:
Core Idea
Instead of just relying on recency, extend memory search to all messages:
1. Recency window: the last 10-20 actual messages (temporal locality)
2. Semantic window: query searchChatHistorySemantic with the current user message to pull in relevant older messages

Fusion Approach
Design Questions
1. Similarity threshold: discard low-scoring hits? (e.g., cosine < 0.7)
2. Recency penalty: score decay for distant messages? `score * recencyDecay(age)`
3. Topic coherence: multiple unrelated threads could pull in confusing context

Hybrid Strategy
Combine the recency and semantic windows in a getAdaptiveContext helper; the agent can still fall back to the searchChatHistorySemantic tool if needed.

Integration Point
In Engine.hs, modify context building before the Provider.chat call to use getAdaptiveContext instead of just recent messages.

Files to Modify
getAdaptiveContext
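The modified call site could look like this (Python pseudocode for the Haskell change in Engine.hs; run_turn, get_adaptive_context, and the 0.7/15 defaults are hypothetical stand-ins for the planned getAdaptiveContext):

```python
# Sketch of the Engine.hs integration point: replace "recent messages
# only" with adaptive context just before the provider chat call.
# All names and defaults here are illustrative assumptions.

def get_adaptive_context(history, query, search_semantic,
                         n_recent=15, threshold=0.7):
    """Recency window plus deduplicated semantic hits above threshold."""
    recent = history[-n_recent:]
    related = [m for m, s in search_semantic(query)
               if s >= threshold and m not in recent]
    return related + recent

def run_turn(history, user_msg, provider_chat, search_semantic):
    # Before: context = history[-n_recent:]  (recency only)
    context = get_adaptive_context(history, user_msg, search_semantic)
    return provider_chat(context + [user_msg])
```

The change is deliberately local: only the context-building step before the chat call is swapped, so the rest of the engine loop is untouched.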