The current memory system (`Omni/Agent/Memory.hs`) is tightly coupled to the Telegram/Ava use case. This task is to refactor it into a backend-agnostic memory abstraction that lives in the core agent library, allowing any agent (Telegram, CLI, web, etc.) to plug in its preferred storage backend.
**Current state:** `Omni/Agent/Memory.hs` - 2000+ lines, sqlite-vss based.

**Backend interface** (`Omni/Agent/Memory/Backend.hs`):

```haskell
-- Backend interface - swappable storage layer
data MemoryBackend m = MemoryBackend
  { mbStore  :: MemoryEntry -> m ()
  , mbSearch :: Text -> Int -> m [MemoryEntry]  -- semantic search
  , mbRecent :: Int -> m [MemoryEntry]          -- temporal (most recent N)
  , mbByTag  :: [Tag] -> m [MemoryEntry]
  , mbPrune  :: m ()                            -- cleanup/consolidation
  , mbDelete :: MemoryId -> m ()
  }

-- Core memory entry type (backend-agnostic)
data MemoryEntry = MemoryEntry
  { meId          :: MemoryId
  , meContent     :: Text
  , meContext     :: Text  -- how/why learned
  , meTags        :: [Tag]
  , meCreatedAt   :: UTCTime
  , meAccessedAt  :: UTCTime
  , meAccessCount :: Int
  }
```
**Embedding abstraction** (`Omni/Agent/Memory/Embedding.hs`):

```haskell
-- Embedding provider interface
data EmbeddingProvider m = EmbeddingProvider
  { epEmbed      :: Text -> m (Vector Float)
  , epBatchEmbed :: [Text] -> m [Vector Float]
  , epDimensions :: Int
  }

-- Implementations
ollamaEmbedding :: OllamaConfig -> EmbeddingProvider IO
openaiEmbedding :: OpenAIConfig -> EmbeddingProvider IO
noopEmbedding   :: EmbeddingProvider IO  -- for testing, returns zeros
```
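For concreteness, a minimal sketch of `noopEmbedding`: fixed-size zero vectors, so tests run without a model. The 384 dimensionality is an arbitrary assumption.

```haskell
import qualified Data.Vector as V

noopEmbedding :: EmbeddingProvider IO
noopEmbedding = EmbeddingProvider
  { epEmbed      = \_ -> pure zeros
  , epBatchEmbed = \ts -> pure (zeros <$ ts)  -- one zero vector per input
  , epDimensions = dims
  }
  where
    dims  = 384  -- assumption: match the real provider's dimensionality
    zeros = V.replicate dims 0
```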
**Backend implementations:**

```haskell
-- In-memory (good for testing/dev)
inMemoryBackend :: IORef [MemoryEntry] -> EmbeddingProvider IO -> MemoryBackend IO

-- SQLite with sqlite-vss (current implementation, refactored)
sqliteBackend :: Connection -> EmbeddingProvider IO -> MemoryBackend IO

-- Future backends (not in scope for this task, but interface should support):
-- postgresBackend, qdrantBackend, neo4jBackend
```
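A minimal sketch of how `inMemoryBackend` could be implemented, assuming `Eq` instances on `Tag` and `MemoryId`. Entries are re-embedded on every search, which is quadratic but fine for tests:

```haskell
import Data.IORef (IORef, modifyIORef', readIORef)
import Data.List (sortOn)
import Data.Ord (Down (..))
import qualified Data.Vector as V

inMemoryBackend :: IORef [MemoryEntry] -> EmbeddingProvider IO -> MemoryBackend IO
inMemoryBackend ref ep = MemoryBackend
  { mbStore  = \entry -> modifyIORef' ref (entry :)
  , mbSearch = \query n -> do
      qv      <- epEmbed ep query
      entries <- readIORef ref
      evs     <- epBatchEmbed ep (map meContent entries)
      -- rank by cosine similarity against the query embedding
      pure . map fst . take n . sortOn (Down . snd) $
        zip entries (map (cosine qv) evs)
  , mbRecent = \n -> take n . sortOn (Down . meCreatedAt) <$> readIORef ref
  , mbByTag  = \tags -> filter (\e -> any (`elem` meTags e) tags) <$> readIORef ref
  , mbPrune  = pure ()  -- nothing to consolidate in memory
  , mbDelete = \mid -> modifyIORef' ref (filter ((/= mid) . meId))
  }
  where
    cosine a b
      | na == 0 || nb == 0 = 0  -- e.g. noopEmbedding's zero vectors
      | otherwise = V.sum (V.zipWith (*) a b) / (na * nb)
      where
        na = norm a
        nb = norm b
    norm = sqrt . V.sum . V.map (\x -> x * x)
```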
**Op layer** (`Omni/Agent/Op.hs` or a new `Omni/Agent/Op/Memory.hs`): add memory operations to the free monad DSL:

```haskell
data AgentOp next where
  -- ... existing ops ...
  Remember :: Text -> Text -> [Tag] -> (MemoryId -> next) -> AgentOp next
  Recall   :: Text -> Int -> ([MemoryEntry] -> next) -> AgentOp next
  Forget   :: MemoryId -> (() -> next) -> AgentOp next
```
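Assuming the program type is a free monad over `AgentOp` (`Control.Monad.Free`, with a `Functor` instance, e.g. via `DeriveFunctor`), matching smart constructors might look like the following; the document's `Op s` wrapper is elided here:

```haskell
import Control.Monad.Free (Free, liftF)

remember :: Text -> Text -> [Tag] -> Free AgentOp MemoryId
remember content ctx tags = liftF (Remember content ctx tags id)

recall :: Text -> Int -> Free AgentOp [MemoryEntry]
recall query n = liftF (Recall query n id)

forget :: MemoryId -> Free AgentOp ()
forget mid = liftF (Forget mid id)
```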
**Interpreter** (`Omni/Agent/Interpreter/Sequential.hs`): the interpreter needs a `MemoryBackend` in its config:

```haskell
data InterpreterConfig = InterpreterConfig
  { icProvider :: Provider
  , icMemory   :: MemoryBackend IO  -- NEW
  , icTools    :: [Tool]
  , ...
  }
```
Note that `mbStore` returns `m ()`, so the id is allocated up front (here by an IO-based `mkEntry` that generates a fresh `MemoryId` and timestamps):

```haskell
interpretOp :: InterpreterConfig -> AgentOp a -> IO a
interpretOp cfg (Remember content ctx tags k) = do
  entry <- mkEntry content ctx tags  -- allocates MemoryId, sets created/accessed times
  mbStore (icMemory cfg) entry
  pure (k (meId entry))
-- etc.
```
**Ava wiring** (`Omni/Agent/Telegram.hs`): Ava plugs in the sqlite backend:

```haskell
runAvaAgent :: TelegramConfig -> IO ()
runAvaAgent cfg = do
  conn <- initMemoryDb
  let backend = sqliteBackend conn (ollamaEmbedding defaultOllamaConfig)
  let interpreterCfg = defaultInterpreterConfig { icMemory = backend }
  -- ... run agent with this config
```
**Implementation plan:**

1. Define core types - Create Omni/Agent/Memory/Types.hs with backend-agnostic types
2. Define backend interface - Create Omni/Agent/Memory/Backend.hs with the typeclass/record
3. Extract embedding abstraction - Create Omni/Agent/Memory/Embedding.hs
4. Implement sqlite backend - Refactor current Memory.hs into Omni/Agent/Memory/Sqlite.hs
5. Implement in-memory backend - Create Omni/Agent/Memory/InMemory.hs for testing
6. Add Op instructions - Extend Omni/Agent/Op.hs with memory operations
7. Update interpreter - Handle new ops in Omni/Agent/Interpreter/Sequential.hs
8. Update Ava - Wire up sqlite backend in Omni/Agent/Telegram.hs
9. Add tests - Unit tests for backends, integration test for the full flow (see the sketch below)
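As a sketch for step 9 (hspec assumed; `mkEntry` is the IO helper from the interpreter section, and `Tag` is assumed to be `Text`-like under `OverloadedStrings`):

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.IORef (newIORef)
import Test.Hspec

spec :: Spec
spec = describe "inMemoryBackend" $ do
  it "stores entries and returns them from mbRecent" $ do
    ref <- newIORef []
    let be = inMemoryBackend ref noopEmbedding
    entry <- mkEntry "Ben prefers Haskell" "learned in chat" ["prefs"]
    mbStore be entry
    recent <- mbRecent be 1
    map meContent recent `shouldBe` ["Ben prefers Haskell"]
  it "deletes entries by id" $ do
    ref <- newIORef []
    let be = inMemoryBackend ref noopEmbedding
    entry <- mkEntry "ephemeral" "test" []
    mbStore be entry
    mbDelete be (meId entry)
    remaining <- mbRecent be 10
    length remaining `shouldBe` 0
```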
This task focuses on the storage/retrieval abstraction; context-window construction is deferred to a follow-up task. Design sketch for the context window (NOT in scope for this task):

```haskell
data ContextWindow = ContextWindow
  { cwMaxTokens           :: Int
  , cwRecencyWeight       :: Float  -- 0.0-1.0
  , cwSemanticWeight      :: Float  -- 0.0-1.0
  , cwSimilarityThreshold :: Float
  }

buildContext :: MemoryBackend m -> ContextWindow -> Text -> m [MemoryEntry]
```
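Purely illustrative (and, like the type above, out of scope here): one way `buildContext` could merge the two retrieval modes, scoring entries by weighted inverse rank. Assumes an `Ord MemoryId` instance; the candidate/result counts are arbitrary, and `cwMaxTokens`/`cwSimilarityThreshold` handling is elided:

```haskell
import Data.List (sortOn)
import qualified Data.Map.Strict as M
import Data.Ord (Down (..))

buildContext :: Monad m => MemoryBackend m -> ContextWindow -> Text -> m [MemoryEntry]
buildContext be cw query = do
  recent  <- mbRecent be 20
  similar <- mbSearch be query 20
  let -- weight w spread over a list by inverse rank (rank 1 scores highest)
      score w xs = M.fromListWith (+)
        [ (meId e, w / fromIntegral r) | (r, e) <- zip [1 :: Int ..] xs ]
      scores = M.unionWith (+) (score (cwRecencyWeight cw) recent)
                               (score (cwSemanticWeight cw) similar)
      byId   = M.fromList [ (meId e, e) | e <- recent ++ similar ]
  pure [ byId M.! i | (i, _) <- take 10 (sortOn (Down . snd) (M.toList scores)) ]
```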
**New files:**

- `Omni/Agent/Memory/Types.hs`
- `Omni/Agent/Memory/Backend.hs`
- `Omni/Agent/Memory/Embedding.hs`
- `Omni/Agent/Memory/Sqlite.hs`
- `Omni/Agent/Memory/InMemory.hs`

**Modified files:**

- `Omni/Agent/Op.hs` - add memory ops
- `Omni/Agent/Interpreter/Sequential.hs` - handle memory ops
- `Omni/Agent/Telegram.hs` - wire up backend
- `Omni/Agent/Memory.hs` - deprecate or re-export from new modules

**Prompt IR**

Context window = dynamically constructed Bayesian prior for each inference. The IR represents this prior as structured, labeled, optimization-aware data.
Key principles:

1. Data, not strings - Sections are structured with metadata
2. Optimization-aware - Supports compression, composition, analysis
3. Traceable - Every section has provenance
4. Composable - Explicit composition semantics (Bayesian)
```
ContextRequest  -- Agent specifies intent
    ↓ (hydrate)
PromptIR        -- Structured, labeled, optimization-ready
    ↓ (compile)
Prompt          -- Flat messages for API
    ↓ (LLM call)
Response
```
```haskell
-- Section with full optimization metadata
data Section = Section
  { secId      :: Text           -- Unique ID for tracing
  , secLabel   :: Text           -- Display label
  , secSource  :: SectionSource  -- Provenance
  , secContent :: Text           -- The actual content
    -- Token/Rate metrics
  , secTokens    :: Int        -- Current token count
  , secMinTokens :: Maybe Int  -- Minimum viable (for compression)
    -- Relevance/Priority metrics
  , secPriority  :: Priority       -- For budget trimming
  , secRelevance :: Maybe Float    -- 0.0-1.0, task-specific relevance
  , secRecency   :: Maybe UTCTime  -- When this info was current
    -- For Bayesian composition
  , secCompositionMode :: CompositionMode
    -- For analysis/caching
  , secEmbedding :: Maybe (Vector Float)  -- Precomputed embedding
  , secHash      :: Maybe Text            -- Content hash for dedup/caching
  }

-- How this section composes with others (Bayesian semantics)
data CompositionMode
  = Hierarchical  -- Hyperprior (system prompt, base instructions)
  | Constraint    -- Product-of-experts (must satisfy)
  | Additive      -- Mixture (adds info, can be dropped)
  | Contextual    -- Bayesian update (observation shifting posterior)

-- Section provenance
data SectionSource
  = SourceStatic Text       -- Static (file name or "code")
  | SourceTemporal          -- Recent conversation
  | SourceSemantic Float    -- Semantic search (with relevance score)
  | SourceKnowledge         -- Long-term memory/facts
  | SourceState Text        -- Runtime state ("project", "time")
  | SourceConditional Text  -- Conditional on auth/config

data Priority = Critical | High | Medium | Low

-- Tool definition (part of IR, affects behavior significantly)
data ToolDef = ToolDef
  { tdName        :: Text
  , tdDescription :: Text
  , tdSchema      :: Value
  , tdPriority    :: Priority
  , tdEmbedding   :: Maybe (Vector Float)
  }

-- The full IR
data PromptIR = PromptIR
  { pirSections    :: [Section]
  , pirTools       :: [ToolDef]
  , pirObservation :: Text
  , pirMeta        :: PromptMeta
  }

-- Metadata for tracing and optimization
data PromptMeta = PromptMeta
  { pmTotalTokens      :: Int
  , pmBudget           :: TokenBudget
  , pmStrategy         :: ContextStrategy
  , pmTimestamp        :: UTCTime
  , pmCompressionRatio :: Maybe Float
  , pmEstimatedEntropy :: Maybe Float  -- For risk detection (t-455)
  , pmCacheHit         :: Bool
  }

-- Budget with principled allocation
data TokenBudget = TokenBudget
  { tbTotal        :: Int
  , tbReserveRatio :: Float  -- Keep N% headroom (t-432)
  , tbAllocation   :: BudgetAllocation
  }

data BudgetAllocation
  = FixedRatios { baSystem, baContext, baObservation :: Float }
  | InformationWeighted  -- Allocate by information content
  | RelevanceWeighted    -- Allocate by task relevance

data ContextRequest = ContextRequest
  { crObservation :: Text             -- Current user input
  , crGoal        :: Maybe Text       -- What agent is trying to do
  , crStrategy    :: ContextStrategy  -- How to hydrate
  , crBudget      :: TokenBudget
  }

data ContextStrategy = ContextStrategy
  { csTemporalWindow    :: Int    -- Recent N messages
  , csSemanticLimit     :: Int    -- Max semantic results
  , csSemanticThreshold :: Float  -- Min similarity (0.0-1.0)
  , csRecencyDecay      :: Float  -- Decay factor (e.g., 0.995)
  , csIncludeKnowledge  :: Bool   -- Include long-term facts
  }
```
**Composition** (t-398):

```haskell
compose :: PromptIR -> PromptIR -> PromptIR
-- Hierarchical sections from first IR are hyperpriors
-- Constraint sections are AND'd (product of experts)
-- Additive sections are collected (mixture)
-- Contextual sections are sequenced (Bayesian updates)
```
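A hedged sketch of one way to realize these semantics: order sections by composition mode (hyperpriors first, then constraints, then contextual updates, then droppable additive material) and union tools by name. True constraint AND-ing and sequential Bayesian updating are approximated by ordering:

```haskell
import Data.Function (on)
import Data.List (nubBy, sortOn)

compose :: PromptIR -> PromptIR -> PromptIR
compose a b = PromptIR
  { pirSections    = sortOn modeRank (pirSections a ++ pirSections b)
  , pirTools       = nubBy ((==) `on` tdName) (pirTools a ++ pirTools b)
  , pirObservation = pirObservation b  -- the second IR carries the live observation
  , pirMeta        = pirMeta b
  }
  where
    -- sortOn is stable, so within a mode the first IR's sections keep precedence
    modeRank s = case secCompositionMode s of
      Hierarchical -> 0 :: Int
      Constraint   -> 1
      Contextual   -> 2
      Additive     -> 3
```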
**Compression** (t-399):

```haskell
compress :: TokenBudget -> PromptIR -> IO PromptIR
-- Rank sections by (priority × relevance × recency)
-- Drop/summarize lowest-value until under budget
-- Track compression ratio in metadata
```
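A minimal drop-only sketch (summarization, recency weighting, and ratio bookkeeping elided; assumes derived `Eq` on `Priority` and `CompositionMode`):

```haskell
import Data.List (sortOn)
import Data.Maybe (fromMaybe)

compress :: TokenBudget -> PromptIR -> IO PromptIR
compress budget ir = pure (ir { pirSections = go (pirSections ir) })
  where
    limit    = tbTotal budget
    total ss = sum (map secTokens ss)
    -- repeatedly drop the lowest-value droppable section until under budget
    go ss
      | total ss <= limit = ss
      | otherwise = case sortOn value (filter droppable ss) of
          []          -> ss  -- nothing droppable; a real version would summarize
          (worst : _) -> go (filter ((/= secId worst) . secId) ss)
    droppable s = secCompositionMode s == Additive && secPriority s /= Critical
    value s = pScore (secPriority s) * fromMaybe 1 (secRelevance s)
    pScore p = case p of
      Critical -> 4 :: Float
      High     -> 3
      Medium   -> 2
      Low      -> 1
```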
**Analysis** (t-397):

```haskell
estimateImpact :: Section -> IO Float
-- Use embedding magnitude as proxy for information content

equivalent :: Float -> PromptIR -> PromptIR -> IO Bool
-- Check behavioral equivalence via embedding similarity
```
| Research Task | IR Feature |
|--------------|------------|
| t-398 (Composition) | CompositionMode, compose |
| t-399 (Compression) | secMinTokens, compress |
| t-397 (Analysis) | secEmbedding, estimateImpact |
| t-432 (Compaction) | secRelevance, secRecency, tbReserveRatio |
| t-455 (Best-of-N) | pmEstimatedEntropy |
```haskell
avaPromptIR = PromptIR
  { pirSections =
      [ Section "base" "## Core Instructions" (SourceStatic "telegram-system.md")
          basePrompt 500 Nothing Critical Nothing Nothing Hierarchical Nothing Nothing
      , Section "time" "## Current Date and Time" (SourceState "clock")
          "Saturday, Jan 25..." 20 Nothing High Nothing (Just now) Contextual Nothing Nothing
      , Section "project" "## Current Project" (SourceState "project")
          "Working dir: /home/ben/omni/live..." 100 Nothing High Nothing Nothing Constraint Nothing Nothing
      , Section "user" "## Current User" (SourceState "user")
          "You are talking to: Ben" 15 Nothing High Nothing Nothing Contextual Nothing Nothing
      , Section "memories" "## What you know about this user" SourceKnowledge
          "Ben prefers..." 200 (Just 50) Medium (Just 0.8) Nothing Additive Nothing Nothing
      , Section "recent" "## Recent conversation" SourceTemporal
          "[10:30] User: ..." 800 (Just 200) High Nothing (Just now) Contextual Nothing Nothing
      , Section "semantic" "## Related past messages" (SourceSemantic 0.82)
          "[Jan 15] ..." 300 Nothing Medium (Just 0.82) (Just oldTime) Additive Nothing Nothing
      ]
  , pirTools = [skillTool, readFileTool, runBashTool, rememberTool, recallTool, ...]
  , pirObservation = "What were we discussing about the agent architecture?"
  , pirMeta = PromptMeta 1965 defaultBudget defaultStrategy now Nothing Nothing False
  }
```
| Commit | File | Lines | Purpose |
|--------|------|-------|---------|
| 21f111e | Omni/Agent/Prompt/IR.hs | 545 | Core IR types with optimization metadata |
| 69ad87d | Omni/Agent/Prompt/Hydrate.hs | 533 | ContextRequest → PromptIR |
| 32142b6 | Omni/Agent/Prompt/Compile.hs | 436 | PromptIR → CompiledPrompt |
| d472b64 | Omni/Agent/Op.hs | +30 | Infer now takes ContextRequest, added InferRaw |
| d472b64 | Sequential.hs | +178 | Interpreter handles hydration/compilation |
```
infer(model, ContextRequest)
        ↓
    [hydrate]
        ↓
PromptIR (sections + tools + metadata)
        ↓
    [compile]
        ↓
CompiledPrompt (messages + tools)
        ↓
    [LLM call]
        ↓
Response
```
For new agents (dynamic context):

```haskell
infer :: Model -> ContextRequest -> Op s Response
```

For legacy/raw prompts:

```haskell
inferRaw :: Model -> Prompt -> Op s Response
```
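Usage from an agent program might then look like this sketch (`defaultStrategy` as above; `defaultBudget` and `responseText` are assumed helpers):

```haskell
replyTo :: Model -> Text -> Op s Text
replyTo model userMsg = do
  resp <- infer model ContextRequest
    { crObservation = userMsg
    , crGoal        = Nothing
    , crStrategy    = defaultStrategy
    , crBudget      = defaultBudget
    }
  pure (responseText resp)  -- responseText: assumed accessor on Response
```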
The Prompt IR pipeline is now fully integrated into the interpreter:
```
[Agent Program]
    │
    ├── infer(model, ContextRequest) ──→ hydrate → compile → LLM
    │
    └── inferRaw(model, Prompt) ──→ (legacy path) → LLM
```
Op.hs:

- `Infer :: Model -> ContextRequest -> ...` (new API)
- `InferRaw :: Model -> Prompt -> ...` (legacy API)
- `ContextRequest`, `ContextStrategy`, `TokenBudget` types

Sequential.hs:

- `seqHydrationConfig :: Maybe HydrationConfig` in `SeqConfig`
- `handleBudget`, `InferResult`, and `compiledToMessages` helpers

Programs/*.hs:

- `infer` → `inferRaw` (no semantic change)

To use the new context pipeline, Telegram.hs needs:
1. A HydrationConfig with Memory-backed context sources
2. Call sites using infer + ContextRequest
The infrastructure is ready; this is now an integration task.
| Commit | File | Lines | Purpose |
|--------|------|-------|---------|
| 21f111e | Prompt/IR.hs | 545 | Core IR types (Section, ToolDef, CompositionMode, ContextRequest) |
| 69ad87d | Prompt/Hydrate.hs | 533 | ContextRequest → PromptIR via context sources |
| 32142b6 | Prompt/Compile.hs | 436 | PromptIR → CompiledPrompt with budget enforcement |
| d472b64 | Op.hs | +30 | Infer takes ContextRequest, added InferRaw |
| d472b64 | Sequential.hs | +178 | Interpreter handles hydration/compilation |
| 3dbf484 | Programs/*.hs | - | Migrated to inferRaw |
| d844e27 | Prompt/MemorySources.hs | 141 | Context sources backed by Memory.hs |
| eca7a0d | Prompt/MemorySources.hs | +112 | buildHydrationConfig helper |
```
ContextRequest (intent)
    ↓
[hydrate] ← MemorySources (temporal, semantic, knowledge)
    ↓
PromptIR (labeled sections + tools + metadata)
    ↓
[compile] ← budget enforcement, priority ordering
    ↓
CompiledPrompt (flat messages)
    ↓
LLM call
```
To enable in Telegram.hs:
```haskell
import Omni.Agent.Prompt.MemorySources as MS

-- Build hydration config
let hydrationCfg = MS.buildHydrationConfig
      systemPrompt
      tools
      [MS.mkProjectSection proj dir, MS.mkTimeSection now tz]
      userId
      chatId

-- Add to SeqConfig
let seqConfig = (Seq.defaultSeqConfig provider seqTools)
      { Seq.seqHydrationConfig = Just hydrationCfg }
```
The infrastructure is complete. Full migration requires:
1. Modify OpAgent.runAgent to use infer instead of inferRaw, OR
2. Create a new IR-native agent program
The current system works (using inferRaw + manual context), and the new IR system is ready for incremental adoption.
Follow-up task created: t-480 (Integrate Prompt IR into Ava with observability). This covers wiring up the HydrationConfig in Telegram.hs and adding tracing to validate the system in real usage.
**Design Decisions (2026-01-24)**

**Core Concept**

Context window = dynamically constructed prior for each inference, not an append-only log. Like Bayesian priors - optimized for the current task, not necessarily chronological.

**Decisions Made**
1. Granularity: Per-inference (every LLM call gets freshly constructed context)
2. Initial Sources:
3. Architecture: Option B - Inside Op layer
   (`Infer` takes `ContextRequest` instead of `Prompt`)
4. Style: Prefer data over functions/callbacks
5. Reference: Ava's current implementation is the target behavior to support
**Open: Prompt IR Design**
Need structured intermediate representation: