Add context compaction when approaching token limit

t-392 · WorkTask
Created 1 month ago · Updated 1 month ago

Description


Add context compaction when approaching token limit.

Problem

Long-running agents can exceed the model's context limit. Currently, the only safeguard is a token guardrail that kills the run outright.

Solution

1. When approaching the limit (e.g. 80% of the context window), compact the conversation history (see the threshold sketch below).
2. Summarize older messages; keep recent ones verbatim.
3. Continue the run with the compacted context.
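A minimal sketch of the threshold check, in Haskell to match the codebase. `estimateTokens`, `needsCompaction`, and the 4-chars-per-token heuristic are assumptions for illustration; the real accounting should reuse whatever the existing guardrail counts.

```haskell
-- Sketch only: estimateTokens, needsCompaction, and the 4-chars-per-token
-- heuristic are assumed names, not identifiers from the codebase.
compactionThreshold :: Double
compactionThreshold = 0.8  -- compact at 80% of the context window

-- Crude stand-in for whatever token accounting the existing guardrail uses.
estimateTokens :: [String] -> Int
estimateTokens msgs = sum (map length msgs) `div` 4

needsCompaction :: [String] -> Int -> Bool
needsCompaction msgs contextLimit =
  fromIntegral (estimateTokens msgs)
    >= compactionThreshold * fromIntegral contextLimit
```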

Architecture (IMPORTANT)

Use the NEW Op-based architecture, NOT Engine.hs (legacy):

| Component     | File                                 | Purpose                           |
|---------------|--------------------------------------|-----------------------------------|
| Agent program | Omni/Agent/Programs/Agent.hs         | Main agent loop, manages messages |
| Interpreter   | Omni/Agent/Interpreter/Sequential.hs | Executes Op programs              |
| Op DSL        | Omni/Agent/Op.hs                     | Free monad operations             |
| State         | AgentState in Programs/Agent.hs      | Contains asMessages               |
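For orientation, free-monad Op layers conventionally look like the sketch below. This is a generic illustration of the pattern, not the actual contents of Omni/Agent/Op.hs; the constructors, the `Message` placeholder, and the use of the `free` package are all assumptions.

```haskell
{-# LANGUAGE DeriveFunctor #-}
-- Generic free-monad shape, NOT the actual contents of Omni/Agent/Op.hs.
-- Every name below (OpF, Infer, Log, Message) is an illustrative assumption.
import Control.Monad.Free (Free, liftF)

type Message = String  -- placeholder for the codebase's real message type

data OpF next
  = Infer [Message] (Message -> next)  -- one LLM call over a history
  | Log String next                    -- example of a side-effecting op
  deriving (Functor)

type Op = Free OpF

-- Smart constructor, as free-monad DSLs conventionally provide.
infer :: [Message] -> Op Message
infer history = liftF (Infer history id)
```

The interpreter in Omni/Agent/Interpreter/Sequential.hs would then give each constructor its effectful meaning, which is what keeps the agent program itself pure and testable.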

Implementation Approach

1. In Programs/Agent.hs, check the token count before each inference.
2. If approaching the limit, call a compaction function.
3. Compaction summarizes the older messages using the LLM and keeps the most recent N verbatim.
4. Update asMessages with the compacted history.
5. Continue the loop (a sketch of steps 1-2 and 4-5 follows this list).
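Putting the pieces together, the check-then-compact step might slot into the loop as below. The real `agentLoop` signature is unknown; `AgentState` is reduced here to just its message history, `contextLimit` is an assumed constant, and `compactMessages` is sketched under Key Functions.

```haskell
-- Sketch of the integration, reusing needsCompaction, Op, and infer from
-- the sketches above. Tool handling and termination are omitted.
data AgentState = AgentState { asMessages :: [Message] }

contextLimit :: Int
contextLimit = 128000  -- assumed context window, in tokens

agentLoop :: AgentState -> Op AgentState
agentLoop st = do
  -- Steps 1-2: compact before inference if near the token budget.
  msgs <-
    if needsCompaction (asMessages st) contextLimit
      then compactMessages (asMessages st)  -- sketched under Key Functions
      else pure (asMessages st)
  reply <- infer msgs
  -- Steps 4-5: persist the (possibly compacted) history and keep going.
  agentLoop st { asMessages = msgs ++ [reply] }
```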

Key Functions to Modify

  • Omni/Agent/Programs/Agent.hs:runAgent - main loop
  • Omni/Agent/Programs/Agent.hs:agentLoop - iteration logic
  • May need a new compactMessages helper (a possible shape is sketched below)
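A possible shape for the compactMessages helper: summarize everything except the most recent N messages with one LLM call, then splice the summary in front of the verbatim tail. The `keepVerbatim` count and the prompt wording are assumptions.

```haskell
-- Hypothetical helper: keepVerbatim and the summarization prompt are
-- tunable assumptions; infer is the Op sketched earlier.
keepVerbatim :: Int
keepVerbatim = 10  -- most recent messages preserved verbatim

compactMessages :: [Message] -> Op [Message]
compactMessages msgs
  | length msgs <= keepVerbatim = pure msgs
  | otherwise = do
      let (older, recent) = splitAt (length msgs - keepVerbatim) msgs
      -- One LLM call summarizes the older prefix into a single message.
      summary <- infer ("Summarize this conversation so far:" : older)
      pure (summary : recent)
```

On later iterations the summary stands in for the older prefix, so asMessages stays bounded even on very long runs.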

DO NOT modify Engine.hs - it's legacy code being phased out.

Timeline (1)

🔄 [human] Open → Done · 1 month ago