Add context compaction when approaching token limit.
Long-running agents can exceed the model's context limit. Currently a token guardrail kills the run outright when the limit is reached.
1. When approaching the limit (e.g. at 80% usage), compact the conversation history
2. Summarize older messages, keep recent ones verbatim (sketched below)
3. Continue the run with the compacted context
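A minimal sketch of the strategy in step 2, assuming a `Message` type with a role and text (the real type lives in the Omni/Agent modules) and an abstract summarizer passed in as a function:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)

-- Stand-ins for the real types in Omni/Agent (shape assumed for illustration).
data Role = System | User | Assistant
data Message = Message { msgRole :: Role, msgText :: Text }

-- Split the history at a keep-window, summarize the older prefix with one
-- call to the supplied summarizer, and splice the summary back in as a
-- single system message ahead of the verbatim tail.
compactHistory
  :: Monad m
  => ([Message] -> m Text)  -- summarizer, e.g. an LLM call (assumed interface)
  -> Int                    -- number of recent messages kept verbatim
  -> [Message]
  -> m [Message]
compactHistory summarize keepN msgs
  | length msgs <= keepN = pure msgs
  | otherwise = do
      let (older, recent) = splitAt (length msgs - keepN) msgs
      summary <- summarize older
      pure (Message System ("Summary of earlier conversation: " <> summary) : recent)
```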
Use the NEW Op-based architecture, NOT Engine.hs (legacy):
| Component | File | Purpose |
|-----------|------|---------|
| Agent program | Omni/Agent/Programs/Agent.hs | Main agent loop, manages messages |
| Interpreter | Omni/Agent/Interpreter/Sequential.hs | Executes Op programs |
| Op DSL | Omni/Agent/Op.hs | Free monad operations |
| State | AgentState in Programs/Agent.hs | Contains asMessages |
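The actual constructors in Omni/Agent/Op.hs aren't shown here, so the following is only an illustrative sketch of what the free-monad DSL named in the table might look like, using `Control.Monad.Free` from the `free` package and the `Message` type from the sketch above; `Infer` and `CountTokens` are hypothetical names:

```haskell
{-# LANGUAGE DeriveFunctor #-}

import Control.Monad.Free (Free, liftF)

-- Hypothetical Op constructors in the free-monad style the table describes;
-- an interpreter such as Sequential.hs would pattern-match on these.
data OpF next
  = Infer [Message] (Text -> next)       -- one model call over a message list
  | CountTokens [Message] (Int -> next)  -- estimate current token usage
  deriving (Functor)

type Op = Free OpF

infer :: [Message] -> Op Text
infer ms = liftF (Infer ms id)

countTokens :: [Message] -> Op Int
countTokens ms = liftF (CountTokens ms id)
```

Keeping token counting as an Op keeps the agent program pure and leaves the choice of counting strategy to the interpreter.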
1. In Programs/Agent.hs, check the token count before each inference
2. If it is approaching the limit, call a compaction function
3. Compaction summarizes the older messages with an LLM call and keeps the most recent N verbatim
4. Update asMessages with the compacted history
5. Continue the loop (see the sketch after this list)
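Putting the steps together against the hypothetical Op sketch above: a minimal, assumed shape for the loop body, where `AgentState`/`asMessages` are the names from the table and `maxTokens`, `keepN`, and the function names are illustrative:

```haskell
{-# LANGUAGE OverloadedStrings #-}

-- Stand-in for the real AgentState in Programs/Agent.hs (field name from
-- the table; anything else it contains is omitted here).
data AgentState = AgentState { asMessages :: [Message] }

-- Step 3: summarize the older prefix via an inference call, keep the most
-- recent keepN messages verbatim.
compactMessages :: [Message] -> Op [Message]
compactMessages msgs
  | length msgs <= keepN = pure msgs
  | otherwise = do
      let (older, recent) = splitAt (length msgs - keepN) msgs
      summary <- infer (Message System "Summarize this conversation so far." : older)
      pure (Message System ("Summary: " <> summary) : recent)
  where
    keepN = 10  -- illustrative keep-window

-- Steps 1, 2, 4, 5: check usage before inference, compact when at or past
-- 80% of the limit, update the message history, and carry on.
agentStep :: AgentState -> Op AgentState
agentStep st = do
  used <- countTokens (asMessages st)
  msgs <-
    if used * 100 >= maxTokens * 80
      then compactMessages (asMessages st)
      else pure (asMessages st)
  reply <- infer msgs
  pure st { asMessages = msgs ++ [Message Assistant reply] }
  where
    maxTokens = 128000 :: Int  -- illustrative context limit
```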
- Omni/Agent/Programs/Agent.hs:runAgent - main loop
- Omni/Agent/Programs/Agent.hs:agentLoop - iteration logic
- compactMessages helper