t-793 - omni

Timeline (8)

🔄[human]Open → InProgress4 days ago

💬[human]4 days ago

Proposed design (MVP): use existing agent checkpoint/resume plumbing to persist full AgentState per persistent session and auto-restore on restart.

Why this approach:

Replaying DB messages is lossy (assistant entries are summaries only) and can re-trigger tool side effects.
AgentState already contains full conversation/messages and is serializable.

Design: 1) Persist a stable session snapshot every completed turn

In Omni.Agent.Programs.Agent.runAgent, emit a final checkpoint name after each turn completes (success/error/cancel/max-iter paths).
This writes with full AgentState.

2) Auto-restore on persistent process startup

In agentd persistent wrapper (renderAgentExecScript), set per-agent checkpoint dir:

If exists, pass plus to .
If missing/corrupt, agent logs warning and starts fresh (current behavior fallback).

3) Keep lifecycle semantics simple

Restart should preserve prior conversation automatically.
Remove/purge should also remove checkpoint directory for that agent.

4) Hardening (follow-up)

Validate checkpoint model compatibility ( vs runtime model) before restore; if mismatch, ignore checkpoint and start fresh.
Optional to bypass restore intentionally.

Scope/files likely touched:

Omni/Agent/Programs/Agent.hs (final checkpoint)
Omni/Agentd/Daemon.hs (persistent wrapper args + cleanup)
Omni/Agent.hs (optional resume compatibility guard)
tests in Agent/Agentd modules for checkpoint+resume wiring

Acceptance test:

send message A -> response
restart persistent agent
send message B that depends on A context
verify response references A (context survived restart).

💬[human]4 days ago

Design note (corrected):

Proposed MVP: use existing checkpoint/resume plumbing to persist full AgentState for each persistent agent and auto-restore on restart.

Why:

Replaying DB messages is lossy (assistant entries are summaries) and may re-trigger tool side effects.
AgentState already contains full conversation messages and is serializable.

Plan: 1) Persist stable snapshot each completed turn

In Omni.Agent.Programs.Agent.runAgent, write a final checkpoint named session-latest after each turn completes (success/error/cancel/max-iter).
Snapshot path becomes <checkpoint-dir>/session-latest.json.

2) Auto-restore on startup in persistent wrapper

In renderAgentExecScript, set per-agent checkpoint dir under $STATE_DIR/checkpoints/$AGENTD_AGENT_NAME.
Pass --checkpoint-dir always.
If session-latest.json exists, pass --resume with that file.
If missing/corrupt, fallback to fresh state (existing behavior).

3) Lifecycle

restart preserves context automatically.
remove/purge should also delete that agent checkpoint dir.

4) Hardening (follow-up)

Validate cpModel vs runtime model before restore.
Optional restart fresh mode to bypass restore intentionally.

Likely files:

Omni/Agent/Programs/Agent.hs
Omni/Agentd/Daemon.hs
Omni/Agent.hs (optional model-compat guard)

Acceptance test:

send prompt A
restart agent
send prompt B that depends on A
response should reference A (context survived restart)

💬[human]4 days ago

Implemented restart-safe persistent session restore in commit a711c7bf.

What changed:

Omni/Agent/Programs/Agent.hs: runAgent now checkpoints 'session-latest' at end of each turn on both success and budget-exhausted paths.
Omni/Agentd/Daemon.hs: persistent wrapper now creates per-agent checkpoint dir, always passes --checkpoint-dir, and conditionally passes --resume session-latest.json when present.
Omni/Agentd/Daemon.hs: rm/purge runtime cleanup now removes per-agent checkpoint directory.
Omni/Agentd/SPEC.md: documented checkpoint/resume behavior.

Validation:

typecheck.sh Omni/Agent/Programs/Agent.hs
typecheck.sh Omni/Agentd/Daemon.hs
bild --test Omni/Agent/Programs/Agent.hs
bild --test Omni/Agentd/Daemon.hs

Manual acceptance test (live): 1) create/start temporary persistent agent resume-smoke 2) send prompt A: remember APPLE-739 3) observed checkpoint files: init.json + session-latest.json under ~/.local/state/agentd-agents/checkpoints/resume-smoke/ 4) restart agent; verified process argv contains --resume .../session-latest.json 5) send prompt B asking for remembered code; assistant returned APPLE-739 6) agentd rm resume-smoke removed env/fifo/checkpoint runtime artifacts

🔄[human]InProgress → Review4 days ago

Persist agent session state across persistent-agent restarts

Dependencies

Description

Git Commits

Timeline (8)