Benchmark Ava's memory system against the LoCoMo (Long-term Conversational Memory) eval to measure how it compares with the state of the art on long-term conversational memory tasks.
LoCoMo (Snap Research, ACL 2024) is a benchmark for evaluating very long-term conversational memory in LLM agents. It consists of 10 annotated conversations with 300+ turns spanning 32 sessions each. Tasks include question answering, event summarization, and multimodal reasoning.
Key reference scores:
1. Clone the LoCoMo dataset from GitHub.
2. Write an adapter that feeds LoCoMo conversation sessions into Ava's memory system (remember, recall, link_memories, query_graph).
3. Run the QA eval: for each test question, use Ava's recall/query_graph tools to retrieve relevant memories, then answer.
4. Score using LoCoMo's provided evaluation metrics.
5. Compare against published baselines.
EasyLocomo may simplify the eval harness setup. Zep's harness is another reference implementation.
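The plan above can be sketched as a small harness loop. This is a minimal sketch under stated assumptions, not the actual adapter: `remember` and `recall` stand in for Ava's tools with assumed signatures, `answer` is a hypothetical callable wrapping the LLM call, and `token_f1` is a SQuAD-style token-overlap F1 used here as a placeholder. LoCoMo's own scoring script should remain the source of truth for reported numbers.

```python
import re
import string
from collections import Counter
from typing import Callable, Iterable, Sequence


def normalize(text: str) -> list[str]:
    """Lowercase, drop punctuation and English articles, split on whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a gold answer (placeholder metric)."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def run_qa_eval(
    sessions: Iterable[Sequence[str]],
    questions: Iterable[tuple[str, str]],
    remember: Callable[[str], None],            # assumed signature
    recall: Callable[[str], list[str]],         # assumed signature
    answer: Callable[[str, list[str]], str],    # hypothetical LLM wrapper
) -> float:
    """Ingest every turn in session order, then answer and score each question."""
    for session in sessions:
        for turn in session:
            remember(turn)
    scores = []
    for question, gold in questions:
        memories = recall(question)
        scores.append(token_f1(answer(question, memories), gold))
    return sum(scores) / len(scores) if scores else 0.0
```

Passing the memory operations in as callables keeps the loop independent of how Ava is actually invoked (in-process, CLI, or over an API), so the same harness can be pointed at a stub store while the adapter is being written.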
Our memory system uses semantic search + knowledge graph links. This is more sophisticated than Letta's filesystem approach but simpler than some vector DB solutions. The interesting question is whether the graph structure (especially contradiction/supersession tracking) gives us an edge on the harder questions that require reasoning over evolving facts.
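To make the supersession idea concrete, here is a toy illustration of how a newer fact can mark an older one as superseded so that recall returns the current value while the history stays queryable. It is only a sketch: `Fact`, `FactGraph`, and the (subject, attribute, value) shape are hypothetical and are not taken from Ava's actual schema, and it assumes facts arrive in session order.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Fact:
    subject: str
    attribute: str
    value: str
    session: int
    superseded_by: Optional[int] = None  # index of the newer fact, if any


class FactGraph:
    """Hypothetical store where a conflicting newer fact supersedes older ones."""

    def __init__(self) -> None:
        self.facts: list[Fact] = []

    def remember(self, subject: str, attribute: str, value: str, session: int) -> int:
        new_id = len(self.facts)
        for fact in self.facts:
            same_key = (fact.subject, fact.attribute) == (subject, attribute)
            if same_key and fact.superseded_by is None and fact.value != value:
                # Contradiction: link the old fact to the one replacing it.
                fact.superseded_by = new_id
        self.facts.append(Fact(subject, attribute, value, session))
        return new_id

    def current(self, subject: str, attribute: str) -> Optional[str]:
        """Return the most recent value that has not been superseded."""
        for fact in reversed(self.facts):
            if (fact.subject, fact.attribute) == (subject, attribute) and fact.superseded_by is None:
                return fact.value
        return None

    def history(self, subject: str, attribute: str) -> list[str]:
        """Return every recorded value in insertion order, superseded or not."""
        return [f.value for f in self.facts if (f.subject, f.attribute) == (subject, attribute)]


graph = FactGraph()
graph.remember("Alice", "city", "Boston", session=1)
graph.remember("Alice", "city", "Paris", session=5)
print(graph.current("Alice", "city"))   # Paris
print(graph.history("Alice", "city"))   # ['Boston', 'Paris']
```

A pure semantic-search store would likely surface both "Boston" and "Paris" for a question like "Where does Alice live now?"; the supersession link is what lets the retriever prefer the live fact, which is the behavior the eval should confirm or refute.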
Pipeline: verification failed: Build failed for Omni/Agent/Memory.hs (exit 1):
these 15 derivations will be built:
/nix/store/0m5fa2krxa2d7m1rd67xnplb90yj9vbw-hs-mod-Omni_Agent_Prompt_IR.drv
/nix/store/wrciq3ha3bcvsvdjjjphm4ispziykj2k-hs-mod-Omni_Agent_Trace.drv
/nix/store/ppsjkiss9gjb8xvsmalw6lfbk98ajbjk-hs-mod-Omni_Agent_Op.drv
/nix/store/kkmxaw0ks2y1ndih75clcsjsq4wc7dy9-hs-mod-Omni_Agent_Models.drv
/nix/store/rvrifzh4ra6glx7w7p4znb7z0pgjbwi5-hs-mod-Omni_Agent_Provider.drv
/nix/store/xyn2scqg0ygjhz73md9gbw8al9ragcsb-hs-mod-Omni_Agent_Prompt_Hydrate.drv
/nix/store/ylwsiw8a9dr2ljc9siwn876bv7gizmnf-hs-mod-Omni_Agent_Prompt_Compile.drv
/nix/store/11gi7wrxbsiq1x5g4c7cnn7lq3r4vf16-hs-mod-Omni_Agent_Interpreter_Sequential.drv
/nix/store/rwzsc50issjjj8k3i0x582z269j4vv93-hs-mod-Omni_Agent_Programs_Compaction.drv
/nix/store/837lb1y7w5fqpa8nfxkds9driy1z4z28-hs-mod-Omni_Agent_Programs_Agent.drv
/nix/store/lgpvbrwgjmlp7d0vphgigwn6ik39klbf-hs-mod-Omni_Time.drv
/nix/store/x0q0anj5xyg8pmdl2hq1cw17p6jh4qa9-hs-mod-Omni_Agent_Engine.drv
/nix/store/riwas6d6ghspjx64h8qckldkaf1s89bi-hs-mod-Omni_Agent_Op_Bridge.drv
/nix/store/zb6am8iyw1spnvd1h9dmyzcszq36gc5b-hs-mod-Omni_Agent_Memory.drv
/nix/store/mwhg9r78d2wnhaz5ygy55h3w18gm66x8-omni-agent-memory.drv
building '/nix/store/kkmxaw0ks2y1ndih75clcsjsq4wc7dy9-hs-mod-Omni_Agent_Models.drv'...
building '/nix/store/0m5fa2krxa2d7m1rd67xnplb90yj9vbw-hs-mod-Omni_Agent_Prompt_IR.drv'...
building '/nix/store/wrciq3ha3bcvsvdjjjphm4ispziykj2k-hs-mod-Omni_Agent_Trace.drv'...
building '/nix/store/lgpvbrwgjmlp7d0vphgigwn6ik39klbf-hs-mod-Omni_Time.drv'...
Omni/Agent/Models.hs:35:1: error:
Could not find module 'Data.Yaml'
Use -v (or :set -v in ghci) to see a list of the files searched for.
|
35 | import qualified Data.Yaml as Yaml
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
error: builder for '/nix/store/kkmxaw0ks2y1ndih75clcsjsq4wc7dy9-hs-mod-Omni_Agent_Models.drv' failed with exit code 1;
last 7 log lines:
>
> Omni/Agent/Models.hs:35:1: error:
> Could not find module 'Data.Yaml'
> Use -v (or :set -v in ghci) to see a list of the files searched for.
> |
> 35 | import qualified Data.Yaml as Yaml
> | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
For full logs, run:
nix log /nix/store/kkmxaw0ks2y1ndih75clcsjsq4wc7dy9-hs-mod-Omni_Agent_Models.drv
error: 1 dependencies of derivation '/nix/store/mwhg9r78d2wnhaz5ygy55h3w18gm66x8-omni-agent-memory.drv' failed to build
fail: bild: realise: Omni/Agent/Memory.hs
Pipeline scheduler: started run=pipeline-omni-agent-memory-hs-t-602-1771561818 domain=Omni/Agent/Memory.hs
Pipeline scheduler: run=pipeline-omni-agent-memory-hs-t-602-1771561818 domain=Omni/Agent/Memory.hs status=done cost=41c (fund-spend=failed)
Ava triage: pipeline auto-run reached status=done but the agent made NO git commits and reported blockers (missing files, path mismatches, or need clarification). This task is not actually in review — there's nothing to review. Resetting status to Open so it can be re-scoped.
ORPHAN COMMIT: coder agent produced commit a6739cd61cb80c88e9d68a50fb176b3aaf69ebf4 on 2026-02-19 but it was never merged into live. Reachable only via branchless reflog. Pipeline scheduler bug — see separate task. To recover: git cherry-pick a6739cd61cb80c88e9d68a50fb176b3aaf69ebf4 from omni/live (expect conflicts after 6+ weeks of drift). Otherwise re-implement from scratch.
Pipeline: dev completed (run=dev-t-602-1771511912, cost=0.0c)