Draft Phase 0 evaluation spec for Factor/Compress/Compose experiments and create initial spec doc.
Goal
Define a minimal, concrete spec format and metrics so t-394/395/396 can be tested before theory work.
Deliverables
1. Spec document (Markdown) describing:
- Task suite format (YAML/JSON)
- Validator types (command, regex, json schema, file diff)
- Metrics (fidelity, trace drift, cost, latency, variance)
- Run protocol (baseline vs variants, repeat count)
2. Example spec file with 3-5 test cases.
Notes
- Prefer to align with existing eval harness conventions in Omni/Agent/Eval.
- Keep it minimal and executable via agentd or Op runner later.
- Use ASCII only.
Drafted Phase 0 spec and example suite: Omni/Agent/Eval/Phase0.md and Omni/Agent/Eval/phase0-suite.yaml