Multi-LLM Task Specification Pipeline

t-675 · Epic · omni.hs
Created 1 month ago · Updated 1 month ago

Execution Summary

  • Tasks Completed: 0/5
  • Total Cost: $0.00
  • Total Time: 0s

Design


Incremental rollout of a multi-LLM up-speccing pipeline for agent tasks. The core insight: well-specified tasks can be outsourced to small models (Sonnet/Haiku), while underspecified tasks need large-model passes (Opus/GPT-5.3) to reach a quality threshold before handoff.

The pipeline uses an executor GO/NO-GO gate: the executor model (Sonnet/Haiku) is asked to restate the task and confirm it can implement without clarifying questions. If NO-GO, the spec is sent back to the large model to address the questions. Passes continue until convergence (GO) or a circuit breaker (max 5 passes).
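The loop above can be sketched as follows. This is a minimal illustration, not the shipped design: `call_executor` and `call_spec_model` are hypothetical wrappers around the two model APIs, and the `GateResult` fields are assumptions matching the gate described above.

```python
# Sketch of the GO/NO-GO convergence loop (assumed shapes, see lead-in).
from dataclasses import dataclass

MAX_PASSES = 5  # circuit breaker


@dataclass
class GateResult:
    go: bool                # executor can implement without clarifying questions
    restatement: str        # task restated in the executor's own words
    questions: list[str]    # clarifying questions (empty on GO)


def up_spec(task: str, call_executor, call_spec_model) -> tuple[str, int]:
    """Refine `task` until the executor signals GO or the breaker trips.

    Returns the final spec and the number of passes used.
    """
    spec = task
    for passes in range(1, MAX_PASSES + 1):
        gate: GateResult = call_executor(spec)
        if gate.go:
            return spec, passes  # converged: hand off to the executor
        # NO-GO: the large model amends the spec to answer the questions
        spec = call_spec_model(spec, gate.questions)
    return spec, MAX_PASSES  # breaker tripped: escalate (e.g. to a human)
```

In this shape the circuit breaker and the convergence signal live in one place, so swapping MVP 1's "bounce to human" for MVP 2's automatic Opus pass only changes what `call_spec_model` does.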

Key metric: passes-to-convergence per task. Also track task success rate, revision count, and time-to-done across all MVPs.
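A minimal sketch of the per-task record those metrics imply; the field names and aggregation are assumptions for illustration, not an existing schema.

```python
# Hypothetical per-task metrics record and cross-MVP aggregation.
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskMetrics:
    task_id: str
    passes_to_convergence: int  # key metric
    succeeded: bool             # task success
    revision_count: int         # revisions after handoff
    seconds_to_done: float


def summarize(runs: list[TaskMetrics]) -> dict:
    """Aggregate one MVP stage's runs for stage-over-stage comparison."""
    return {
        "success_rate": mean(r.succeeded for r in runs),
        "mean_passes": mean(r.passes_to_convergence for r in runs),
        "mean_revisions": mean(r.revision_count for r in runs),
        "mean_time_s": mean(r.seconds_to_done for r in runs),
    }
```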

MVP stages:

  0. Baseline metrics on current manual process
  1. Executor GO/NO-GO gate (single extra LLM call, bounces to human on NO-GO)
  2. Single-model auto-spec loop (Opus answers executor questions automatically)
  3. Diverse multi-model review (Opus drafts, GPT-5.3 devil's advocate, executor gate)

Design principles:

  • Don't fix the pass count in advance; let the executor's readiness signal drive convergence
  • Executor restates the task in its own words (catches confident misunderstanding)
  • Large model triages executor questions (skip style questions, only amend for correctness)
  • Sequential model roles, not adversarial (Opus drafts, GPT reviews)

Child Tasks

  • t-675.1 - MVP 0: Baseline metrics instrumentation [Open]
  • t-675.2 - MVP 1: Executor GO/NO-GO gate [Open]
  • t-675.3 - MVP 2: Single-model auto-spec loop [Open]
  • t-675.4 - MVP 3: Diverse multi-model review [Open]
  • t-675.5 - Multi-model LLM routing infrastructure [Open]

Timeline (0)

No activity yet.