Epic: Automated Task Up-Speccing Pipeline

t-679 · Work Task
Created 1 month ago · Updated 1 month ago

Description

Goal

Replace manual task up-speccing with an automated multi-pass pipeline that iteratively refines task specifications until an executor model (Sonnet/Haiku) can implement them without clarifying questions.

Background

Current workflow: Ben up-specs tasks manually with Opus, files them, and the coder agent runs them. Many coder failures trace back to underspecified tasks. This epic adds an automated refinement loop.

Architecture

After initial task creation:

  1. Send the spec to the executor model with the prompt: "Can you implement this without clarifying questions? If YES: GO. If NO: list every question."
  2. If NO-GO: feed the questions back to the spec-writing model (Opus/GPT-5.3) to amend the spec.
  3. Repeat until GO, or at most 5 passes (circuit breaker).
  4. Key addition: the executor must "restate the task in its own words" to catch confident misunderstanding.
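
The loop above can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: `executor_review` and `amend_spec` are hypothetical stand-ins for the executor and spec-writer model calls, and the toy judge below exists only to exercise the loop.

```python
MAX_PASSES = 5  # circuit breaker from step 3

def refine(spec, executor_review, amend_spec):
    """Iterate until the executor says GO (zero questions) or the pass budget runs out.

    executor_review(spec) -> (restatement, questions); empty questions == GO.
    amend_spec(spec, questions) -> amended spec.
    Returns (final_spec, passes_used, converged).
    """
    for pass_num in range(1, MAX_PASSES + 1):
        restatement, questions = executor_review(spec)
        if not questions:  # GO: nothing left to clarify
            return spec, pass_num, True
        spec = amend_spec(spec, questions)  # feed questions back to spec writer
    return spec, MAX_PASSES, False  # breaker tripped; flag for manual review

# Toy stand-ins: the spec "converges" once it mentions error handling.
def fake_executor(spec):
    questions = [] if "error handling" in spec else ["What about error handling?"]
    return f"Restated: {spec}", questions

def fake_amender(spec, questions):
    return spec + " Include error handling."

spec, passes, converged = refine("Add a retry flag to the CLI.", fake_executor, fake_amender)
```

Returning `passes_used` alongside the result feeds the "passes-to-convergence" metric directly, rather than requiring separate instrumentation.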

Milestones (see subtasks)

  • MVP 0: Baseline metrics on current pipeline
  • MVP 1: Executor gate (GO/NO-GO check before coder starts)
  • MVP 2: Single-model auto-spec (questions feed back to Opus automatically)
  • MVP 3: Diverse review (second large model as devil's advocate)

Key Metrics

  • Task success rate (first-attempt pass rate)
  • Revision count per task
  • Time-to-done
  • Passes-to-convergence (starting MVP 2)

Design Principles

  • Pass count is emergent, not fixed. Convergence = executor says GO (zero questions).
  • Model-as-judge > deterministic checklist for detecting ambiguity
  • Large model triages executor questions (skip pedantic ones, amend for correctness)
  • If using two large models, sequential roles (drafter + devil's advocate), not adversarial
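
The triage principle could be sketched as a filter in front of the amendment step. This is an assumption about shape, not the actual implementation: `classify` stands in for a model-as-judge call, and the toy judge here is purely illustrative.

```python
def triage(questions, classify):
    """Keep only questions the judge marks as blocking correctness;
    pedantic ones are dropped rather than triggering a spec amendment."""
    return [q for q in questions if classify(q) == "blocking"]

# Toy judge: treats naming/style questions as pedantic, everything else as blocking.
def toy_judge(question):
    return "pedantic" if "name" in question.lower() else "blocking"

kept = triage(
    ["What should the flag be named?", "What happens on network timeout?"],
    toy_judge,
)
```

With triage in place, a pass counts as GO once the blocking list is empty, even if the executor still raised pedantic questions.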

Timeline (0)

No activity yet.