t-683 - omni

t-683 · WorkTask

Created 1 month ago · Updated 1 month ago

Description

Parent epic: t-679
Depends on: MVP 2 (auto-spec loop)

Goal

Add a second large model as devil's advocate during the up-spec process. Use data from MVP 2 to determine which task types benefit from this extra pass.

Mechanism

1. Opus drafts/amends the spec (as in MVP 2)
2. Before sending to executor, route to GPT-5.3 with prompt: "Review this task spec. What's ambiguous, missing, or likely to cause implementation failure? Be adversarial."
3. GPT-5.3's critique feeds back to Opus for one more amendment round
4. Then proceed to executor gate as normal
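
The four steps above could be sketched roughly as follows. This is an illustration only, not the pipeline's actual code: `refine_spec`, `ModelCall`, and the drafting/amendment prompt wording are hypothetical, and each model is represented as a plain function from prompt text to response text. Only the critique prompt is taken from this ticket.

```python
from typing import Callable

# Hypothetical stand-in: a model call is any function from prompt text to response text.
ModelCall = Callable[[str], str]

# Critique prompt quoted from the Mechanism section of this ticket.
CRITIQUE_PROMPT = (
    "Review this task spec. What's ambiguous, missing, or likely to cause "
    "implementation failure? Be adversarial."
)


def refine_spec(task: str, drafter: ModelCall, critic: ModelCall) -> str:
    """Draft, critique once, amend once, and return the spec for the executor gate."""
    # Step 1: the drafter (Opus in this ticket) produces the spec.
    draft = drafter(f"Draft a spec for this task:\n{task}")
    # Step 2: the second model (GPT-5.3 in this ticket) critiques the draft.
    critique = critic(f"{CRITIQUE_PROMPT}\n\n{draft}")
    # Step 3: exactly one amendment round, with no loop back to the critic.
    amended = drafter(
        "Amend this spec to address the critique.\n\n"
        f"Spec:\n{draft}\n\nCritique:\n{critique}"
    )
    # Step 4: the caller hands the amended spec to the executor gate as normal.
    return amended
```

Because the function has no loop, the "one critique round max" rule holds by construction: the critic is called once and the drafter twice per task.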

Design decisions

Sequential roles (drafter → devil's advocate), NOT adversarial ping-pong
Gate on complexity: only run diverse review for tasks above a complexity threshold (informed by MVP 2 data showing which tasks needed 3+ passes)
Track whether diverse review actually improves first-attempt success vs MVP 2
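
One possible shape for the complexity gate, assuming MVP 2 logs a pass count per task and that task type is an acceptable proxy for complexity (both are assumptions, not decisions recorded in this ticket). The function names and the `(task_type, passes)` record format are hypothetical; the default of 3 passes mirrors the "3+ passes" signal mentioned above.

```python
from collections import defaultdict
from statistics import mean


def task_types_needing_review(history, min_passes=3.0):
    """Return task types whose mean MVP 2 pass count is at or above min_passes.

    history: iterable of (task_type, passes) pairs from MVP 2 runs (assumed format).
    """
    passes_by_type = defaultdict(list)
    for task_type, passes in history:
        passes_by_type[task_type].append(passes)
    return {t for t, p in passes_by_type.items() if mean(p) >= min_passes}


def needs_diverse_review(task_type, flagged_types):
    """Simple tasks (types not flagged) skip the devil's-advocate pass."""
    return task_type in flagged_types
```

A usage example with made-up data: given `[("refactor", 4), ("refactor", 3), ("typo", 1), ("typo", 2)]`, only `"refactor"` is flagged, so typo fixes would go straight to the executor gate.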

Acceptance criteria

Two different large models participate in spec refinement
Complexity-based routing: simple tasks skip diverse review
Metrics show whether diverse review adds value over MVP 2
No oscillation: the models don't endlessly revise each other (one critique round max)
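
A minimal sketch of how the "adds value over MVP 2" criterion might be measured, assuming each executed task is recorded as a boolean for first-attempt success. The function names and the record format are hypothetical, and the comparison is only meaningful if both samples cover comparable task types.

```python
def first_attempt_success_rate(outcomes):
    """Fraction of tasks that succeeded on the first executor attempt, or None if empty."""
    outcomes = list(outcomes)
    if not outcomes:
        return None
    return sum(1 for ok in outcomes if ok) / len(outcomes)


def review_uplift(mvp2_outcomes, mvp3_outcomes):
    """Difference in first-attempt success rate (MVP 3 minus MVP 2), or None if either is empty."""
    base = first_attempt_success_rate(mvp2_outcomes)
    with_review = first_attempt_success_rate(mvp3_outcomes)
    if base is None or with_review is None:
        return None
    return with_review - base
```

A positive uplift would suggest the extra pass helps; with small samples the difference could easily be noise, so it is probably worth reporting the task counts alongside the rate.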

MVP 3: Diverse model review (devil's advocate)
