Parent epic: t-679
Depends on: MVP 2 (auto-spec loop)
Goal
Add a second large model as devil's advocate during the up-spec process. Use data from MVP 2 to determine which task types benefit from this extra pass.
Mechanism
1. Opus drafts/amends the spec (as in MVP 2)
2. Before sending the spec to the executor, route it to GPT-5.3 with the prompt: "Review this task spec. What's ambiguous, missing, or likely to cause implementation failure? Be adversarial."
3. GPT-5.3's critique feeds back to Opus for one more amendment round
4. Then proceed to executor gate as normal
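The four steps above can be sketched as a single function. This is a minimal illustration and not the actual implementation: `refine_spec`, `SpecResult`, and the three callables (`draft`, `critique`, `amend`) are hypothetical names standing in for the Opus and GPT-5.3 calls, whose real interfaces this ticket does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Prompt text copied from step 2 of the mechanism.
CRITIQUE_PROMPT = (
    "Review this task spec. What's ambiguous, missing, or likely to cause "
    "implementation failure? Be adversarial."
)


@dataclass
class SpecResult:
    spec: str                 # the spec handed to the executor gate
    critique: Optional[str]   # None when the devil's-advocate pass was skipped


def refine_spec(
    task: str,
    draft: Callable[[str], str],       # hypothetical: Opus drafts/amends the spec
    critique: Callable[[str], str],    # hypothetical: GPT-5.3 reviews the spec
    amend: Callable[[str, str], str],  # hypothetical: Opus amends using the critique
    needs_review: bool = True,
) -> SpecResult:
    """Run drafter -> devil's advocate -> one amendment, then stop."""
    spec = draft(task)                                   # step 1
    if not needs_review:
        return SpecResult(spec=spec, critique=None)
    notes = critique(f"{CRITIQUE_PROMPT}\n\n{spec}")     # step 2
    amended = amend(spec, notes)                         # step 3, exactly once
    return SpecResult(spec=amended, critique=notes)      # step 4: caller gates
```

Because the function contains no loop, the critic is called at most once per task, which is one way to guarantee the single critique round by construction.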
Design decisions
- Sequential roles (drafter → devil's advocate), NOT adversarial ping-pong
- Gate on complexity: only run diverse review (the second-model critique pass) for tasks above a complexity threshold, informed by MVP 2 data showing which tasks needed 3+ passes
- Track whether diverse review actually improves the first-attempt success rate compared with MVP 2
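One possible shape for the complexity gate is sketched below. It assumes MVP 2 data is available as a mapping from task type to the list of pass counts observed for that type. The 3-pass threshold comes from this ticket; the `MIN_SHARE` cutoff, the function name, and the choice to skip review when no history exists are all assumptions for illustration.

```python
from typing import Mapping, Sequence

PASS_THRESHOLD = 3  # from the ticket: tasks that needed 3+ passes in MVP 2
MIN_SHARE = 0.5     # hypothetical: share of 3+ pass tasks that triggers review


def needs_diverse_review(
    task_type: str,
    pass_history: Mapping[str, Sequence[int]],
    min_share: float = MIN_SHARE,
) -> bool:
    """Return True if this task type was often hard to spec in MVP 2."""
    counts = pass_history.get(task_type, [])
    if not counts:
        # Assumption: with no MVP 2 data for this type, skip the extra pass.
        return False
    heavy = sum(1 for c in counts if c >= PASS_THRESHOLD)
    return heavy / len(counts) >= min_share
```

For example, with `{"refactor": [3, 4, 1], "typo-fix": [1, 1, 2]}` as history, `"refactor"` would be routed to diverse review (two of three tasks needed 3+ passes) and `"typo-fix"` would skip it.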
Acceptance criteria
- Two different large models participate in spec refinement
- Complexity-based routing: simple tasks skip diverse review
- Metrics report first-attempt success with and without diverse review, showing whether it adds value over MVP 2
- No oscillation: the models don't endlessly revise each other (one critique round max)
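The metrics criterion could be checked with something like the following sketch. The `Outcome` record and its fields are hypothetical; the ticket does not define how executor results are logged.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Outcome:
    task_id: str
    reviewed: bool          # True if the devil's-advocate pass ran
    first_attempt_ok: bool  # True if the executor succeeded on its first attempt


def first_attempt_rate(outcomes: Sequence[Outcome], reviewed: bool) -> Optional[float]:
    """First-attempt success rate for one group, or None if the group is empty."""
    group = [o for o in outcomes if o.reviewed == reviewed]
    if not group:
        return None
    return sum(o.first_attempt_ok for o in group) / len(group)
```

Since routing is complexity-based, reviewed and unreviewed tasks differ in difficulty, so the comparison is only meaningful when the unreviewed group is drawn from comparable tasks, for example MVP 2 runs of task types that sit above the threshold.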