t-298.5 - omni

t-298.5 · WorkTask · Omni/Ide/pi-code.hs

Parent: t-298 · Created 1 month ago · Updated 1 month ago

Description

The coder LLM claimed 'build successful' without actually running bild. We added post-coder verification, which catches this, but ideally the coder should:

1. Actually run bild as instructed in the prompt
2. Fix any compilation errors before finishing
3. Not declare success until verified

The prompt says 'You MUST run bild' but the LLM ignores it. Consider:

Stronger prompt wording
Few-shot examples showing proper workflow
System prompt emphasis on verification
Or just rely on post-coder verification (current approach)

Timeline (3)

💬 [human] · 1 month ago

After review: post-coder verification is more reliable than trying to get the LLM to run bild consistently. The current implementation already verifies compilation after the coder finishes and fails the phase if the code doesn't compile. Prompt engineering is an unreliable way to make LLMs follow instructions; external verification is the right pattern.
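For illustration only, the external-verification pattern might look roughly like the sketch below. This is not the code in Omni/Ide/pi-code.hs: the names `PhaseResult`, `judgeBuild`, and `verifyBuild` are hypothetical, and how bild is actually invoked (command name and arguments) is an assumption left to the caller. The point is that the phase result depends only on the build's exit code, never on what the coder reported.

```haskell
import System.Exit (ExitCode (..))
import System.Process (readProcessWithExitCode)

-- | Outcome of the coder phase after external verification.
-- (Hypothetical type, introduced for this sketch.)
data PhaseResult
  = PhaseOk
  | PhaseFailed String -- captured build output, for the failure report
  deriving (Eq, Show)

-- | Decide the phase result from the build's exit code and output.
-- The coder's own "build successful" claim is deliberately not an input.
judgeBuild :: ExitCode -> String -> String -> PhaseResult
judgeBuild ExitSuccess _ _ = PhaseOk
judgeBuild (ExitFailure code) out err =
  PhaseFailed ("build exited with code " ++ show code ++ "\n" ++ out ++ err)

-- | Run the build command ourselves after the coder finishes.
-- The command and its arguments are supplied by the caller; the exact
-- bild invocation is an assumption, not something this sketch knows.
verifyBuild :: FilePath -> [String] -> IO PhaseResult
verifyBuild cmd args = do
  (code, out, err) <- readProcessWithExitCode cmd args ""
  pure (judgeBuild code out err)
```

A caller would run something like `verifyBuild "bild" [target]` once the coder returns and fail the phase on `PhaseFailed`. Keeping `judgeBuild` pure makes the decision rule testable without running a build.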

🔄 [human] · Open → Done · 1 month ago

Coder should run bild itself before declaring success
