Bug: agentd dev workflow prompt for t-575 hangs with no events

t-585 · WorkTask · Created 1 week ago · Updated 1 week ago

Description

Dogfooding Omni/Ide/dev-review-release.sh on t-575 surfaced a runtime hang in the dev run.

Observed:

  • Dev loop starts run dev-t-575-20260211-112602 and then stalls indefinitely.
  • Container stays Up with no stdout/stderr events.
  • /var/log/agentd/dev-t-575-20260211-112602/events.jsonl remains empty.
  • Inside container, PID 1 is sleeping (State: S, wchan=futex_wait_queue) with ~0% CPU.
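The two strongest signals above (an empty events file and a sleeping PID 1) are easy to check mechanically. The following is a minimal sketch, not part of agentd: the helper names `proc_state` and `events_empty` are hypothetical, and it assumes the standard Linux `/proc/<pid>/status` layout with a `State:` line.

```shell
#!/usr/bin/env bash
# Hypothetical diagnostic helpers for a wedged run (not part of agentd).

# proc_state FILE
# Print the one-letter scheduler state from a /proc/<pid>/status file,
# e.g. "S" for interruptible sleep, "R" for running.
proc_state() {
  awk '/^State:/ { print $2; exit }' "$1"
}

# events_empty FILE
# Succeed when the events file is missing or has zero bytes.
events_empty() {
  [ ! -s "$1" ]
}

# Assumed usage against a running container (<container> is a placeholder):
#   docker exec <container> cat /proc/1/status > /tmp/pid1.status
#   proc_state /tmp/pid1.status
#   events_empty /var/log/agentd/dev-t-575-20260211-112602/events.jsonl && echo "no events yet"
```

A run that reports state `S` for a long stretch while its events file stays empty matches the symptoms described here, although a sleeping PID 1 on its own is normal for a process waiting on I/O.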

Repro:

  1. Generate the prompt file via the dev loop (_/tmp/dogfood-e2e/dev/_/tmp/dev-review-release/dev-t-575-20260211-112602.md).
  2. Run: agentd run <prompt-file> -n repro --fg --provider claude-code --max-iter 80 --max-cost 300 --timeout 120
  3. The process hangs with no output until externally timed out.
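Because the repro only ends when something outside agentd kills it, it helps to bound the run explicitly. This is a sketch assuming GNU coreutils `timeout` is available; `run_with_deadline` is a hypothetical wrapper name, and the 600-second deadline in the usage comment is an arbitrary illustration.

```shell
#!/usr/bin/env bash
# run_with_deadline SECONDS CMD [ARGS...]
# Hypothetical wrapper (not part of agentd): run CMD under coreutils
# `timeout` and make a deadline kill visible on stderr.
run_with_deadline() {
  local secs="$1"
  shift
  local rc=0
  timeout "$secs" "$@" || rc=$?
  # coreutils `timeout` exits with 124 when the deadline expired.
  if [ "$rc" -eq 124 ]; then
    echo "timed out after ${secs}s: $*" >&2
  fi
  return "$rc"
}

# Assumed usage, with the flags copied from the repro step and the
# prompt file left as a placeholder:
#   run_with_deadline 600 agentd run <prompt-file> -n repro --fg \
#     --provider claude-code --max-iter 80 --max-cost 300 --timeout 120
```

An exit status of 124 then distinguishes "hung until killed" from an ordinary agentd failure, which keeps its own exit code.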

Additional signal:

  • The same prompt with a small max-iter (e.g. 2 or 10) returns quickly, but only writes "Error: Maximum iterations reached" to events and does no useful work.

Impact:

  • Dev role can wedge for long periods and block review/integrator progression.

Timeline (4)

🔄 [human] Open → InProgress · 1 week ago
💬 [human] · 1 week ago

Implemented agentd runtime fixes in Omni/Agentd.hs to remove the non-progressing/hanging workflow behavior:

  1. Preserve markdown workflow prompts as file paths so agent frontmatter is honored.
  2. Map host prompt paths into container paths (/workspace or /repo).
  3. Add '--' before the prompt argument.
  4. Run the docker pipeline via bash -o pipefail so non-zero agent exits propagate correctly.

Also removed redundant AGENTS imports from Omni/Ide/Workflows/{dev,reviewer,integrator}.md to reduce prompt bloat and avoid long first-iteration stalls.

Verified with foreground and background runs: failures now surface immediately (max-iter/cost/timeout), and the dev/review/integrator dogfood run for t-575 completed successfully.
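Two of the fixes described above, the host-to-container path mapping and the pipefail change, can be illustrated in shell. This is only a sketch of the ideas, not the code in Omni/Agentd.hs: `map_prompt_path` and `pipeline_status` are hypothetical names, and the host root in the usage comment is made up.

```shell
#!/usr/bin/env bash
# map_prompt_path HOST_ROOT CONTAINER_ROOT PATH
# Hypothetical helper: rewrite a host path under HOST_ROOT so it points
# at the same file under CONTAINER_ROOT (e.g. /workspace or /repo).
# Paths outside HOST_ROOT are printed unchanged.
map_prompt_path() {
  local host_root="${1%/}" container_root="${2%/}" path="$3"
  case "$path" in
    "$host_root"/*) printf '%s\n' "$container_root/${path#"$host_root"/}" ;;
    *)              printf '%s\n' "$path" ;;
  esac
}

# pipeline_status MODE
# Print the exit status of the pipeline `false | cat`.
# MODE=pipefail runs it under `bash -o pipefail`; anything else runs it
# under plain bash, where only the last command's status is reported.
pipeline_status() {
  local rc=0
  if [ "$1" = "pipefail" ]; then
    bash -o pipefail -c 'false | cat' || rc=$?
  else
    bash -c 'false | cat' || rc=$?
  fi
  echo "$rc"
}

# Assumed usage:
#   map_prompt_path /home/me/repo /workspace /home/me/repo/_/tmp/prompt.md
#   pipeline_status plain
#   pipeline_status pipefail
```

The second helper shows why a failing agent could go unnoticed: in a plain pipeline the status of the final stage masks an earlier failure, whereas with pipefail the pipeline's status reflects the failing stage.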

🔄 [human] InProgress → Done · 1 week ago