Pipeline scheduler: verify commits reach live before marking status=done (fixes orphan-commit bug)

t-762 · WorkTask · Omni/Pipeline.hs
Created 1 week ago · Updated 1 week ago

Description

Pipeline scheduler drops orphan commits and falsely reports status=done

Summary

The pipeline scheduler runs coder agents successfully — agents produce real commits with real diffs — but the scheduler then fails to integrate those commits into any branch. The commits are left as orphans (reachable only via the branchless reflog) and the task is moved to Review with a status=done comment, even though nothing landed in live.

As a result, 17 tasks' worth of real coder output sat silently in the reflog for ~6 weeks, polluting the review queue and wasting agent spend.

How it was discovered

During an Ava-led triage of the 130-item Review queue (2026-04-07), I found 30 tasks where the last comment was:

Pipeline scheduler: run=pipeline-... domain=... status=done cost=Nc

…and yet none of those tasks had any new code in live. After investigating omni/live's full reflog (git log --all --reflog), 17 of the 30 tasks did in fact have a Coder Agent commit — all dated 2026-02-19, all with parent 7cc0c96a, all sitting on detached HEADs preserved only by git-branchless. None of them are referenced by any branch, and they all conflict with current live.

Affected tasks (orphan-commit category, 17 total)

Each task has a comment recording the orphan SHA:

| Task    | SHA          | Title fragment                              |
|---------|--------------|---------------------------------------------|
| t-481   | 68c0a843b09d | refresh embeddings on duplicate merge       |
| t-486   | 6f0f84e8d207 | Enforce PromptIR token budgets              |
| t-513   | faeb91a4f13f | agent graph DAG primitives                  |
| t-520   | f5939fd92d9d | Graph: TUI state and navigation             |
| t-521   | d6b35a01f39e | Graph: Brick TUI draw module                |
| t-522   | 35bac56ce0b7 | Graph: TUI event handling                   |
| t-529   | db7b1208ca3d | Graph: TUI for run tree navigation          |
| t-533   | e2eb0d84282f | pre-compaction memory flush hook            |
| t-535   | 6f27cfdf27f2 | user_profiles + tool support                |
| t-538.2 | 2d8b649cde37 | dual Telegram webhook/socket ingress        |
| t-538.3 | b5ee61714d71 | socket notification schema and handler      |
| t-549.3 | 9ae4c987fa59 | interactive event debugger                  |
| t-602   | a6739cd61cb8 | LoCoMo benchmark harness                    |
| t-620   | 6e5b438bb146 | system Chromium fallback for DevBrowser     |
| t-630   | 560bee62cc16 | local exec for claude-code/codex providers  |
| t-655   | 472c0f2cb0d2 | namespace path validation                   |
| t-265.7 | bed1e711224f | persistent follow-up scheduler              |

Plus:

  • *t-631* (multi-domain pipeline support): branch t-631 exists with commit 12ea4d00, parent in live, but never merged.
  • *t-362, t-622*: pipeline ran, no commits anywhere, no recoverable output. The scheduler still moved them to Review.

Root cause hypothesis

The scheduler appears to use "agent process exited" or "agent emitted some terminal event" as the signal for status=done and Review-promotion, *without* verifying that:

1. A commit was actually made on the working branch.
2. The working branch was successfully merged/rebased into live.
3. The merge result is reachable from HEAD.

Without those checks, an agent that succeeds locally but whose post-run integration step fails (or never runs) looks identical to an agent that completed successfully and shipped.
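
The three missing checks correspond to ordinary git plumbing commands. The following is a minimal sketch, assuming the scheduler shells out to git and that live is a ref name git can resolve in the repository. The function names (commitExistsArgs, isAncestorArgs, branchesContainingArgs, gitSucceeds) are hypothetical and are not taken from Omni/Pipeline/Git.hs.

```haskell
import System.Exit (ExitCode (..))
import System.Process (readProcessWithExitCode)

-- Hypothetical helpers: each returns the argument list for one git check.

-- | `git cat-file -e <sha>^{commit}` exits 0 only if the object exists
-- and peels to a commit. It says nothing about which branch holds it.
commitExistsArgs :: String -> [String]
commitExistsArgs sha = ["cat-file", "-e", sha ++ "^{commit}"]

-- | `git merge-base --is-ancestor <sha> <target>` exits 0 when <sha> is
-- reachable from <target> (a working branch, live, or HEAD).
isAncestorArgs :: String -> String -> [String]
isAncestorArgs sha target = ["merge-base", "--is-ancestor", sha, target]

-- | Lists local branches whose history contains <sha>. Empty output
-- means the commit is an orphan as far as branches are concerned.
branchesContainingArgs :: String -> [String]
branchesContainingArgs sha =
  ["for-each-ref", "--contains", sha, "--format=%(refname:short)", "refs/heads"]

-- | Run git in the given repository and report whether it exited 0.
gitSucceeds :: FilePath -> [String] -> IO Bool
gitSucceeds repo args = do
  (code, _out, _err) <- readProcessWithExitCode "git" (["-C", repo] ++ args) ""
  pure (code == ExitSuccess)
```

For example, gitSucceeds repo (isAncestorArgs "68c0a843b09d" "live") would be expected to return False for the t-481 orphan, given that none of the orphan commits are referenced by any branch.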

Acceptance criteria

The scheduler must, before marking a task Review and posting status=done:

1. *Verify a commit was produced.* Look up the agent's working branch / commit SHA from agentd, not from agent self-report. If no commit exists, post status=no-commit and either retry or move task to NeedsHelp with the agent's last messages.
2. *Verify the commit is reachable from a branch.* If it's only reachable via the reflog/branchless, that's an integration failure: post status=integration-failed with the orphan SHA, and either retry the integration step or move to NeedsHelp.
3. *Verify the commit reaches live.* If integration produced a branch but the branch wasn't merged, post status=unmerged with the branch name. (For pipelines that intentionally leave a PR open, this can be a separate status=pr-open.)
4. *Never post status=done unless the commit is in live*. Done means done.
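
The criteria above amount to a small decision function over evidence the scheduler gathers itself. The following is one possible sketch, not the scheduler's actual code: the Evidence and Status types, the decide and render functions, and the sha= / branch= fields in the rendered comment are all hypothetical names chosen for illustration.

```haskell
-- Evidence gathered from agentd and git, never from agent self-report.
data Evidence = Evidence
  { evCommit   :: Maybe String -- commit SHA recorded by agentd, if any
  , evBranches :: [String]     -- branches whose history contains the commit
  , evInLive   :: Bool         -- is the commit reachable from live?
  , evPrOpen   :: Bool         -- does this pipeline intentionally leave a PR open?
  } deriving (Eq, Show)

data Status
  = Done
  | NoCommit
  | IntegrationFailed String -- orphan SHA
  | Unmerged String          -- branch name
  | PrOpen String            -- branch name
  deriving (Eq, Show)

-- | Done is only returned when the commit is in live (criterion 4).
decide :: Evidence -> Status
decide ev = case evCommit ev of
  Nothing -> NoCommit
  Just sha
    | evInLive ev -> Done
    | otherwise -> case evBranches ev of
        [] -> IntegrationFailed sha
        (b : _)
          | evPrOpen ev -> PrOpen b
          | otherwise -> Unmerged b

-- | Status fragment for the scheduler comment (extra fields are assumed).
render :: Status -> String
render Done = "status=done"
render NoCommit = "status=no-commit"
render (IntegrationFailed sha) = "status=integration-failed sha=" ++ sha
render (Unmerged b) = "status=unmerged branch=" ++ b
render (PrOpen b) = "status=pr-open branch=" ++ b
```

Under this sketch, the t-481 situation (commit 68c0a843b09d, no containing branch) yields IntegrationFailed, the t-631 situation (commit on branch t-631, not merged) yields Unmerged, and t-362 (no commit anywhere) yields NoCommit. Whether to retry or move the task to NeedsHelp is left to the caller.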

Bonus: add a CLI subcommand or report (for example, "task pipeline orphans") that scans the reflog for commits authored by Coder Agent that are not reachable from live, and lists them with their task IDs (parsed from the commit trailer). This would let us recover or audit historical orphans.
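
A rough sketch of the pure parts of such a report follows. It assumes the task ID is stored in a commit trailer of the form "Task-Id: t-481"; the actual trailer key is not stated in this task, so the key, the function names (listOrphanArgs, taskIdFromMessage, formatOrphan), and the output format are all hypothetical.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd, isPrefixOf)
import Data.Maybe (fromMaybe, listToMaybe, mapMaybe)

-- | Arguments for a `git log` that lists commits by Coder Agent found via
-- any ref or reflog entry, excluding everything reachable from liveRef.
listOrphanArgs :: String -> [String]
listOrphanArgs liveRef =
  ["log", "--all", "--reflog", "--author=Coder Agent", "--format=%H", '^' : liveRef]

-- | Extract the task ID from a commit message. The "Task-Id:" key is an
-- assumption. If several matching lines exist, the last one wins.
taskIdFromMessage :: String -> Maybe String
taskIdFromMessage msg = listToMaybe (mapMaybe parseLine (reverse (lines msg)))
  where
    key = "Task-Id:"
    parseLine l
      | key `isPrefixOf` l = nonEmpty (trim (drop (length key) l))
      | otherwise = Nothing
    nonEmpty s = if null s then Nothing else Just s
    trim = dropWhileEnd isSpace . dropWhile isSpace

-- | One report line per orphan: SHA followed by the task ID, if known.
formatOrphan :: String -> Maybe String -> String
formatOrphan sha mTask = sha ++ " " ++ fromMaybe "(no task id)" mTask
```

A real subcommand would run git with listOrphanArgs "live", read each commit's message, and print formatOrphan for each result. Commits without a recognizable trailer are still listed so that they can be audited by hand.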

Files to investigate (read-only research suggested before coding)

  • Omni/Pipeline/Scheduler.hs (or wherever pipeline status transitions live)
  • Omni/Pipeline/Git.hs (the git wrapper that handles commit/integration)
  • Omni/Task.hs (status transition rules)
  • agentd's run-completion hook (where it tells the pipeline "done")

Cost so far

Coarse estimate from the comment trail: ~$8 of agent spend across the 30 reset tasks, plus 6+ weeks of confusion in the review queue. The fix is high-leverage — it protects every future pipeline run.
