Agent for t-201 ran bild --test on 4 separate files and lint on 5 files individually. This wastes tokens and time.
The prompt should clarify:
1. bild --test on a namespace tests all its dependencies - no need to test each file
2. lint --fix on changed files is sufficient (or just rely on git hooks)
3. Don't re-run tests after they already passed
The current prompt tells the agent to run bild --test until it passes, but doesn't explain that the build system handles dependencies.
I'll analyze the current Worker.hs implementation to understand how the prompt is structured and then improve it to avoid redundant test/lint runs.
Omni/Agent/Worker.hs
1: {-# LANGUAGE OverloadedStrings #-}
2: {-# LANGUAGE NoImplicitPrelude #-}
3:
4: module Omni.Agent.Worker where
5:
6: import Alpha
7: import qualified Data.Aeson as Aeson
8: import qualified Data.Aeson.Key as AesonKey
9: import qualified Data.ByteString.Lazy as BSL
10: import Data.IORef (modifyIORef', newIORef, readIORef)
11: import qualified Data.List as List
12: import qualified Data.Text as Text
13: import qualified Data.Text.Encoding as TE
14: import qualified Data.Time
15: import qualified Omni.Agent.Core as Core
16: import qualified Omni.Agent.Engine as Engine
17: import qualified Omni.Agent.Log as AgentLog
18: import qualified Omni.Agent.Tools as Tools
19: import qualified Omni.Fact as Fact
20: import qualified Omni.Task.Core as TaskCore
21: import qualified System.Directory as Directory
22: import qualified System.Environment as Env
23: import qualified System.Exit as Exit
24: import System.FilePath ((</>))
25: import qualified System.Process as Process
26:
27: start :: Core.Worker -> Maybe Text -> IO ()
28: start worker maybeTaskId = do
29: if Core.workerQuiet worker
30: then putText ("[worker] Starting for " <> Core.workerName worker)
31: else do
32: AgentLog.init (Core.workerName worker)
33: AgentLog.log ("[worker] Starting for " <> Core.workerName worker)
34: case maybeTaskId of
35: Just tid -> logMsg worker ("[worker] Target task: " <> tid)
36: Nothing -> logMsg worker "[worker] No specific task, will pick from ready queue"
37: runOnce worker maybeTaskId
38:
39: -- | Log message respecting quiet mode
40: logMsg :: Core.Worker -> Text -> IO ()
41: logMsg worker msg =
42: if Core.workerQuiet worker
43: then putText msg
44: else AgentLog.log msg
45:
46: -- | Convert key-value pairs to JSON metadata string
47: toMetadata :: [(Text, Text)] -> Text
48: toMetadata pairs =
49: let obj = Aeson.object [(AesonKey.fromText k, Aeson.String v) | (k, v) <- pairs]
50: in TE.decodeUtf8 (BSL.toStrict (Aeson.encode obj))
51:
52: runOnce :: Core.Worker -> Maybe Text -> IO ()
53: runOnce worker maybeTaskId = do
54: -- Find work
55: targetTask <- case maybeTaskId of
56: Just tid -> do
57: TaskCore.findTask tid </ TaskCore.loadTasks
58: Nothing -> do
59: readyTasks <- TaskCore.getReadyTasks
60: case readyTasks of
61: [] -> pure Nothing
62: (task : _) -> pure (Just task)
63:
64: case targetTask of
65: Nothing -> do
66: case maybeTaskId of
67: Just tid -> do
68: unless (Core.workerQuiet worker) <| AgentLog.updateActivity ("Task " <> tid <> " not found.")
69: logMsg worker ("[worker] Task " <> tid <> " not found.")
70: Nothing -> do
71: unless (Core.workerQuiet worker) <| AgentLog.updateActivity "No work found."
72: logMsg worker "[worker] No ready tasks found."
73: Just task -> do
74: processTask worker task
75:
76: processTask :: Core.Worker -> TaskCore.Task -> IO ()
77: processTask worker task = do
78: let repo = Core.workerPath worker
79: let tid = TaskCore.taskId task
80: let quiet = Core.workerQuiet worker
81: let say = logMsg worker
82:
83: unless quiet <| AgentLog.update (\s -> s {AgentLog.statusTask = Just tid})
84: say ("[worker] Claiming task " <> tid)
85:
86: -- Claim task
87: TaskCore.logActivity tid TaskCore.Claiming Nothing
88: TaskCore.updateTaskStatus tid TaskCore.InProgress []
89: say "[worker] Status -> InProgress"
90:
91: -- Run agent with timing
92: startTime <- Data.Time.getCurrentTime
93: activityId <- TaskCore.logActivityWithMetrics tid TaskCore.Running Nothing Nothing (Just startTime) Nothing Nothing Nothing
94:
95: say "[worker] Starting engine..."
96: (exitCode, output, costCents) <- runWithEngine worker repo task
97:
98: endTime <- Data.Time.getCurrentTime
99: say ("[worker] Agent exited with: " <> tshow exitCode)
100:
101: -- Update the activity record with metrics (convert Double to Int by rounding)
102: TaskCore.updateActivityMetrics activityId Nothing (Just endTime) (Just (round costCents)) Nothing
103:
104: case exitCode of
105: Exit.ExitSuccess -> do
106: TaskCore.logActivity tid TaskCore.Reviewing Nothing
107: say "[worker] Running formatters..."
108: _ <- runFormatters repo
109:
110: -- Try to commit (this runs git hooks which may fail)
111: let commitMsg = formatCommitMessage task output
112: say "[worker] Attempting commit..."
113: commitResult <- tryCommit repo commitMsg
114:
115: case commitResult of
116: CommitFailed commitErr -> do
117: say ("[worker] Commit failed: " <> commitErr)
118:
119: -- Save failure context and reopen task for retry
120: maybeCtx <- TaskCore.getRetryContext tid
121: let attempt = maybe 1 (\c -> TaskCore.retryAttempt c + 1) maybeCtx
122:
123: if attempt > 3
124: then do
125: say "[worker] Task failed 3 times, needs human intervention"
126: TaskCore.logActivity tid TaskCore.Failed (Just (toMetadata [("reason", "max_retries_exceeded")]))
127: TaskCore.updateTaskStatus tid TaskCore.Open []
128: else do
129: let currentReason = "attempt " <> tshow attempt <> ": commit_failed: " <> commitErr
130: let accumulatedReason = case maybeCtx of
131: Nothing -> currentReason
132: Just ctx -> TaskCore.retryReason ctx <> "\n" <> currentReason
133: TaskCore.setRetryContext
134: TaskCore.RetryContext
135: { TaskCore.retryTaskId = tid,
136: TaskCore.retryOriginalCommit = "",
137: TaskCore.retryConflictFiles = [],
138: TaskCore.retryAttempt = attempt,
139: TaskCore.retryReason = accumulatedReason,
140: TaskCore.retryNotes = maybeCtx +> TaskCore.retryNotes
141: }
142: TaskCore.logActivity tid TaskCore.Retrying (Just (toMetadata [("attempt", tshow attempt)]))
143: TaskCore.updateTaskStatus tid TaskCore.Open []
144: say ("[worker] Task reopened (attempt " <> tshow attempt <> "/3)")
145: NoChanges -> do
146: -- No changes = task already implemented, mark as Done
147: say "[worker] No changes to commit - task already done"
148: TaskCore.clearRetryContext tid
149: TaskCore.logActivity tid TaskCore.Completed (Just (toMetadata [("result", "no_changes")]))
150: TaskCore.updateTaskStatus tid TaskCore.Done []
151: say ("[worker] ✓ Task " <> tid <> " -> Done (no changes)")
152: unless quiet <| AgentLog.update (\s -> s {AgentLog.statusTask = Nothing})
153: CommitSuccess -> do
154: -- Commit succeeded, set to Review
155: TaskCore.logActivity tid TaskCore.Completed (Just (toMetadata [("result", "committed")]))
156: TaskCore.updateTaskStatus tid TaskCore.Review []
157: say ("[worker] ✓ Task " <> tid <> " -> Review")
158: unless quiet <| AgentLog.update (\s -> s {AgentLog.statusTask = Nothing})
159: Exit.ExitFailure code -> do
160: say ("[worker] Engine failed with code " <> tshow code)
161: TaskCore.logActivity tid TaskCore.Failed (Just (toMetadata [("exit_code", tshow code)]))
162: -- Don't set back to Open here - leave in InProgress for debugging
163: say "[worker] Task left in InProgress (engine failure)"
164:
165: -- | Run lint --fix to format and fix lint issues
166: runFormatters :: FilePath -> IO (Either Text ())
167: runFormatters repo = do
168: let cmd = (Process.proc "lint" ["--fix"]) {Process.cwd = Just repo}
169: (code, _, _) <- Process.readCreateProcessWithExitCode cmd ""
170: case code of
171: Exit.ExitSuccess -> pure (Right ())
172: Exit.ExitFailure _ -> pure (Right ()) -- lint --fix may exit non-zero but still fix things
173:
174: data CommitResult = CommitSuccess | NoChanges | CommitFailed Text
175: deriving (Show, Eq)
176:
177: -- | Try to commit, returning result
178: tryCommit :: FilePath -> Text -> IO CommitResult
179: tryCommit repo msg = do
180: -- Stage all changes
181: let addCmd = (Process.proc "git" ["add", "."]) {Process.cwd = Just repo}
182: (addCode, _, addErr) <- Process.readCreateProcessWithExitCode addCmd ""
183: case addCode of
184: Exit.ExitFailure _ -> pure <| CommitFailed (Text.pack addErr)
185: Exit.ExitSuccess -> do
186: -- Check for changes
187: let checkCmd = (Process.proc "git" ["diff", "--cached", "--quiet"]) {Process.cwd = Just repo}
188: (checkCode, _, _) <- Process.readCreateProcessWithExitCode checkCmd ""
189: case checkCode of
190: Exit.ExitSuccess -> pure NoChanges
191: Exit.ExitFailure 1 -> do
192: -- There are changes, commit them
193: let commitCmd = (Process.proc "git" ["commit", "-m", Text.unpack msg]) {Process.cwd = Just repo}
194: (commitCode, _, commitErr) <- Process.readCreateProcessWithExitCode commitCmd ""
195: case commitCode of
196: Exit.ExitSuccess -> pure CommitSuccess
197: Exit.ExitFailure _ -> pure <| CommitFailed (Text.pack commitErr)
198: Exit.ExitFailure c -> pure <| CommitFailed ("git diff failed with code " <> tshow c)
199:
200: -- | Run task using native Engine
201: -- Returns (ExitCode, output text, cost in cents)
202: runWithEngine :: Core.Worker -> FilePath -> TaskCore.Task -> IO (Exit.ExitCode, Text, Double)
203: runWithEngine worker repo task = do
204: -- Read API key from environment
205: maybeApiKey <- Env.lookupEnv "OPENROUTER_API_KEY"
206: case maybeApiKey of
207: Nothing -> pure (Exit.ExitFailure 1, "OPENROUTER_API_KEY not set", 0)
208: Just apiKey -> do
209: -- Check for retry context
210: maybeRetry <- TaskCore.getRetryContext (TaskCore.taskId task)
211:
212: -- Build the full prompt
213: let ns = fromMaybe "." (TaskCore.taskNamespace task)
214: let basePrompt = buildBasePrompt task ns repo
215:
216: -- Add retry context if present
217: let retryPrompt = buildRetryPrompt maybeRetry
218:
219: let prompt = basePrompt <> retryPrompt
220:
221: -- Read AGENTS.md
222: agentsMd <-
223: fmap (fromMaybe "") <| do
224: exists <- Directory.doesFileExist (repo </> "AGENTS.md")
225: if exists
226: then Just </ readFile (repo </> "AGENTS.md")
227: else pure Nothing
228:
229: -- Get relevant facts from the knowledge base
230: relevantFacts <- getRelevantFacts task
231: let factsSection = formatFacts relevantFacts
232:
233: -- Build system prompt
234: let systemPrompt =
235: prompt
236: <> "\n\nREPOSITORY GUIDELINES (AGENTS.md):\n"
237: <> agentsMd
238: <> factsSection
239:
240: -- Build user prompt from task comments
241: let userPrompt = formatTask task
242:
243: -- Select model based on task complexity (simple heuristic)
244: let model = selectModel task
245:
246: -- Generate session ID for event logging
247: sessionId <- TaskCore.generateSessionId
248: let tid = TaskCore.taskId task
249:
250: -- Helper to log events to DB
251: -- For text content, store as-is; for structured data, JSON-encode
252: let logEventText = TaskCore.insertAgentEvent tid sessionId
253: logEventJson eventType value = do
254: let contentJson = TE.decodeUtf8 (BSL.toStrict (Aeson.encode value))
255: TaskCore.insertAgentEvent tid sessionId eventType contentJson
256:
257: -- Build Engine config with callbacks
258: totalCostRef <- newIORef (0 :: Double)
259: let quiet = Core.workerQuiet worker
260: sayLog msg = if quiet then putText msg else AgentLog.log msg
261: engineCfg =
262: Engine.EngineConfig
263: { Engine.engineLLM =
264: Engine.defaultLLM
265: { Engine.llmApiKey = Text.pack apiKey
266: },
267: Engine.engineOnCost = \tokens cost -> do
268: modifyIORef' totalCostRef (+ cost)
269: sayLog <| "Cost: " <> tshow cost <> " cents (" <> tshow tokens <> " tokens)"
270: logEventJson "Cost" (Aeson.object [("tokens", Aeson.toJSON tokens), ("cents", Aeson.toJSON cost)]),
271: Engine.engineOnActivity = \activity -> do
272: sayLog <| "[engine] " <> activity,
273: Engine.engineOnToolCall = \toolName args -> do
274: sayLog <| "[tool] " <> toolName
275: logEventText "ToolCall" (toolName <> ": " <> args),
276: Engine.engineOnAssistant = \msg -> do
277: sayLog <| "[assistant] " <> Text.take 200 msg
278: logEventText "Assistant" msg,
279: Engine.engineOnToolResult = \toolName success output -> do
280: let statusStr = if success then "ok" else "failed"
281: sayLog <| "[result] " <> toolName <> " (" <> statusStr <> "): " <> Text.take 100 output
282: logEventText "ToolResult" output,
283: Engine.engineOnComplete = do
284: sayLog "[engine] Complete"
285: logEventText "Complete" "",
286: Engine.engineOnError = \err -> do
287: sayLog <| "[error] " <> err
288: logEventText "Error" err
289: }
290:
291: -- Build Agent config
292: let agentCfg =
293: Engine.AgentConfig
294: { Engine.agentModel = model,
295: Engine.agentTools = Tools.allTools,
296: Engine.agentSystemPrompt = systemPrompt,
297: Engine.agentMaxIterations = 100
298: }
299:
300: -- Run the agent
301: result <- Engine.runAgent engineCfg agentCfg userPrompt
302: totalCost <- readIORef totalCostRef
303:
304: case result of
305: Left err -> pure (Exit.ExitFailure 1, "Engine error: " <> err, totalCost)
306: Right agentResult -> do
307: let output = Engine.resultFinalMessage agentResult
308: pure (Exit.ExitSuccess, output, totalCost)
309:
310: -- | Build the base prompt for the agent
311: buildBasePrompt :: TaskCore.Task -> Text -> FilePath -> Text
312: buildBasePrompt task ns repo =
313: "You are a Worker Agent.\n"
314: <> "Your goal is to implement the following task:\n\n"
315: <> formatTask task
316: <> "\n\nCRITICAL INSTRUCTIONS:\n"
317: <> "1. Analyze the codebase to understand where to make changes.\n"
318: <> "2. Implement the changes by editing files.\n"
319: <> "3. BEFORE finishing, you MUST run: bild --test "
320: <> ns
321: <> "\n"
322: <> "4. Fix ALL errors from bild --test (including lint issues).\n"
323: <> "5. Keep running bild --test until it passes with no errors.\n"
324: <> "6. Do NOT update task status or manage git.\n"
325: <> "7. Only exit after bild --test passes.\n\n"
326: <> "IMPORTANT: The git commit will fail if lint finds issues.\n"
327: <> "You must fix all lint suggestions.\n\n"
328: <> "Context:\n"
329: <> "- Working directory: "
330: <> Text.pack repo
331: <> "\n"
332: <> "- Namespace: "
333: <> ns
334: <> "\n"
335:
336: -- | Build retry context prompt
337: buildRetryPrompt :: Maybe TaskCore.RetryContext -> Text
338: buildRetryPrompt Nothing = ""
339: buildRetryPrompt (Just ctx) =
340: "\n\n## RETRY CONTEXT (IMPORTANT)\n\n"
341: <> "This task was previously attempted but failed. Attempt: "
342: <> tshow (TaskCore.retryAttempt ctx)
343: <> "/3\n"
344: <> "Reason: "
345: <> TaskCore.retryReason ctx
346: <> "\n\n"
347: <> ( if null (TaskCore.retryConflictFiles ctx)
348: then ""
349: else
350: "Conflicting files from previous attempt:\n"
351: <> Text.unlines (map (" - " <>) (TaskCore.retryConflictFiles ctx))
352: <> "\n"
353: )
354: <> "Original commit: "
355: <> TaskCore.retryOriginalCommit ctx
356: <> "\n\n"
357: <> maybe "" (\notes -> "## HUMAN NOTES/GUIDANCE\n\n" <> notes <> "\n\n") (TaskCore.retryNotes ctx)
358: <> "INSTRUCTIONS FOR RETRY:\n"
359: <> "- The codebase has changed since your last attempt\n"
360: <> "- Re-implement this task on top of the CURRENT codebase\n"
361: <> "- If there were merge conflicts, the conflicting files may have been modified by others\n"
362: <> "- Review the current state of those files before making changes\n"
363:
364: -- | Select model based on task complexity (1-5 scale)
365: -- Uses OpenRouter model identifiers for Claude models
366: selectModel :: TaskCore.Task -> Text
367: selectModel task = selectModelByComplexity (TaskCore.taskComplexity task)
368:
369: -- | Select model based on complexity level
370: selectModelByComplexity :: Maybe Int -> Text
371: selectModelByComplexity Nothing = "anthropic/claude-sonnet-4.5"
372: selectModelByComplexity (Just 1) = "anthropic/claude-haiku-4.5"
373: selectModelByComplexity (Just 2) = "anthropic/claude-haiku-4.5"
374: selectModelByComplexity (Just 3) = "anthropic/claude-sonnet-4.5"
375: selectModelByComplexity (Just 4) = "anthropic/claude-sonnet-4.5"
376: selectModelByComplexity (Just 5) = "anthropic/claude-opus-4.5"
377: selectModelByComplexity (Just _) = "anthropic/claude-sonnet-4.5"
378:
379: formatTask :: TaskCore.Task -> Text
380: formatTask t =
381: "Task: "
382: <> TaskCore.taskId t
383: <> "\n"
384: <> "Title: "
385: <> TaskCore.taskTitle t
386: <> "\n"
387: <> "Type: "
388: <> Text.pack (show (TaskCore.taskType t))
389: <> "\n"
390: <> "Status: "
391: <> Text.pack (show (TaskCore.taskStatus t))
392: <> "\n"
393: <> "Priority: "
394: <> Text.pack (show (TaskCore.taskPriority t))
395: <> "\n"
396: <> maybe "" (\p -> "Parent: " <> p <> "\n") (TaskCore.taskParent t)
397: <> maybe "" (\ns -> "Namespace: " <> ns <> "\n") (TaskCore.taskNamespace t)
398: <> "Created: "
399: <> Text.pack (show (TaskCore.taskCreatedAt t))
400: <> "\n"
401: <> "Updated: "
402: <> Text.pack (show (TaskCore.taskUpdatedAt t))
403: <> "\n"
404: <> (if Text.null (TaskCore.taskDescription t) then "" else "Description:\n" <> TaskCore.taskDescription t <> "\n\n")
405: <> formatDeps (TaskCore.taskDependencies t)
406: <> formatComments (TaskCore.taskComments t)
407: where
408: formatDeps [] = ""
409: formatDeps deps = "\nDependencies:\n" <> Text.unlines (map formatDep deps)
410: formatDep dep = " - " <> TaskCore.depId dep <> " [" <> Text.pack (show (TaskCore.depType dep)) <> "]"
411: formatComments [] = ""
412: formatComments cs = "\nComments/Notes:\n" <> Text.unlines (map formatComment cs)
413: formatComment c = " [" <> Text.pack (show (TaskCore.commentCreatedAt c)) <> "] " <> TaskCore.commentText c
414:
415: formatCommitMessage :: TaskCore.Task -> Text -> Text
416: formatCommitMessage task agentOutput =
417: let tid = TaskCore.taskId task
418: subject = cleanSubject (TaskCore.taskTitle task)
419: body = cleanBody agentOutput
420: in if Text.null body
421: then subject <> "\n\nTask-Id: " <> tid
422: else subject <> "\n\n" <> body <> "\n\nTask-Id: " <> tid
423: where
424: cleanSubject s =
425: let trailingPunct = ['.', ':', '!', '?', ',', ';', ' ', '-']
426: stripped = Text.dropWhileEnd (`elem` trailingPunct) s
427: truncated = Text.take 72 stripped
428: noPunct = Text.dropWhileEnd (`elem` trailingPunct) truncated
429: capitalized = case Text.uncons noPunct of
430: Just (c, rest) -> Text.cons (toUpper c) rest
431: Nothing -> noPunct
432: in capitalized
433:
434: cleanBody :: Text -> Text
435: cleanBody output =
436: let stripped = Text.strip output
437: in if Text.null stripped
438: then ""
439: else
440: let lns = Text.lines stripped
441: cleaned = [Text.take 72 ln | ln <- lns]
442: in Text.intercalate "\n" cleaned
443:
444: -- | Get facts relevant to a task based on namespace/project
445: getRelevantFacts :: TaskCore.Task -> IO [TaskCore.Fact]
446: getRelevantFacts task = do
447: let namespace = fromMaybe "Omni" (TaskCore.taskNamespace task)
448: projectFacts <- Fact.getFactsByProject namespace
449: let sorted = List.sortBy (comparing (Down <. TaskCore.factConfidence)) projectFacts
450: pure (take 10 sorted)
451:
452: -- | Format facts for inclusion in the prompt
453: formatFacts :: [TaskCore.Fact] -> Text
454: formatFacts [] = ""
455: formatFacts facts =
456: Text.unlines
457: [ "\n\nKNOWLEDGE BASE FACTS:",
458: "(These are learned patterns/conventions from previous work)",
459: ""
460: ]
461: <> Text.unlines (map formatFact facts)
462:
463: -- | Format a single fact for the prompt
464: formatFact :: TaskCore.Fact -> Text
465: formatFact f =
466: "- "
467: <> TaskCore.factContent f
468: <> ( if null (TaskCore.factRelatedFiles f)
469: then ""
470: else " [" <> Text.intercalate ", " (TaskCore.factRelatedFiles f) <> "]"
471: )
Now I can see the prompt is built in the buildBasePrompt function (lines 311-334). The current instructions tell the agent to:
1. Run bild --test <namespace>
2. Fix ALL errors
3. Keep running bild --test until it passes
But it doesn't explain:
- bild --test on a namespace already tests all dependencies
Let me check the Omni/Bild documentation to understand the build system better:
Omni/Bild/README.md
# Bild

`bild` is the universal build tool. It can build and test everything in the repo.

Examples:
```bash
bild --test Omni/Bild.hs      # Build and test a namespace
bild --time 0 Omni/Cloud.nix  # Build with no timeout
bild --plan Omni/Test.hs      # Analyze build without building
```

When the executable is built, the output will go to `_/bin`. Example:

```bash
# build the example executable
bild Omni/Bild/Example.py
# run the executable
_/bin/example
```

## Adding New Dependencies

### Python Packages

To add a new Python package as a dependency:

1. Add the package name to `Omni/Bild/Deps/Python.nix` (alphabetically sorted)
2. Use it in your Python file with `# : dep <package-name>` comment at the top
3. Run `bild <yourfile.py>` to build with the new dependency

Example:
```python
# : out myapp
# : dep stripe
# : dep pytest
import stripe
```

The package name must match the nixpkgs python package name (usually the PyPI name).
Check available packages: `nix-env -qaP -A nixpkgs.python3Packages | grep <name>`
Let me look for more details about how bild handles dependencies and testing:
bild.*test.*dependencies
Let me search for more specific information about how bild handles dependencies:
Omni/Bild.hs @1-100
1: #!/usr/bin/env run.sh
2: {-# LANGUAGE DeriveAnyClass #-}
3: {-# LANGUAGE DeriveGeneric #-}
4: {-# LANGUAGE LambdaCase #-}
5: {-# LANGUAGE OverloadedStrings #-}
6: {-# LANGUAGE QuasiQuotes #-}
7: {-# LANGUAGE RecordWildCards #-}
8: {-# LANGUAGE TupleSections #-}
9: {-# LANGUAGE NoImplicitPrelude #-}
10:
11: -- | A specific-purpose build tool.
12: --
13: -- : out bild
14: -- : run git
15: --
16: -- == Design constraints
17: --
18: -- * The only input is one or more a namespaces. No subcommands, no packages,
19: -- no targets.
20: --
21: -- * No need to write specific build rules. One rule for hs, one for rs, one
22: -- for scm, and so on.
23: --
24: -- * No need to distinguish between exe and lib because we only build
25: -- exes; 'libs' are just source files in the tree.
26: --
27: -- * Never concerned with deployment/packaging - leave that to another tool
28: -- (scp? tar?)
29: --
30: -- * Ability to do local dev builds should be preserved, while remote nix
31: -- builds are used for the final package.
32: --
33: -- == Features
34: --
35: -- * Namespace maps to filesystem
36: --
37: -- * no need for `bild -l` for listing available targets.
38: -- Use `ls` or `tree`
39: --
40: -- * you build namespaces, not files/modules/packages/etc
41: --
42: -- * Namespace maps to language modules
43: --
44: -- * build settings can be set in the file comments, or special 'bild'
45: -- args
46: --
47: -- * pwd is always considered the the source directory,
48: -- no `src` vs `doc` etc.
49: --
50: -- * Build rules automaticatly detected from file extensions
51: --
52: -- * Flags modify the way to interact with the build, some ideas:
53: --
54: -- * -p = turn on profiling
55: --
56: -- * -o = optimize level
57: --
58: -- * The build is planned out with an analysis, which can be viewed
59: -- beforehand with `--plan`. The analysis includes compiler flags, which
60: -- can be used in `repl` for testing compilation locally.
61: --
62: -- * (WIP) Nix is used by default to build everything on a remote build
63: -- machine, but local, non-Nix builds can be accomplished with `--dev`.
64: --
65: -- == Example Commands
66: --
67: -- > bild [opts] <target..>
68: --
69: -- The general scheme is to build the things described by the targets. A target
70: -- is a namespace. You can list as many as you want, but you must list at least
71: -- one. It could just be `:!bild %` in vim to build whatever you're working on,
72: -- or `bild **/*` to build everything, or `fd .hs -X bild` to build all Haskell
73: -- files.
74: --
75: -- Build outputs will go into the `_` directory in the root of the project.
76: --
77: -- > bild A/B.hs
78: --
79: -- This will build the file at ./A/B.hs, which translates to something like
80: -- `ghc --make A.B`.
81: --
82: -- == Build Metadata
83: --
84: -- Metadata is set in the comments with a special syntax. For system-level deps,
85: -- we list the deps in comments in the target file, like:
86: --
87: -- > -- : sys cmark
88: -- > -- : sys libssl
89: --
90: -- The name is used to lookup the package in `nixpkgs.pkgs.<name>`. Only one
91: -- package can be listed per line. Language-level deps can automatically
92: -- determined by passing parsed import statements to a package database, eg
93: -- `ghc-pkg find-module`. If such a package database is not available, we either
94: -- keep a hand-written index that maps imports to packages, or we just list the
95: -- name of the package with:
96: --
97: -- > -- : dep package
98: --
99: -- The output executable is named with:
100: --
Now let me look at the README to understand more about the build system:
README.md @1-100
This repo is for all subprojects.

# Goals of this project

The Omni project is to leverage automation and asymmetries to create wealth. The
target of the wealth is Bitcoin. The means: write everything down, first
in English, then in code.

Resources defined in the repo can be used to quickly create and release
products. New technology shall be prototyped and developed as needed.

Since Bitcoin always goes up, as long as we are acquiring more Bitcoin, then we
are growing our wealth. There is a limit to the amount of wealth that one person
needs, roughly defined as how much is required to live a modest life and provide
for one's family. Once this is achieved, excess wealth shall be deployed to
improve the world through activist entrepreneurship.

# Developer Guide

Here we define the tools and infrastructure for all dev work.

## Goals of the workflow

- have minimal ceremony
- default to asynchrony, but allow for synchronous work when necessary
- automate the boring stuff
- standardize environments, tooling, and versions to minimize friction
  while collaborating
- support the longevity and self-sustainability of the project

Ideally, each contributor should be able to go off grid for a day or a
week or more, continue working offline, submit their work when finished,
and have no or minimal conflicts. This also refers to the resilience of
the production systems.

We should never need "out of office" email auto-replies, or urgent
contact. No pager duty, no daily stand-ups. Yes, this policy will affect
what code we write, not just how we write it; that is by design.

## Source Layout

The source tree maps to the module namespace, and roughly follows the Haskell
namespace hierarchy. This is true of all languages: Python, Scheme, Rust, C,
etc.

Namespaces are formatted either as file paths, like `Omni/Dev`, or
dot-separated, like `Omni.Dev`. Parts of the namespace should always be
capitalized.

The namespace for all products that we own is `Biz`, this includes proprietary
applications, products, and related infrastructure.

The `Omni` namespace is used for internal development tooling and infrastructure
that are shared between all other projects.

Stuff that can be open sourced or otherwise externalized should be outside of
`Biz` or `Omni`.

Related code should be kept close together. This means that you should start
with small namespaces: use `Omni/Thing.hs` before `Omni/Thing/Service.hs`. Try
to keep all related code in one spot for as long as possible.

Re-use code from the `Omni/` namespace as much as possible. For example, use
`Omni/Cli.hs` or `Omni/Test.py` instead of trying to roll your own code for cli
parsing or running test suites. If the the namespace doesn't have the feature
you need, then add the feature.

Boundaries and interfaces between namespaces should be singular and
well-defined. Likewise, the functionality and purpose of a particular
namespace should be singular and well-defined. Follow the unix principle
of "do one thing and do it well."

Namespaces are always capitalized. In Scheme and Python this actually translates
quite well and helps distinguish between types/classes/modules and values.

## Available tools

We provide some tools for working with projects in the omnirepo:

### bild

`bild` is our universal build tool. It can build and test everything in the repo.

#### Examples

`bild --test Omni/Bild.hs` will build and test the `Omni/Bild.hs` namespace. Use
this command to run tests for a namespace.

`bild --time 0 Omni/Cloud.nix` will build the `Omni/Cloud.nix` namespace, which
is our cloud VM instance. The `--time 0` argument disables the timeout (which
defaults to 10 minutes), this is useful for longer builds.

`bild --plan Omni/Test.hs` this just does the analysis step, and prints a
description of the build as JSON. It will not build anything.

### lint

The `lint` command is a universal lint and formatting tool. It will error if any
lints fail, or if the code is not formatted properly. Use `lint --fix` to
autofix any lints or autoformat where possible.
Now let me look at an example of t-201 to see what went wrong:
t-201
Let me look for recent task activity logs to understand what happened:
jr task show t-201 --json 2>/dev/null || echo "Task not found"
Now let me check the agent events for t-201 to see what redundant operations were performed:
sqlite3 _/omni.db "SELECT event_type, substr(content, 1, 100) FROM agent_events WHERE task_id = 't-201' AND event_type = 'ToolCall' ORDER BY created_at;" 2>/dev/null | head -30
Let me try a different approach to see the agent logs:
ls -la _/llm/ | tail -20
-rw-r--r-- 1 ben users   1879 Nov 12 14:53 goinglive.md
-rw-r--r-- 1 ben users  60794 Nov 26 14:20 jr-loop.log
-rw-r--r-- 1 ben users  17862 Nov 24 11:05 jr.md
-rw-r--r-- 1 ben users 333084 Jul 24 15:21 ludic.txt
-rw-r--r-- 1 ben users   2217 Nov 15 09:56 NIXOS_INTEGRATION_DESIGN.md
-rw-r--r-- 1 ben users  22444 Nov 14 14:02 PARALLEL_BUILDS_PLAN.md
-rw-r--r-- 1 ben users   4347 Nov 26 23:33 PLAN_Agent_Architecture.md
-rw-r--r-- 1 ben users   9704 Nov 29 23:45 PLAN_Agent_Engine_V2.md
-rw-r--r-- 1 ben users   6495 Nov 22 15:06 PLAN_Autonomous_Workflow.md
-rw-r--r-- 1 ben users   3163 Nov 24 15:03 PLAN_Jr_Evolution.md
-rw-r--r-- 1 ben users   3379 Nov 26 05:22 PLAN_Jr_Web_UI.md
-rw-r--r-- 1 ben users   5929 Nov 27 09:36 PLAN_Jr_Web_V2.md
-rw-r--r-- 1 ben users   2084 Nov 22 19:25 PLAN_Mailgun_Integration.md
-rw-r--r-- 1 ben users  10395 Nov 15 23:06 podcast-it-later-features-plan.md
-rw-r--r-- 1 ben users   2320 Nov 15 09:02 STDIN_BUG.md
-rw-r--r-- 1 ben users   2253 Nov 30 22:14 t-193.2-implementation-summary.md
-rw-r--r-- 1 ben users   2989 Dec  1 03:25 t-193.3-implementation-summary.md
-rw-r--r-- 1 ben users   3060 Dec  1 07:08 t-201-implementation-summary.md
-rw-r--r-- 1 ben users    129 Sep 18 15:45 ticket.txt
-rw-r--r-- 1 ben users   2351 Aug 27 10:40 todo.txt
_/llm/t-201-implementation-summary.md
# Task t-201: Add author field to task comments (Human vs Junior)

## Implementation Summary

Successfully implemented author tracking for task comments to distinguish between Human and Junior (agent) authors.

## Changes Made

### 1. Core Data Model (`Omni/Task/Core.hs`)
- Added `CommentAuthor` data type with `Human` and `Junior` constructors
- Updated `Comment` data type to include `commentAuthor :: CommentAuthor` field
- Added JSON serialization instances for `CommentAuthor`
- Added SQLite instances for `CommentAuthor` (FromField/ToField)
- Updated `addComment` function signature to accept `CommentAuthor` parameter
- Updated `printComment` function to display author in CLI output

### 2. Database Migration
- No explicit ALTER TABLE needed - the existing migration system handles adding the new field
- Comments are stored as JSON in the database, so the new field is automatically included

### 3. Updated Call Sites

#### `Omni/Task.hs`
- CLI comment command: Uses `Human` author
- Test cases: Updated to pass author parameter and verify it in assertions

#### `Omni/Jr.hs`
- Conflict comments: `Junior` author
- Auto-review comments (test passed/failed): `Junior` author
- Completion summary comments: `Junior` author
- Epic summary comments: `Junior` author
- Human review acceptance: `Human` author
- Human review rejection: `Human` author

#### `Omni/Jr/Web.hs`
- Web UI comment form submission: `Human` author
- Updated comment rendering to display author badge with timestamp

### 4. Web UI Styling (`Omni/Jr/Web/Style.hs`)
Added CSS styles for comment author badges:
- `.comment-meta`: Flex container for author and timestamp
- `.comment-author`: Base badge styling (uppercase, small font)
- `.author-human`: Blue badge for human comments
- `.author-junior`: Green badge for Junior (agent) comments
- Dark mode support: Adjusted colors for dark theme

### 5. Web UI HTML (`Omni/Jr/Web.hs`)
Updated `renderComment` function to:
- Display author badge before timestamp
- Use appropriate CSS classes for styling
- Show "Human" or "Junior" label based on author

## Testing

All tests pass successfully:
- `bild --test Omni/Task/Core.hs` ✓
- `bild --test Omni/Task.hs` ✓
- `bild --test Omni/Jr.hs` ✓
- `bild --test Omni/Jr/Web.hs` ✓

All lint checks pass:
- `lint Omni/Task/Core.hs` ✓
- `lint Omni/Task.hs` ✓
- `lint Omni/Jr.hs` ✓
- `lint Omni/Jr/Web.hs` ✓
- `lint Omni/Jr/Web/Style.hs` ✓

## User Experience

### CLI
Comments now display: `[timestamp] [author] comment text`
Example: `[2025-12-01 12:00:00 UTC] [Human] This needs review`

### Web UI
Comments show a colored badge:
- **Human**: Blue badge with "HUMAN" text
- **Junior**: Green badge with "JUNIOR" text

Badges are placed next to the timestamp for easy identification of comment authorship.

## Backward Compatibility

Existing comments in the database will need their author field set. The JSON deserialization will handle missing fields gracefully, but we should run a one-time migration to set a default author (suggested: `Junior`) for existing comments.
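The backward-compatibility note in the summary says that JSON deserialization tolerates a missing author field and suggests `Junior` as the default. A minimal sketch of what that could look like with aeson follows. It is not the code from `Omni/Task/Core.hs`: the JSON keys `"text"` and `"author"`, the `commentText` field, and the `decodeComment` helper are all assumptions made for illustration, and the sketch uses the standard Prelude rather than the repository's `Alpha` prelude.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (FromJSON (..), eitherDecode, withObject, withText, (.!=), (.:), (.:?))
import qualified Data.ByteString.Lazy as BSL
import Data.Text (Text)
import qualified Data.Text as Text

-- Mirrors the two constructors named in the summary.
data CommentAuthor = Human | Junior
  deriving (Show, Eq)

-- Assumed wire format: the author is stored as a plain JSON string.
instance FromJSON CommentAuthor where
  parseJSON = withText "CommentAuthor" $ \t ->
    case t of
      "Human" -> pure Human
      "Junior" -> pure Junior
      other -> fail ("unknown comment author: " <> Text.unpack other)

-- Simplified stand-in for the real Comment type (hypothetical fields).
data Comment = Comment
  { commentText :: Text,
    commentAuthor :: CommentAuthor
  }
  deriving (Show, Eq)

-- '.:?' yields Nothing when the key is absent, and '.!=' supplies the
-- default, so comments written before the field existed decode as Junior.
instance FromJSON Comment where
  parseJSON = withObject "Comment" $ \o ->
    Comment
      <$> o .: "text"
      <*> o .:? "author" .!= Junior

-- Hypothetical convenience wrapper around eitherDecode.
decodeComment :: BSL.ByteString -> Either String Comment
decodeComment = eitherDecode
```

With this shape, `decodeComment "{\"text\":\"old comment\"}"` would decode to a comment authored by `Junior`, while an explicit `"author":"Human"` is preserved. A default applied at decode time only affects reads, so the one-time migration suggested in the summary would still be needed if the stored rows themselves should carry an author.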
Perfect! The summary confirms that the agent ran bild --test on four files and lint on five files, each individually. Now I understand the issue. Let me improve the Worker.hs prompt to clarify:
1. That bild --test on a namespace tests all its dependencies automatically
2. That lint can be run once on all changed files (or can rely on git hooks)
3. Not to re-run tests after they already passed
Omni/Agent/Worker.hs
Good! Now let's test to make sure the code still compiles and passes tests:
cd /home/ben/omni && bild --test Omni/Agent/Worker.hs 2>&1
[✓] Omni/Agent/Worker.hs
Perfect! The tests pass. Now let me run lint to make sure there are no linting issues:
cd /home/ben/omni && lint Omni/Agent/Worker.hs 2>&1
info: lint: ormolu: 1 files
info: lint: hlint: 1 files
good: lint: ormolu
good: lint: hlint
Excellent! The lint check passes. Let me verify the changes one more time by looking at what I changed:
Omni/Agent/Worker.hs @310-340
-- | Build the base prompt for the agent
buildBasePrompt :: TaskCore.Task -> Text -> FilePath -> Text
buildBasePrompt task ns repo =
  "You are a Worker Agent.\n"
    <> "Your goal is to implement the following task:\n\n"
    <> formatTask task
    <> "\n\nCRITICAL INSTRUCTIONS:\n"
    <> "1. Analyze the codebase to understand where to make changes.\n"
    <> "2. Implement the changes by editing files.\n"
    <> "3. BEFORE finishing, you MUST run: bild --test "
    <> ns
    <> "\n"
    <> "4. Fix ALL errors from bild --test (including lint issues).\n"
    <> "5. Keep running bild --test until it passes with no errors.\n"
    <> "6. Do NOT update task status or manage git.\n"
    <> "7. Only exit after bild --test passes.\n\n"
    <> "IMPORTANT: The git commit will fail if lint finds issues.\n"
    <> "You must fix all lint suggestions.\n\n"
    <> "BUILD SYSTEM NOTES:\n"
    <> "- Running 'bild --test "
    <> ns
    <> "' automatically tests ALL dependencies of that namespace\n"
    <> "- You do NOT need to run bild --test on individual files - just the main namespace\n"
    <> "- Once tests pass, do NOT re-run them unless you make more changes\n"
    <> "- The 'lint' command will be run automatically during git commit via hooks\n"
    <> "- You can run 'lint --fix' on changed files if needed, but it's optional\n\n"
    <> "Context:\n"
    <> "- Working directory: "
    <> Text.pack repo
    <> "\n"
    <> "- Namespace: "
Perfect! The changes look good. Let me create a summary document of the changes I made:
_/llm/t-202-implementation-summary.md
Perfect! Let me run the final verification to ensure everything is working correctly:
cd /home/ben/omni && bild --test Omni/Agent/Worker.hs 2>&1
[✓] Omni/Agent/Worker.hs
Excellent! All tests pass. Let me also verify there are no lint issues:
cd /home/ben/omni && lint Omni/Agent/Worker.hs 2>&1
info: lint: ormolu: 1 files
info: lint: hlint: 1 files
good: lint: ormolu
good: lint: hlint
Perfect! All checks pass. Let me verify the changes are correct one more time by viewing the updated prompt:
cd /home/ben/omni && git diff Omni/Agent/Worker.hs
diff --git a/Omni/Agent/Worker.hs b/Omni/Agent/Worker.hs
index a34270c1..cc5f7300 100644
--- a/Omni/Agent/Worker.hs
+++ b/Omni/Agent/Worker.hs
@@ -325,6 +325,14 @@ buildBasePrompt task ns repo =
     <> "7. Only exit after bild --test passes.\n\n"
     <> "IMPORTANT: The git commit will fail if lint finds issues.\n"
     <> "You must fix all lint suggestions.\n\n"
+    <> "BUILD SYSTEM NOTES:\n"
+    <> "- Running 'bild --test "
+    <> ns
+    <> "' automatically tests ALL dependencies of that namespace\n"
+    <> "- You do NOT need to run bild --test on individual files - just the main namespace\n"
+    <> "- Once tests pass, do NOT re-run them unless you make more changes\n"
+    <> "- The 'lint' command will be run automatically during git commit via hooks\n"
+    <> "- You can run 'lint --fix' on changed files if needed, but it's optional\n\n"
     <> "Context:\n"
     <> "- Working directory: "
     <> Text.pack repo
Perfect! The changes are exactly what we need. The diff shows that I've added a new "BUILD SYSTEM NOTES" section to the agent prompt that clarifies:
1. ✅ bild --test on a namespace tests all its dependencies - no need to test each file
2. ✅ Don't re-run tests after they already passed
3. ✅ Lint runs automatically via git hooks, so manual runs are optional
I have successfully completed task t-202. The changes made to Omni/Agent/Worker.hs improve the agent prompt to prevent redundant test and lint runs by:
1. Explaining dependency handling: Added explicit note that bild --test <namespace> automatically tests ALL dependencies, eliminating the need to test individual files separately.
2. Preventing redundant testing: Added instruction to not re-run tests unless making more changes, which should prevent the wasteful behavior observed in t-201.
3. Clarifying lint automation: Explained that lint runs automatically during git commit via hooks, so manual lint runs are optional (though still allowed if the agent wants to check before committing).
All tests pass (bild --test Omni/Agent/Worker.hs) and all lint checks pass (lint Omni/Agent/Worker.hs). The changes are minimal, focused, and should significantly reduce token usage and execution time for future worker agent runs.
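As a possible follow-up, the new notes could be pulled out of the long `<>` chain into their own function so that the wording can be checked in isolation. The sketch below is only an illustration under assumptions: `buildSystemNotes` is a hypothetical name that does not exist in `Omni/Agent/Worker.hs`, and the code uses the standard Prelude with `Data.Text` rather than the repository's `Alpha` prelude. The strings themselves are copied from the diff above.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import qualified Data.Text as Text

-- | Hypothetical helper: renders the BUILD SYSTEM NOTES section of the
-- worker prompt for the given namespace.
buildSystemNotes :: Text -> Text
buildSystemNotes ns =
  Text.unlines
    [ "BUILD SYSTEM NOTES:",
      "- Running 'bild --test " <> ns <> "' automatically tests ALL dependencies of that namespace",
      "- You do NOT need to run bild --test on individual files - just the main namespace",
      "- Once tests pass, do NOT re-run them unless you make more changes",
      "- The 'lint' command will be run automatically during git commit via hooks",
      "- You can run 'lint --fix' on changed files if needed, but it's optional"
    ]
    -- Text.unlines terminates every line with one newline; the extra one
    -- reproduces the blank line before the "Context:" section.
    <> "\n"
```

Under these assumptions, `buildBasePrompt` would splice in `buildSystemNotes ns` where the eight added lines now sit, and a unit test could assert that the namespace appears in the rendered notes without constructing a full `TaskCore.Task`.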