When a guardrail fires (e.g., duplicate tool calls), the agent currently restarts from scratch instead of stopping. This causes a loop where:
1. Agent does work
2. Guardrail fires (e.g., read_file called 5 times)
3. Agent restarts from beginning
4. Agent does same work again
5. Guardrail fires again
6. Repeat
The agent should STOP when a guardrail fires, preserving any changes made. The task should be left InProgress or set back to Open with context about why it stopped.
Observed in t-204: Agent correctly fixed the issue, tests passed, then guardrail fired for duplicate read_file calls. Instead of stopping successfully, it restarted and the guardrail fired 3 more times.
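The requested behavior can be sketched in isolation. The block below is a minimal illustration, not the project's implementation: `RunOutcome`, `TaskAction`, and `nextAction` are hypothetical names that do not appear in the Omni codebase, and the retry-on-failure policy is an assumption about how a caller might react. The idea is to report a guardrail stop as its own outcome, so the caller can keep the work and record the reason instead of treating it like any other error.

```haskell
-- Hypothetical sketch: none of these names come from the Omni codebase.
module Main where

-- | Why a run ended. A guardrail stop is kept distinct from a hard
-- failure so the caller can tell "stop and keep the work" from "failed".
data RunOutcome
  = Completed String -- final assistant message
  | StoppedByGuardrail String -- human-readable guardrail reason
  | Failed String -- transport or parse error
  deriving (Show, Eq)

-- | What the caller does with the task afterwards.
data TaskAction
  = MarkDone
  | LeaveInProgress String -- keep changes, attach the reason as context
  | Retry
  deriving (Show, Eq)

-- | Assumed policy: only hard failures are retried.
-- A guardrail stop is never retried, which breaks the restart loop.
nextAction :: RunOutcome -> TaskAction
nextAction (Completed _) = MarkDone
nextAction (StoppedByGuardrail reason) = LeaveInProgress reason
nextAction (Failed _) = Retry

main :: IO ()
main =
  mapM_
    (print . nextAction)
    [ Completed "tests pass",
      StoppedByGuardrail "duplicate tool calls (read_file called 5 times)",
      Failed "HTTP error: 500"
    ]
```

With a split like this, the t-204 case would end in `LeaveInProgress` with the guardrail message attached, rather than re-entering the agent loop.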
I'll start by reading AGENTS.md to understand the codebase conventions, then examine the agent code to understand how guardrails currently work.
AGENTS.md
1: # Omni
2:
3: The Omni project is to leverage automation and asymmetries to create wealth.
4:
5: ## Critical Rules for AI Agents
6:
7: **CRITICAL**: This project uses `jr task` for ALL issue tracking. You MUST follow these rules:
8:
9: - ✅ Use `jr task` for ALL task/TODO tracking (`jr task create ... --json`)
10: - ✅ Link discovered work with `--discovered-from=<parent-id>`
11: - ✅ File bugs IMMEDIATELY when you discover unexpected behavior
12: - ✅ Add facts to the knowledge base when you learn something useful (`jr facts add ...`)
13: - ✅ Run `jr task ready --json` before asking "what should I work on?"
14: - ✅ Store AI planning docs in `_/llm` directory (NEVER in repo root)
15: - ❌ Do NOT use `todo_write` tool
16: - ❌ Do NOT create markdown TODO lists or task checklists
17: - ❌ Do NOT put TODO/FIXME comments in code
18:
19: ## Cheat Sheet
20:
21: **Standard workflow:**
22: ```bash
23: # 1. Find ready work
24: jr task ready --json
25:
26: # 2. Claim a task
27: jr task update <id> in-progress --json
28:
29: # 3. During work: create discovered issues
30: jr task create "Fix type error found" --discovered-from=<current-id> --json
31:
32: # 4. Complete the task
33: jr task update <id> done --json
34: ```
35:
36: **Bug Discovery:**
37: ```bash
38: # Create a task immediately
39: jr task create "Command X fails when Y" --discovered-from=<current-task-id> --json
40: ```
41:
42: ## Directory Structure
43:
44: - **`_/`** (cabdir) - All ephemeral/generated files. This directory is gitignored.
45: - `_/llm/` - AI planning docs and agent logs
46: - `_/tmp/` - Temporary files, test databases, scratch data
47: - Never create dotfile directories (like `.tasks/`) in the repo root
48:
49: ## Common Commands
50:
51: **Build:**
52: ```bash
53: bild Omni/Jr.hs # Build a Haskell namespace
54: bild Biz/Cloud/Api.py # Build a Python namespace
55: ```
56:
57: **Lint and Format:**
58: ```bash
59: lint Omni/Cli.hs # Lint a file (DO NOT use hlint directly)
60: lint --fix **/*.py # Lint and auto-fix Python files
61: ```
62:
63: **Type Check:**
64: ```bash
65: typecheck.sh Omni/Bild/Example.py # Run mypy on Python files
66: ```
67:
68: **Test:**
69: ```bash
70: bild --test Omni/Jr.hs # Build and run tests for a namespace
71: ```
72:
73: **Run:**
74: ```bash
75: Omni/Ide/run.sh Omni/Jr.hs # Build (if needed) and run
76: ```
77:
78: ## Documentation
79:
80: - **Project Context**: [README.md](README.md) - Goals, source layout, and coding conventions.
81: - **Task Manager**: [`Omni/Task/README.md`](Omni/Task/README.md) - Detailed usage, dependency management, and agent best practices.
82: - **Build Tool (Bild)**: [`Omni/Bild/README.md`](Omni/Bild/README.md) - How to use `bild` and manage dependencies.
83: - **Development Tools**: [`Omni/Ide/README.md`](Omni/Ide/README.md) - `run.sh`, `lint`, `repl.sh`, git workflow.
Omni/Agent.hs
1: {-# LANGUAGE OverloadedStrings #-}
2: {-# LANGUAGE NoImplicitPrelude #-}
3:
4: -- | Agent system entry point and combined test runner.
5: --
6: -- This module provides the main entry point for the agent system
7: -- and re-exports core types from sub-modules.
8: --
9: -- : out omni-agent
10: -- : dep aeson
11: module Omni.Agent
12: ( -- * Engine
13: module Omni.Agent.Engine,
14:
15: -- * Tools
16: module Omni.Agent.Tools,
17:
18: -- * Core
19: module Omni.Agent.Core,
20:
21: -- * Test
22: main,
23: test,
24: )
25: where
26:
27: import Alpha
28: import Omni.Agent.Core
29: import Omni.Agent.Engine hiding (main, test)
30: import qualified Omni.Agent.Engine as Engine
31: import Omni.Agent.Tools hiding (ToolResult, main, test)
32: import qualified Omni.Agent.Tools as Tools
33: import qualified Omni.Test as Test
34:
35: main :: IO ()
36: main = Test.run test
37:
38: test :: Test.Tree
39: test =
40: Test.group
41: "Omni.Agent"
42: [ Engine.test,
43: Tools.test,
44: Test.unit "Core types are re-exported" <| do
45: let status = Idle :: WorkerStatus
46: status Test.@=? status,
47: Test.unit "Engine and Tools integrate correctly" <| do
48: let tools = Tools.allTools
49: length tools Test.@=? 5
50: let config =
51: Engine.defaultAgentConfig
52: { Engine.agentTools = tools
53: }
54: Engine.agentMaxIterations config Test.@=? 10
55: ]
Now let me examine the Engine module since that's where the agent logic likely resides:
Omni/Agent/Engine.hs
1: {-# LANGUAGE DeriveGeneric #-}
2: {-# LANGUAGE OverloadedStrings #-}
3: {-# LANGUAGE NoImplicitPrelude #-}
4:
5: -- | LLM Agent Engine - Tool protocol and LLM provider abstraction.
6: --
7: -- This module provides the core abstractions for building LLM-powered agents:
8: -- - Tool: Defines tools that agents can use
9: -- - LLM: OpenAI-compatible chat completions API provider
10: -- - AgentConfig: Configuration for running agents
11: --
12: -- : out omni-agent-engine
13: -- : dep http-conduit
14: -- : dep aeson
15: -- : dep case-insensitive
16: module Omni.Agent.Engine
17: ( Tool (..),
18: LLM (..),
19: EngineConfig (..),
20: AgentConfig (..),
21: AgentResult (..),
22: Guardrails (..),
23: GuardrailResult (..),
24: Message (..),
25: Role (..),
26: ToolCall (..),
27: FunctionCall (..),
28: ToolResult (..),
29: ChatCompletionRequest (..),
30: ChatCompletionResponse (..),
31: Choice (..),
32: Usage (..),
33: defaultLLM,
34: defaultEngineConfig,
35: defaultAgentConfig,
36: defaultGuardrails,
37: chat,
38: runAgent,
39: main,
40: test,
41: )
42: where
43:
44: import Alpha
45: import Data.Aeson ((.!=), (.:), (.:?), (.=))
46: import qualified Data.Aeson as Aeson
47: import qualified Data.Aeson.KeyMap as KeyMap
48: import qualified Data.ByteString.Lazy as BL
49: import qualified Data.CaseInsensitive as CI
50: import qualified Data.Map.Strict as Map
51: import qualified Data.Text as Text
52: import qualified Data.Text.Encoding as TE
53: import qualified Network.HTTP.Simple as HTTP
54: import qualified Omni.Test as Test
55:
56: main :: IO ()
57: main = Test.run test
58:
59: test :: Test.Tree
60: test =
61: Test.group
62: "Omni.Agent.Engine"
63: [ Test.unit "Tool JSON roundtrip" <| do
64: let tool =
65: Tool
66: { toolName = "get_weather",
67: toolDescription = "Get weather for a location",
68: toolJsonSchema = Aeson.object ["type" .= ("object" :: Text), "properties" .= Aeson.object []],
69: toolExecute = \_ -> pure (Aeson.String "sunny")
70: }
71: let encoded = encodeToolForApi tool
72: case Aeson.decode (Aeson.encode encoded) of
73: Nothing -> Test.assertFailure "Failed to decode tool"
74: Just decoded -> toolName tool Test.@=? toolApiName decoded,
75: Test.unit "Message JSON roundtrip" <| do
76: let msg = Message User "Hello" Nothing Nothing
77: case Aeson.decode (Aeson.encode msg) of
78: Nothing -> Test.assertFailure "Failed to decode message"
79: Just decoded -> msgContent msg Test.@=? msgContent decoded,
80: Test.unit "defaultLLM has correct endpoint" <| do
81: llmBaseUrl defaultLLM Test.@=? "https://openrouter.ai/api/v1",
82: Test.unit "defaultLLM has OpenRouter headers" <| do
83: length (llmExtraHeaders defaultLLM) Test.@=? 2
84: llmModel defaultLLM Test.@=? "anthropic/claude-sonnet-4.5",
85: Test.unit "defaultAgentConfig has sensible defaults" <| do
86: agentMaxIterations defaultAgentConfig Test.@=? 10,
87: Test.unit "defaultEngineConfig has no-op callbacks" <| do
88: engineOnCost defaultEngineConfig 100 5
89: engineOnActivity defaultEngineConfig "test"
90: engineOnToolCall defaultEngineConfig "tool" "result"
91: True Test.@=? True,
92: Test.unit "buildToolMap creates correct map" <| do
93: let tool1 =
94: Tool
95: { toolName = "tool1",
96: toolDescription = "First tool",
97: toolJsonSchema = Aeson.object [],
98: toolExecute = \_ -> pure Aeson.Null
99: }
100: tool2 =
101: Tool
102: { toolName = "tool2",
103: toolDescription = "Second tool",
104: toolJsonSchema = Aeson.object [],
105: toolExecute = \_ -> pure Aeson.Null
106: }
107: toolMap = buildToolMap [tool1, tool2]
108: Map.size toolMap Test.@=? 2
109: Map.member "tool1" toolMap Test.@=? True
110: Map.member "tool2" toolMap Test.@=? True,
111: Test.unit "Usage JSON parsing" <| do
112: let json = "{\"prompt_tokens\":100,\"completion_tokens\":50,\"total_tokens\":150}"
113: case Aeson.decode json of
114: Nothing -> Test.assertFailure "Failed to decode usage"
115: Just usage -> do
116: usagePromptTokens usage Test.@=? 100
117: usageCompletionTokens usage Test.@=? 50
118: usageTotalTokens usage Test.@=? 150
119: usageCost usage Test.@=? Nothing,
120: Test.unit "Usage JSON parsing with cost" <| do
121: let json = "{\"prompt_tokens\":194,\"completion_tokens\":2,\"total_tokens\":196,\"cost\":0.95}"
122: case Aeson.decode json of
123: Nothing -> Test.assertFailure "Failed to decode usage with cost"
124: Just usage -> do
125: usagePromptTokens usage Test.@=? 194
126: usageCompletionTokens usage Test.@=? 2
127: usageTotalTokens usage Test.@=? 196
128: usageCost usage Test.@=? Just 0.95,
129: Test.unit "AgentResult JSON roundtrip" <| do
130: let result =
131: AgentResult
132: { resultFinalMessage = "Done",
133: resultToolCallCount = 3,
134: resultIterations = 2,
135: resultTotalCost = 50,
136: resultTotalTokens = 1500
137: }
138: case Aeson.decode (Aeson.encode result) of
139: Nothing -> Test.assertFailure "Failed to decode AgentResult"
140: Just decoded -> do
141: resultFinalMessage decoded Test.@=? "Done"
142: resultToolCallCount decoded Test.@=? 3
143: resultIterations decoded Test.@=? 2,
144: Test.unit "estimateCost calculates correctly" <| do
145: let gpt4oCost = estimateCost "gpt-4o" 1000
146: gpt4oMiniCost = estimateCost "gpt-4o-mini" 1000
147: (gpt4oCost >= gpt4oMiniCost) Test.@=? True
148: (gpt4oCost > 0) Test.@=? True,
149: Test.unit "ToolCall JSON roundtrip" <| do
150: let tc =
151: ToolCall
152: { tcId = "call_123",
153: tcType = "function",
154: tcFunction = FunctionCall "read_file" "{\"path\":\"/tmp/test\"}"
155: }
156: case Aeson.decode (Aeson.encode tc) of
157: Nothing -> Test.assertFailure "Failed to decode ToolCall"
158: Just decoded -> tcId decoded Test.@=? "call_123",
159: Test.unit "FunctionCall JSON roundtrip" <| do
160: let fc = FunctionCall "test_func" "{\"arg\":\"value\"}"
161: case Aeson.decode (Aeson.encode fc) of
162: Nothing -> Test.assertFailure "Failed to decode FunctionCall"
163: Just decoded -> do
164: fcName decoded Test.@=? "test_func"
165: fcArguments decoded Test.@=? "{\"arg\":\"value\"}",
166: Test.unit "Role JSON roundtrip for all roles" <| do
167: let roles = [System, User, Assistant, ToolRole]
168: forM_ roles <| \role ->
169: case Aeson.decode (Aeson.encode role) of
170: Nothing -> Test.assertFailure ("Failed to decode Role: " <> show role)
171: Just decoded -> decoded Test.@=? role,
172: Test.unit "defaultGuardrails has sensible defaults" <| do
173: guardrailMaxCostCents defaultGuardrails Test.@=? 100.0
174: guardrailMaxTokens defaultGuardrails Test.@=? 500000
175: guardrailMaxDuplicateToolCalls defaultGuardrails Test.@=? 3
176: guardrailMaxTestFailures defaultGuardrails Test.@=? 3,
177: Test.unit "checkCostGuardrail detects exceeded budget" <| do
178: let g = defaultGuardrails {guardrailMaxCostCents = 50.0}
179: checkCostGuardrail g 60.0 Test.@=? GuardrailCostExceeded 60.0 50.0
180: checkCostGuardrail g 40.0 Test.@=? GuardrailOk,
181: Test.unit "checkTokenGuardrail detects exceeded budget" <| do
182: let g = defaultGuardrails {guardrailMaxTokens = 1000}
183: checkTokenGuardrail g 1500 Test.@=? GuardrailTokensExceeded 1500 1000
184: checkTokenGuardrail g 500 Test.@=? GuardrailOk,
185: Test.unit "checkDuplicateGuardrail detects repeated calls" <| do
186: let g = defaultGuardrails {guardrailMaxDuplicateToolCalls = 3}
187: counts = Map.fromList [("bash", 3), ("read_file", 1)]
188: case checkDuplicateGuardrail g counts of
189: GuardrailDuplicateToolCalls name count -> do
190: name Test.@=? "bash"
191: count Test.@=? 3
192: _ -> Test.assertFailure "Expected GuardrailDuplicateToolCalls"
193: checkDuplicateGuardrail g (Map.fromList [("bash", 2)]) Test.@=? GuardrailOk,
194: Test.unit "checkTestFailureGuardrail detects failures" <| do
195: let g = defaultGuardrails {guardrailMaxTestFailures = 3}
196: checkTestFailureGuardrail g 3 Test.@=? GuardrailTestFailures 3
197: checkTestFailureGuardrail g 2 Test.@=? GuardrailOk,
198: Test.unit "updateToolCallCounts accumulates correctly" <| do
199: let tc1 = ToolCall "1" "function" (FunctionCall "bash" "{}")
200: tc2 = ToolCall "2" "function" (FunctionCall "bash" "{}")
201: tc3 = ToolCall "3" "function" (FunctionCall "read_file" "{}")
202: counts = updateToolCallCounts Map.empty [tc1, tc2, tc3]
203: Map.lookup "bash" counts Test.@=? Just 2
204: Map.lookup "read_file" counts Test.@=? Just 1,
205: Test.unit "Guardrails JSON roundtrip" <| do
206: let g = Guardrails 75.0 100000 5 4 3
207: case Aeson.decode (Aeson.encode g) of
208: Nothing -> Test.assertFailure "Failed to decode Guardrails"
209: Just decoded -> decoded Test.@=? g,
210: Test.unit "GuardrailResult JSON roundtrip" <| do
211: let results =
212: [ GuardrailOk,
213: GuardrailCostExceeded 100.0 50.0,
214: GuardrailTokensExceeded 2000 1000,
215: GuardrailDuplicateToolCalls "bash" 5,
216: GuardrailTestFailures 3,
217: GuardrailEditFailures 5
218: ]
219: forM_ results <| \r ->
220: case Aeson.decode (Aeson.encode r) of
221: Nothing -> Test.assertFailure ("Failed to decode GuardrailResult: " <> show r)
222: Just decoded -> decoded Test.@=? r
223: ]
224:
225: data Tool = Tool
226: { toolName :: Text,
227: toolDescription :: Text,
228: toolJsonSchema :: Aeson.Value,
229: toolExecute :: Aeson.Value -> IO Aeson.Value
230: }
231:
232: data ToolApi = ToolApi
233: { toolApiName :: Text,
234: toolApiDescription :: Text,
235: toolApiParameters :: Aeson.Value
236: }
237: deriving (Generic)
238:
239: instance Aeson.ToJSON ToolApi where
240: toJSON t =
241: Aeson.object
242: [ "type" .= ("function" :: Text),
243: "function"
244: .= Aeson.object
245: [ "name" .= toolApiName t,
246: "description" .= toolApiDescription t,
247: "parameters" .= toolApiParameters t
248: ]
249: ]
250:
251: instance Aeson.FromJSON ToolApi where
252: parseJSON =
253: Aeson.withObject "ToolApi" <| \v -> do
254: fn <- v .: "function"
255: (ToolApi </ (fn .: "name"))
256: <*> (fn .: "description")
257: <*> (fn .: "parameters")
258:
259: encodeToolForApi :: Tool -> ToolApi
260: encodeToolForApi t =
261: ToolApi
262: { toolApiName = toolName t,
263: toolApiDescription = toolDescription t,
264: toolApiParameters = toolJsonSchema t
265: }
266:
267: data LLM = LLM
268: { llmBaseUrl :: Text,
269: llmApiKey :: Text,
270: llmModel :: Text,
271: llmExtraHeaders :: [(ByteString, ByteString)]
272: }
273: deriving (Show, Eq, Generic)
274:
275: instance Aeson.ToJSON LLM where
276: toJSON l =
277: Aeson.object
278: [ "llmBaseUrl" .= llmBaseUrl l,
279: "llmApiKey" .= llmApiKey l,
280: "llmModel" .= llmModel l
281: ]
282:
283: instance Aeson.FromJSON LLM where
284: parseJSON =
285: Aeson.withObject "LLM" <| \v ->
286: (LLM </ (v .: "llmBaseUrl"))
287: <*> (v .: "llmApiKey")
288: <*> (v .: "llmModel")
289: <*> pure []
290:
291: defaultLLM :: LLM
292: defaultLLM =
293: LLM
294: { llmBaseUrl = "https://openrouter.ai/api/v1",
295: llmApiKey = "",
296: llmModel = "anthropic/claude-sonnet-4.5",
297: llmExtraHeaders =
298: [ ("HTTP-Referer", "https://omni.dev"),
299: ("X-Title", "Omni Agent")
300: ]
301: }
302:
303: data AgentConfig = AgentConfig
304: { agentModel :: Text,
305: agentTools :: [Tool],
306: agentSystemPrompt :: Text,
307: agentMaxIterations :: Int,
308: agentGuardrails :: Guardrails
309: }
310:
311: data Guardrails = Guardrails
312: { guardrailMaxCostCents :: Double,
313: guardrailMaxTokens :: Int,
314: guardrailMaxDuplicateToolCalls :: Int,
315: guardrailMaxTestFailures :: Int,
316: guardrailMaxEditFailures :: Int
317: }
318: deriving (Show, Eq, Generic)
319:
320: instance Aeson.ToJSON Guardrails
321:
322: instance Aeson.FromJSON Guardrails
323:
324: data GuardrailResult
325: = GuardrailOk
326: | GuardrailCostExceeded Double Double
327: | GuardrailTokensExceeded Int Int
328: | GuardrailDuplicateToolCalls Text Int
329: | GuardrailTestFailures Int
330: | GuardrailEditFailures Int
331: deriving (Show, Eq, Generic)
332:
333: instance Aeson.ToJSON GuardrailResult
334:
335: instance Aeson.FromJSON GuardrailResult
336:
337: defaultGuardrails :: Guardrails
338: defaultGuardrails =
339: Guardrails
340: { guardrailMaxCostCents = 100.0,
341: guardrailMaxTokens = 500000,
342: guardrailMaxDuplicateToolCalls = 3,
343: guardrailMaxTestFailures = 3,
344: guardrailMaxEditFailures = 5
345: }
346:
347: defaultAgentConfig :: AgentConfig
348: defaultAgentConfig =
349: AgentConfig
350: { agentModel = "gpt-4",
351: agentTools = [],
352: agentSystemPrompt = "You are a helpful assistant.",
353: agentMaxIterations = 10,
354: agentGuardrails = defaultGuardrails
355: }
356:
357: data EngineConfig = EngineConfig
358: { engineLLM :: LLM,
359: engineOnCost :: Int -> Double -> IO (),
360: engineOnActivity :: Text -> IO (),
361: engineOnToolCall :: Text -> Text -> IO (),
362: engineOnAssistant :: Text -> IO (),
363: engineOnToolResult :: Text -> Bool -> Text -> IO (),
364: engineOnComplete :: IO (),
365: engineOnError :: Text -> IO (),
366: engineOnGuardrail :: GuardrailResult -> IO ()
367: }
368:
369: defaultEngineConfig :: EngineConfig
370: defaultEngineConfig =
371: EngineConfig
372: { engineLLM = defaultLLM,
373: engineOnCost = \_ _ -> pure (),
374: engineOnActivity = \_ -> pure (),
375: engineOnToolCall = \_ _ -> pure (),
376: engineOnAssistant = \_ -> pure (),
377: engineOnToolResult = \_ _ _ -> pure (),
378: engineOnComplete = pure (),
379: engineOnError = \_ -> pure (),
380: engineOnGuardrail = \_ -> pure ()
381: }
382:
383: data AgentResult = AgentResult
384: { resultFinalMessage :: Text,
385: resultToolCallCount :: Int,
386: resultIterations :: Int,
387: resultTotalCost :: Double,
388: resultTotalTokens :: Int
389: }
390: deriving (Show, Eq, Generic)
391:
392: instance Aeson.ToJSON AgentResult
393:
394: instance Aeson.FromJSON AgentResult
395:
396: data Role = System | User | Assistant | ToolRole
397: deriving (Show, Eq, Generic)
398:
399: instance Aeson.ToJSON Role where
400: toJSON System = Aeson.String "system"
401: toJSON User = Aeson.String "user"
402: toJSON Assistant = Aeson.String "assistant"
403: toJSON ToolRole = Aeson.String "tool"
404:
405: instance Aeson.FromJSON Role where
406: parseJSON = Aeson.withText "Role" parseRole
407: where
408: parseRole "system" = pure System
409: parseRole "user" = pure User
410: parseRole "assistant" = pure Assistant
411: parseRole "tool" = pure ToolRole
412: parseRole _ = empty
413:
414: data Message = Message
415: { msgRole :: Role,
416: msgContent :: Text,
417: msgToolCalls :: Maybe [ToolCall],
418: msgToolCallId :: Maybe Text
419: }
420: deriving (Show, Eq, Generic)
421:
422: instance Aeson.ToJSON Message where
423: toJSON m =
424: Aeson.object
425: <| catMaybes
426: [ Just ("role" .= msgRole m),
427: Just ("content" .= msgContent m),
428: ("tool_calls" .=) </ msgToolCalls m,
429: ("tool_call_id" .=) </ msgToolCallId m
430: ]
431:
432: instance Aeson.FromJSON Message where
433: parseJSON =
434: Aeson.withObject "Message" <| \v ->
435: (Message </ (v .: "role"))
436: <*> (v .:? "content" .!= "")
437: <*> (v .:? "tool_calls")
438: <*> (v .:? "tool_call_id")
439:
440: data ToolCall = ToolCall
441: { tcId :: Text,
442: tcType :: Text,
443: tcFunction :: FunctionCall
444: }
445: deriving (Show, Eq, Generic)
446:
447: instance Aeson.ToJSON ToolCall where
448: toJSON tc =
449: Aeson.object
450: [ "id" .= tcId tc,
451: "type" .= tcType tc,
452: "function" .= tcFunction tc
453: ]
454:
455: instance Aeson.FromJSON ToolCall where
456: parseJSON =
457: Aeson.withObject "ToolCall" <| \v ->
458: (ToolCall </ (v .: "id"))
459: <*> (v .:? "type" .!= "function")
460: <*> (v .: "function")
461:
462: data FunctionCall = FunctionCall
463: { fcName :: Text,
464: fcArguments :: Text
465: }
466: deriving (Show, Eq, Generic)
467:
468: instance Aeson.ToJSON FunctionCall where
469: toJSON fc =
470: Aeson.object
471: [ "name" .= fcName fc,
472: "arguments" .= fcArguments fc
473: ]
474:
475: instance Aeson.FromJSON FunctionCall where
476: parseJSON =
477: Aeson.withObject "FunctionCall" <| \v ->
478: (FunctionCall </ (v .: "name"))
479: <*> (v .: "arguments")
480:
481: data ToolResult = ToolResult
482: { trToolCallId :: Text,
483: trContent :: Text
484: }
485: deriving (Show, Eq, Generic)
486:
487: instance Aeson.ToJSON ToolResult
488:
489: instance Aeson.FromJSON ToolResult
490:
491: data ChatCompletionRequest = ChatCompletionRequest
492: { reqModel :: Text,
493: reqMessages :: [Message],
494: reqTools :: Maybe [ToolApi]
495: }
496: deriving (Generic)
497:
498: instance Aeson.ToJSON ChatCompletionRequest where
499: toJSON r =
500: Aeson.object
501: <| catMaybes
502: [ Just ("model" .= reqModel r),
503: Just ("messages" .= reqMessages r),
504: ("tools" .=) </ reqTools r,
505: Just ("usage" .= Aeson.object ["include" .= True])
506: ]
507:
508: data Choice = Choice
509: { choiceIndex :: Int,
510: choiceMessage :: Message,
511: choiceFinishReason :: Maybe Text
512: }
513: deriving (Show, Eq, Generic)
514:
515: instance Aeson.FromJSON Choice where
516: parseJSON =
517: Aeson.withObject "Choice" <| \v ->
518: (Choice </ (v .: "index"))
519: <*> (v .: "message")
520: <*> (v .:? "finish_reason")
521:
522: data Usage = Usage
523: { usagePromptTokens :: Int,
524: usageCompletionTokens :: Int,
525: usageTotalTokens :: Int,
526: usageCost :: Maybe Double
527: }
528: deriving (Show, Eq, Generic)
529:
530: instance Aeson.FromJSON Usage where
531: parseJSON =
532: Aeson.withObject "Usage" <| \v ->
533: (Usage </ (v .: "prompt_tokens"))
534: <*> (v .: "completion_tokens")
535: <*> (v .: "total_tokens")
536: <*> (v .:? "cost")
537:
538: data ChatCompletionResponse = ChatCompletionResponse
539: { respId :: Text,
540: respChoices :: [Choice],
541: respModel :: Text,
542: respUsage :: Maybe Usage
543: }
544: deriving (Show, Eq, Generic)
545:
546: instance Aeson.FromJSON ChatCompletionResponse where
547: parseJSON =
548: Aeson.withObject "ChatCompletionResponse" <| \v ->
549: (ChatCompletionResponse </ (v .: "id"))
550: <*> (v .: "choices")
551: <*> (v .: "model")
552: <*> (v .:? "usage")
553:
554: data ChatResult = ChatResult
555: { chatMessage :: Message,
556: chatUsage :: Maybe Usage
557: }
558: deriving (Show, Eq)
559:
560: chatWithUsage :: LLM -> [Tool] -> [Message] -> IO (Either Text ChatResult)
561: chatWithUsage llm tools messages = do
562: let url = Text.unpack (llmBaseUrl llm) <> "/chat/completions"
563: req0 <- HTTP.parseRequest url
564: let toolApis = [encodeToolForApi t | not (null tools), t <- tools]
565: body =
566: ChatCompletionRequest
567: { reqModel = llmModel llm,
568: reqMessages = messages,
569: reqTools = if null toolApis then Nothing else Just toolApis
570: }
571: baseReq =
572: HTTP.setRequestMethod "POST"
573: <| HTTP.setRequestHeader "Content-Type" ["application/json"]
574: <| HTTP.setRequestHeader "Authorization" ["Bearer " <> TE.encodeUtf8 (llmApiKey llm)]
575: <| HTTP.setRequestBodyLBS (Aeson.encode body)
576: <| req0
577: req = foldr addHeader baseReq (llmExtraHeaders llm)
578: addHeader (name, value) = HTTP.addRequestHeader (CI.mk name) value
579:
580: response <- HTTP.httpLBS req
581: let status = HTTP.getResponseStatusCode response
582: if status >= 200 && status < 300
583: then case Aeson.decode (HTTP.getResponseBody response) of
584: Just resp ->
585: case respChoices resp of
586: (c : _) -> pure (Right (ChatResult (choiceMessage c) (respUsage resp)))
587: [] -> pure (Left "No choices in response")
588: Nothing -> pure (Left "Failed to parse response")
589: else pure (Left ("HTTP error: " <> tshow status <> " - " <> TE.decodeUtf8 (BL.toStrict (HTTP.getResponseBody response))))
590:
591: chat :: LLM -> [Tool] -> [Message] -> IO (Either Text Message)
592: chat llm tools messages = do
593: result <- chatWithUsage llm tools messages
594: pure (chatMessage </ result)
595:
596: runAgent :: EngineConfig -> AgentConfig -> Text -> IO (Either Text AgentResult)
597: runAgent engineCfg agentCfg userPrompt = do
598: let llm =
599: (engineLLM engineCfg)
600: { llmModel = agentModel agentCfg
601: }
602: tools = agentTools agentCfg
603: toolMap = buildToolMap tools
604: systemMsg = Message System (agentSystemPrompt agentCfg) Nothing Nothing
605: userMsg = Message User userPrompt Nothing Nothing
606: initialMessages = [systemMsg, userMsg]
607:
608: engineOnActivity engineCfg "Starting agent loop"
609: loop llm tools toolMap initialMessages 0 0 0 0.0 Map.empty 0 0
610: where
611: maxIter = agentMaxIterations agentCfg
612: guardrails' = agentGuardrails agentCfg
613:
614: loop ::
615: LLM ->
616: [Tool] ->
617: Map.Map Text Tool ->
618: [Message] ->
619: Int ->
620: Int ->
621: Int ->
622: Double ->
623: Map.Map Text Int ->
624: Int ->
625: Int ->
626: IO (Either Text AgentResult)
627: loop llm tools' toolMap msgs iteration totalCalls totalTokens totalCost toolCallCounts testFailures editFailures
628: | iteration >= maxIter = do
629: let errMsg = "Max iterations (" <> tshow maxIter <> ") reached"
630: engineOnError engineCfg errMsg
631: pure <| Left errMsg
632: | otherwise = do
633: let guardrailViolation = findGuardrailViolation guardrails' totalCost totalTokens toolCallCounts testFailures editFailures
634: case guardrailViolation of
635: Just (g, errMsg) -> do
636: engineOnGuardrail engineCfg g
637: pure <| Left errMsg
638: Nothing -> do
639: engineOnActivity engineCfg <| "Iteration " <> tshow (iteration + 1)
640: result <- chatWithUsage llm tools' msgs
641: case result of
642: Left err -> do
643: engineOnError engineCfg err
644: pure (Left err)
645: Right chatRes -> do
646: let msg = chatMessage chatRes
647: tokens = maybe 0 usageTotalTokens (chatUsage chatRes)
648: cost = case chatUsage chatRes +> usageCost of
649: Just actualCost -> actualCost * 100
650: Nothing -> estimateCost (llmModel llm) tokens
651: engineOnCost engineCfg tokens cost
652: let newTokens = totalTokens + tokens
653: newCost = totalCost + cost
654: let assistantText = msgContent msg
655: unless (Text.null assistantText)
656: <| engineOnAssistant engineCfg assistantText
657: case msgToolCalls msg of
658: Nothing -> do
659: engineOnActivity engineCfg "Agent completed"
660: engineOnComplete engineCfg
661: pure
662: <| Right
663: <| AgentResult
664: { resultFinalMessage = msgContent msg,
665: resultToolCallCount = totalCalls,
666: resultIterations = iteration + 1,
667: resultTotalCost = newCost,
668: resultTotalTokens = newTokens
669: }
670: Just [] -> do
671: engineOnActivity engineCfg "Agent completed (empty tool calls)"
672: engineOnComplete engineCfg
673: pure
674: <| Right
675: <| AgentResult
676: { resultFinalMessage = msgContent msg,
677: resultToolCallCount = totalCalls,
678: resultIterations = iteration + 1,
679: resultTotalCost = newCost,
680: resultTotalTokens = newTokens
681: }
682: Just tcs -> do
683: (toolResults, newTestFailures, newEditFailures) <- executeToolCallsWithTracking engineCfg toolMap tcs testFailures editFailures
684: let newMsgs = msgs <> [msg] <> toolResults
685: newCalls = totalCalls + length tcs
686: newToolCallCounts = updateToolCallCounts toolCallCounts tcs
687: loop llm tools' toolMap newMsgs (iteration + 1) newCalls newTokens newCost newToolCallCounts newTestFailures newEditFailures
688:
689: checkCostGuardrail :: Guardrails -> Double -> GuardrailResult
690: checkCostGuardrail g cost
691: | cost > guardrailMaxCostCents g = GuardrailCostExceeded cost (guardrailMaxCostCents g)
692: | otherwise = GuardrailOk
693:
694: checkTokenGuardrail :: Guardrails -> Int -> GuardrailResult
695: checkTokenGuardrail g tokens
696: | tokens > guardrailMaxTokens g = GuardrailTokensExceeded tokens (guardrailMaxTokens g)
697: | otherwise = GuardrailOk
698:
699: checkDuplicateGuardrail :: Guardrails -> Map.Map Text Int -> GuardrailResult
700: checkDuplicateGuardrail g counts =
701: let maxAllowed = guardrailMaxDuplicateToolCalls g
702: violations = [(name, count) | (name, count) <- Map.toList counts, count >= maxAllowed]
703: in case violations of
704: ((name, count) : _) -> GuardrailDuplicateToolCalls name count
705: [] -> GuardrailOk
706:
707: checkTestFailureGuardrail :: Guardrails -> Int -> GuardrailResult
708: checkTestFailureGuardrail g failures
709: | failures >= guardrailMaxTestFailures g = GuardrailTestFailures failures
710: | otherwise = GuardrailOk
711:
712: checkEditFailureGuardrail :: Guardrails -> Int -> GuardrailResult
713: checkEditFailureGuardrail g failures
714: | failures >= guardrailMaxEditFailures g = GuardrailEditFailures failures
715: | otherwise = GuardrailOk
716:
717: updateToolCallCounts :: Map.Map Text Int -> [ToolCall] -> Map.Map Text Int
718: updateToolCallCounts =
719: foldr (\tc m -> Map.insertWith (+) (fcName (tcFunction tc)) 1 m)
720:
721: findGuardrailViolation :: Guardrails -> Double -> Int -> Map.Map Text Int -> Int -> Int -> Maybe (GuardrailResult, Text)
722: findGuardrailViolation g cost tokens toolCallCounts testFailures editFailures =
723: case checkCostGuardrail g cost of
724: r@(GuardrailCostExceeded actual limit) ->
725: Just (r, "Guardrail: cost budget exceeded (" <> tshow actual <> "/" <> tshow limit <> " cents)")
726: _ -> case checkTokenGuardrail g tokens of
727: r@(GuardrailTokensExceeded actual limit) ->
728: Just (r, "Guardrail: token budget exceeded (" <> tshow actual <> "/" <> tshow limit <> " tokens)")
729: _ -> case checkDuplicateGuardrail g toolCallCounts of
730: r@(GuardrailDuplicateToolCalls tool count) ->
731: Just (r, "Guardrail: duplicate tool calls (" <> tool <> " called " <> tshow count <> " times)")
732: _ -> case checkTestFailureGuardrail g testFailures of
733: r@(GuardrailTestFailures count) ->
734: Just (r, "Guardrail: too many test failures (" <> tshow count <> ")")
735: _ -> case checkEditFailureGuardrail g editFailures of
736: r@(GuardrailEditFailures count) ->
737: Just (r, "Guardrail: too many edit_file failures (" <> tshow count <> " 'old_str not found' errors)")
738: _ -> Nothing
739:
740: buildToolMap :: [Tool] -> Map.Map Text Tool
741: buildToolMap = Map.fromList <. map (\t -> (toolName t, t))
742:
743: -- | Track both test failures and edit failures
744: -- Returns (messages, testFailures, editFailures)
745: executeToolCallsWithTracking :: EngineConfig -> Map.Map Text Tool -> [ToolCall] -> Int -> Int -> IO ([Message], Int, Int)
746: executeToolCallsWithTracking engineCfg toolMap tcs initialTestFailures initialEditFailures = do
747: results <- traverse executeSingle tcs
748: let msgs = map (\(m, _, _) -> m) results
749: testDeltas = map (\(_, t, _) -> t) results
750: editDeltas = map (\(_, _, e) -> e) results
751: totalTestFailures = initialTestFailures + sum testDeltas
752: totalEditFailures = initialEditFailures + sum editDeltas
753: pure (msgs, totalTestFailures, totalEditFailures)
754: where
755: executeSingle tc = do
756: let name = fcName (tcFunction tc)
757: argsText = fcArguments (tcFunction tc)
758: callId = tcId tc
759: engineOnActivity engineCfg <| "Executing tool: " <> name
760: engineOnToolCall engineCfg name argsText
761: case Map.lookup name toolMap of
762: Nothing -> do
763: let errMsg = "Tool not found: " <> name
764: engineOnToolResult engineCfg name False errMsg
765: pure (Message ToolRole errMsg Nothing (Just callId), 0, 0)
766: Just tool -> do
767: case Aeson.decode (BL.fromStrict (TE.encodeUtf8 argsText)) of
768: Nothing -> do
769: let errMsg = "Invalid JSON arguments: " <> argsText
770: engineOnToolResult engineCfg name False errMsg
771: pure (Message ToolRole errMsg Nothing (Just callId), 0, 0)
772: Just args -> do
773: resultValue <- toolExecute tool args
774: let resultText = TE.decodeUtf8 (BL.toStrict (Aeson.encode resultValue))
775: isTestCall = name == "bash" && ("bild --test" `Text.isInfixOf` argsText || "bild -t" `Text.isInfixOf` argsText)
776: isTestFailure = isTestCall && isFailureResult resultValue
777: testDelta = if isTestFailure then 1 else 0
778: isEditFailure = name == "edit_file" && isOldStrNotFoundError resultValue
779: editDelta = if isEditFailure then 1 else 0
780: engineOnToolResult engineCfg name True resultText
781: pure (Message ToolRole resultText Nothing (Just callId), testDelta, editDelta)
782:
783: isFailureResult :: Aeson.Value -> Bool
784: isFailureResult (Aeson.Object obj) =
785: case KeyMap.lookup "exit_code" obj of
786: Just (Aeson.Number n) -> n /= 0
787: _ -> False
788: isFailureResult (Aeson.String s) =
789: "error"
790: `Text.isInfixOf` Text.toLower s
791: || "failed"
792: `Text.isInfixOf` Text.toLower s
793: || "FAILED"
794: `Text.isInfixOf` s
795: isFailureResult _ = False
796:
797: isOldStrNotFoundError :: Aeson.Value -> Bool
798: isOldStrNotFoundError (Aeson.Object obj) =
799: case KeyMap.lookup "error" obj of
800: Just (Aeson.String s) -> "old_str not found" `Text.isInfixOf` s
801: _ -> False
802: isOldStrNotFoundError _ = False
803:
804: -- | Estimate cost in cents from token count
805: estimateCost :: Text -> Int -> Double
806: estimateCost model tokens
807: | "gpt-4o-mini" `Text.isInfixOf` model = fromIntegral tokens * 15 / 1000000
808: | "gpt-4o" `Text.isInfixOf` model = fromIntegral tokens * 250 / 100000
809: | "gpt-4" `Text.isInfixOf` model = fromIntegral tokens * 3 / 100000
810: | "claude" `Text.isInfixOf` model = fromIntegral tokens * 3 / 100000
811: | otherwise = fromIntegral tokens / 100000
Now let me check the task comment again - it mentions t-210 which implements NeedsHelp status. Let me search for how jr tasks are updated:
jr task get t-210 --json 2>/dev/null || echo "Task t-210 not found"
NeedsHelp
./Omni/Task.hs:309: Just "needs-help" -> pure <| Just NeedsHelp
./Omni/Task.hs:371: "needs-help" -> NeedsHelp
./Omni/Task.hs:635: Test.unit "ready tasks exclude NeedsHelp tasks" <| do
./Omni/Task.hs:637: updateTaskStatus (taskId task) NeedsHelp []
./Omni/Jr/Web/Style.hs:587: ".badge-needshelp" ? do
./Omni/Jr/Web/Style.hs:1912: ".badge-needshelp" ? do
./Omni/Task/Core.hs:51:data Status = Draft | Open | InProgress | Review | Approved | Done | NeedsHelp
./Omni/Task/Core.hs:810: /= NeedsHelp
./Omni/Task/Core.hs:926: NeedsHelp -> "[!]"
./Omni/Task/Core.hs:937: NeedsHelp -> yellow statusStr
./Omni/Task/Core.hs:996: NeedsHelp -> yellow s
./Omni/Task/Core.hs:1565: needingHelp = [t | t <- allTasks, taskStatus t == NeedsHelp]
./Omni/Jr/Web.hs:684: TaskCore.NeedsHelp -> ("badge badge-needshelp", "Needs Help")
./Omni/Jr/Web.hs:775: TaskCore.NeedsHelp -> ("badge badge-needshelp status-badge-clickable", "Needs Help")
./Omni/Jr/Web.hs:803: statusOption TaskCore.NeedsHelp currentStatus tid
./Omni/Jr/Web.hs:814: TaskCore.NeedsHelp -> ("badge badge-needshelp", "Needs Help")
./Omni/Jr/Web.hs:1091: needsHelp = TaskCore.tasksNeedingHelp actionItems
./Omni/Jr/Web.hs:1092: totalCount = length failed + length epicsReady + length needsHelp
./Omni/Jr/Web.hs:1111: unless (null needsHelp) <| do
./Omni/Jr/Web.hs:1112: Lucid.h2_ [Lucid.class_ "section-header"] <| Lucid.toHtml ("Needs Help (" <> tshow (length needsHelp) <> ")")
./Omni/Jr/Web.hs:1114: Lucid.div_ [Lucid.class_ "task-list"] <| traverse_ renderTaskCard (sortTasks currentSort needsHelp)
Good! The NeedsHelp status already exists. Now let me check the Worker code to see how it handles agents and where I need to integrate the guardrail → NeedsHelp transition:
runAgent
./Omni/Agent/Worker.hs:337: result <- Engine.runAgent engineCfg agentCfg userPrompt
./Omni/Agent/Engine.hs:38: runAgent,
./Omni/Agent/Engine.hs:596:runAgent :: EngineConfig -> AgentConfig -> Text -> IO (Either Text AgentResult)
./Omni/Agent/Engine.hs:597:runAgent engineCfg agentCfg userPrompt = do
Omni/Agent/Worker.hs
1: {-# LANGUAGE OverloadedStrings #-}
2: {-# LANGUAGE NoImplicitPrelude #-}
3:
4: module Omni.Agent.Worker where
5:
6: import Alpha
7: import qualified Data.Aeson as Aeson
8: import qualified Data.Aeson.Key as AesonKey
9: import qualified Data.ByteString.Lazy as BSL
10: import Data.IORef (modifyIORef', newIORef, readIORef)
11: import qualified Data.List as List
12: import qualified Data.Text as Text
13: import qualified Data.Text.Encoding as TE
14: import qualified Data.Time
15: import qualified Omni.Agent.Core as Core
16: import qualified Omni.Agent.Engine as Engine
17: import qualified Omni.Agent.Log as AgentLog
18: import qualified Omni.Agent.Tools as Tools
19: import qualified Omni.Fact as Fact
20: import qualified Omni.Task.Core as TaskCore
21: import qualified System.Directory as Directory
22: import qualified System.Environment as Env
23: import qualified System.Exit as Exit
24: import System.FilePath ((</>))
25: import qualified System.Process as Process
26:
27: start :: Core.Worker -> Maybe Text -> IO ()
28: start worker maybeTaskId = do
29: if Core.workerQuiet worker
30: then putText ("[worker] Starting for " <> Core.workerName worker)
31: else do
32: AgentLog.init (Core.workerName worker)
33: AgentLog.log ("[worker] Starting for " <> Core.workerName worker)
34: case maybeTaskId of
35: Just tid -> logMsg worker ("[worker] Target task: " <> tid)
36: Nothing -> logMsg worker "[worker] No specific task, will pick from ready queue"
37: runOnce worker maybeTaskId
38:
39: -- | Log message respecting quiet mode
40: logMsg :: Core.Worker -> Text -> IO ()
41: logMsg worker msg =
42: if Core.workerQuiet worker
43: then putText msg
44: else AgentLog.log msg
45:
46: -- | Convert key-value pairs to JSON metadata string
47: toMetadata :: [(Text, Text)] -> Text
48: toMetadata pairs =
49: let obj = Aeson.object [(AesonKey.fromText k, Aeson.String v) | (k, v) <- pairs]
50: in TE.decodeUtf8 (BSL.toStrict (Aeson.encode obj))
51:
52: -- | Format guardrail result for logging
53: formatGuardrailResult :: Engine.GuardrailResult -> Text
54: formatGuardrailResult Engine.GuardrailOk = "OK"
55: formatGuardrailResult (Engine.GuardrailCostExceeded actual limit) =
56: "Cost exceeded: " <> tshow actual <> " cents (limit: " <> tshow limit <> ")"
57: formatGuardrailResult (Engine.GuardrailTokensExceeded actual limit) =
58: "Tokens exceeded: " <> tshow actual <> " (limit: " <> tshow limit <> ")"
59: formatGuardrailResult (Engine.GuardrailDuplicateToolCalls tool count) =
60: "Duplicate tool calls: " <> tool <> " called " <> tshow count <> " times"
61: formatGuardrailResult (Engine.GuardrailTestFailures count) =
62: "Test failures: " <> tshow count <> " failures"
63: formatGuardrailResult (Engine.GuardrailEditFailures count) =
64: "Edit failures: " <> tshow count <> " 'old_str not found' errors"
65:
66: runOnce :: Core.Worker -> Maybe Text -> IO ()
67: runOnce worker maybeTaskId = do
68: -- Find work
69: targetTask <- case maybeTaskId of
70: Just tid -> do
71: TaskCore.findTask tid </ TaskCore.loadTasks
72: Nothing -> do
73: readyTasks <- TaskCore.getReadyTasks
74: case readyTasks of
75: [] -> pure Nothing
76: (task : _) -> pure (Just task)
77:
78: case targetTask of
79: Nothing -> do
80: case maybeTaskId of
81: Just tid -> do
82: unless (Core.workerQuiet worker) <| AgentLog.updateActivity ("Task " <> tid <> " not found.")
83: logMsg worker ("[worker] Task " <> tid <> " not found.")
84: Nothing -> do
85: unless (Core.workerQuiet worker) <| AgentLog.updateActivity "No work found."
86: logMsg worker "[worker] No ready tasks found."
87: Just task -> do
88: processTask worker task
89:
90: processTask :: Core.Worker -> TaskCore.Task -> IO ()
91: processTask worker task = do
92: let repo = Core.workerPath worker
93: let tid = TaskCore.taskId task
94: let quiet = Core.workerQuiet worker
95: let say = logMsg worker
96:
97: unless quiet <| AgentLog.update (\s -> s {AgentLog.statusTask = Just tid})
98: say ("[worker] Claiming task " <> tid)
99:
100: -- Claim task
101: TaskCore.logActivity tid TaskCore.Claiming Nothing
102: TaskCore.updateTaskStatusWithActor tid TaskCore.InProgress [] TaskCore.Junior
103: say "[worker] Status -> InProgress"
104:
105: -- Run agent with timing
106: startTime <- Data.Time.getCurrentTime
107: activityId <- TaskCore.logActivityWithMetrics tid TaskCore.Running Nothing Nothing (Just startTime) Nothing Nothing Nothing
108:
109: say "[worker] Starting engine..."
110: (exitCode, output, costCents) <- runWithEngine worker repo task
111:
112: endTime <- Data.Time.getCurrentTime
113: say ("[worker] Agent exited with: " <> tshow exitCode)
114:
115: -- Update the activity record with metrics (convert Double to Int by rounding)
116: TaskCore.updateActivityMetrics activityId Nothing (Just endTime) (Just (round costCents)) Nothing
117:
118: case exitCode of
119: Exit.ExitSuccess -> do
120: TaskCore.logActivity tid TaskCore.Reviewing Nothing
121: say "[worker] Running formatters..."
122: _ <- runFormatters repo
123:
124: -- Try to commit (this runs git hooks which may fail)
125: let commitMsg = formatCommitMessage task output
126: say "[worker] Attempting commit..."
127: commitResult <- tryCommit repo commitMsg
128:
129: case commitResult of
130: CommitFailed commitErr -> do
131: say ("[worker] Commit failed: " <> commitErr)
132:
133: -- Save failure context and reopen task for retry
134: maybeCtx <- TaskCore.getRetryContext tid
135: let attempt = maybe 1 (\c -> TaskCore.retryAttempt c + 1) maybeCtx
136:
137: if attempt > 3
138: then do
139: say "[worker] Task failed 3 times, needs human intervention"
140: TaskCore.logActivity tid TaskCore.Failed (Just (toMetadata [("reason", "max_retries_exceeded")]))
141: TaskCore.updateTaskStatusWithActor tid TaskCore.Open [] TaskCore.Junior
142: else do
143: let currentReason = "attempt " <> tshow attempt <> ": commit_failed: " <> commitErr
144: let accumulatedReason = case maybeCtx of
145: Nothing -> currentReason
146: Just ctx -> TaskCore.retryReason ctx <> "\n" <> currentReason
147: TaskCore.setRetryContext
148: TaskCore.RetryContext
149: { TaskCore.retryTaskId = tid,
150: TaskCore.retryOriginalCommit = "",
151: TaskCore.retryConflictFiles = [],
152: TaskCore.retryAttempt = attempt,
153: TaskCore.retryReason = accumulatedReason,
154: TaskCore.retryNotes = maybeCtx +> TaskCore.retryNotes
155: }
156: TaskCore.logActivity tid TaskCore.Retrying (Just (toMetadata [("attempt", tshow attempt)]))
157: TaskCore.updateTaskStatusWithActor tid TaskCore.Open [] TaskCore.Junior
158: say ("[worker] Task reopened (attempt " <> tshow attempt <> "/3)")
159: NoChanges -> do
160: -- No changes = task already implemented, mark as Done
161: say "[worker] No changes to commit - task already done"
162: TaskCore.clearRetryContext tid
163: TaskCore.logActivity tid TaskCore.Completed (Just (toMetadata [("result", "no_changes")]))
164: TaskCore.updateTaskStatusWithActor tid TaskCore.Done [] TaskCore.Junior
165: say ("[worker] ✓ Task " <> tid <> " -> Done (no changes)")
166: unless quiet <| AgentLog.update (\s -> s {AgentLog.statusTask = Nothing})
167: CommitSuccess -> do
168: -- Commit succeeded, set to Review
169: TaskCore.logActivity tid TaskCore.Completed (Just (toMetadata [("result", "committed")]))
170: TaskCore.updateTaskStatusWithActor tid TaskCore.Review [] TaskCore.Junior
171: say ("[worker] ✓ Task " <> tid <> " -> Review")
172: unless quiet <| AgentLog.update (\s -> s {AgentLog.statusTask = Nothing})
173: Exit.ExitFailure code -> do
174: say ("[worker] Engine failed with code " <> tshow code)
175: TaskCore.logActivity tid TaskCore.Failed (Just (toMetadata [("exit_code", tshow code)]))
176: -- Don't set back to Open here - leave in InProgress for debugging
177: say "[worker] Task left in InProgress (engine failure)"
178:
179: -- | Run lint --fix to format and fix lint issues
180: runFormatters :: FilePath -> IO (Either Text ())
181: runFormatters repo = do
182: let cmd = (Process.proc "lint" ["--fix"]) {Process.cwd = Just repo}
183: (code, _, _) <- Process.readCreateProcessWithExitCode cmd ""
184: case code of
185: Exit.ExitSuccess -> pure (Right ())
186: Exit.ExitFailure _ -> pure (Right ()) -- lint --fix may exit non-zero but still fix things
187:
188: data CommitResult = CommitSuccess | NoChanges | CommitFailed Text
189: deriving (Show, Eq)
190:
191: -- | Try to commit, returning result
192: tryCommit :: FilePath -> Text -> IO CommitResult
193: tryCommit repo msg = do
194: -- Stage all changes
195: let addCmd = (Process.proc "git" ["add", "."]) {Process.cwd = Just repo}
196: (addCode, _, addErr) <- Process.readCreateProcessWithExitCode addCmd ""
197: case addCode of
198: Exit.ExitFailure _ -> pure <| CommitFailed (Text.pack addErr)
199: Exit.ExitSuccess -> do
200: -- Check for changes
201: let checkCmd = (Process.proc "git" ["diff", "--cached", "--quiet"]) {Process.cwd = Just repo}
202: (checkCode, _, _) <- Process.readCreateProcessWithExitCode checkCmd ""
203: case checkCode of
204: Exit.ExitSuccess -> pure NoChanges
205: Exit.ExitFailure 1 -> do
206: -- There are changes, commit them
207: let commitCmd = (Process.proc "git" ["commit", "-m", Text.unpack msg]) {Process.cwd = Just repo}
208: (commitCode, _, commitErr) <- Process.readCreateProcessWithExitCode commitCmd ""
209: case commitCode of
210: Exit.ExitSuccess -> pure CommitSuccess
211: Exit.ExitFailure _ -> pure <| CommitFailed (Text.pack commitErr)
212: Exit.ExitFailure c -> pure <| CommitFailed ("git diff failed with code " <> tshow c)
213:
214: -- | Run task using native Engine
215: -- Returns (ExitCode, output text, cost in cents)
216: runWithEngine :: Core.Worker -> FilePath -> TaskCore.Task -> IO (Exit.ExitCode, Text, Double)
217: runWithEngine worker repo task = do
218: -- Read API key from environment
219: maybeApiKey <- Env.lookupEnv "OPENROUTER_API_KEY"
220: case maybeApiKey of
221: Nothing -> pure (Exit.ExitFailure 1, "OPENROUTER_API_KEY not set", 0)
222: Just apiKey -> do
223: -- Check for retry context
224: maybeRetry <- TaskCore.getRetryContext (TaskCore.taskId task)
225:
226: -- Get progress from database (checkpoint events from previous sessions)
227: progressContent <- TaskCore.getProgressSummary (TaskCore.taskId task)
228:
229: -- Build the full prompt
230: let ns = fromMaybe "." (TaskCore.taskNamespace task)
231: let basePrompt = buildBasePrompt task ns repo
232:
233: -- Add progress context if present
234: let progressPrompt = buildProgressPrompt progressContent
235:
236: -- Add retry context if present
237: let retryPrompt = buildRetryPrompt maybeRetry
238:
239: let prompt = basePrompt <> progressPrompt <> retryPrompt
240:
241: -- Read AGENTS.md
242: agentsMd <-
243: fmap (fromMaybe "") <| do
244: exists <- Directory.doesFileExist (repo </> "AGENTS.md")
245: if exists
246: then Just </ readFile (repo </> "AGENTS.md")
247: else pure Nothing
248:
249: -- Get relevant facts from the knowledge base
250: relevantFacts <- getRelevantFacts task
251: let factsSection = formatFacts relevantFacts
252:
253: -- Build system prompt
254: let systemPrompt =
255: prompt
256: <> "\n\nREPOSITORY GUIDELINES (AGENTS.md):\n"
257: <> agentsMd
258: <> factsSection
259:
260: -- Build user prompt from task comments
261: let userPrompt = formatTask task
262:
263: -- Select model based on task complexity (simple heuristic)
264: let model = selectModel task
265:
266: -- Generate session ID for event logging
267: sessionId <- TaskCore.generateSessionId
268: let tid = TaskCore.taskId task
269:
270: -- Helper to log events to DB
271: -- For text content, store as-is; for structured data, JSON-encode
272: let logJuniorEvent eventType content = TaskCore.insertAgentEvent tid sessionId eventType content TaskCore.Junior
273: logJuniorJson eventType value = do
274: let contentJson = TE.decodeUtf8 (BSL.toStrict (Aeson.encode value))
275: TaskCore.insertAgentEvent tid sessionId eventType contentJson TaskCore.Junior
276: logSystemEvent eventType content = TaskCore.insertAgentEvent tid sessionId eventType content TaskCore.System
277:
278: -- Build Engine config with callbacks
279: totalCostRef <- newIORef (0 :: Double)
280: let quiet = Core.workerQuiet worker
281: sayLog msg = if quiet then putText msg else AgentLog.log msg
282: engineCfg =
283: Engine.EngineConfig
284: { Engine.engineLLM =
285: Engine.defaultLLM
286: { Engine.llmApiKey = Text.pack apiKey
287: },
288: Engine.engineOnCost = \tokens cost -> do
289: modifyIORef' totalCostRef (+ cost)
290: sayLog <| "Cost: " <> tshow cost <> " cents (" <> tshow tokens <> " tokens)"
291: logJuniorJson "Cost" (Aeson.object [("tokens", Aeson.toJSON tokens), ("cents", Aeson.toJSON cost)]),
292: Engine.engineOnActivity = \activity -> do
293: sayLog <| "[engine] " <> activity,
294: Engine.engineOnToolCall = \toolName args -> do
295: sayLog <| "[tool] " <> toolName
296: logJuniorEvent "ToolCall" (toolName <> ": " <> args),
297: Engine.engineOnAssistant = \msg -> do
298: sayLog <| "[assistant] " <> Text.take 200 msg
299: logJuniorEvent "Assistant" msg,
300: Engine.engineOnToolResult = \toolName success output -> do
301: let statusStr = if success then "ok" else "failed"
302: sayLog <| "[result] " <> toolName <> " (" <> statusStr <> "): " <> Text.take 100 output
303: logJuniorEvent "ToolResult" output,
304: Engine.engineOnComplete = do
305: sayLog "[engine] Complete"
306: logJuniorEvent "Complete" "",
307: Engine.engineOnError = \err -> do
308: sayLog <| "[error] " <> err
309: logJuniorEvent "Error" err,
310: Engine.engineOnGuardrail = \guardrailResult -> do
311: let guardrailMsg = formatGuardrailResult guardrailResult
312: contentJson = TE.decodeUtf8 (BSL.toStrict (Aeson.encode guardrailResult))
313: sayLog <| "[guardrail] " <> guardrailMsg
314: logSystemEvent "Guardrail" contentJson
315: }
316:
317: -- Build Agent config with guardrails (scale cost by complexity)
318: let baseCost = selectCostByComplexity (TaskCore.taskComplexity task)
319: guardrails =
320: Engine.Guardrails
321: { Engine.guardrailMaxCostCents = baseCost,
322: Engine.guardrailMaxTokens = 2000000,
323: Engine.guardrailMaxDuplicateToolCalls = 30,
324: Engine.guardrailMaxTestFailures = 3,
325: Engine.guardrailMaxEditFailures = 5
326: }
327: agentCfg =
328: Engine.AgentConfig
329: { Engine.agentModel = model,
330: Engine.agentTools = Tools.allTools,
331: Engine.agentSystemPrompt = systemPrompt,
332: Engine.agentMaxIterations = 100,
333: Engine.agentGuardrails = guardrails
334: }
335:
336: -- Run the agent
337: result <- Engine.runAgent engineCfg agentCfg userPrompt
338: totalCost <- readIORef totalCostRef
339:
340: case result of
341: Left err -> pure (Exit.ExitFailure 1, "Engine error: " <> err, totalCost)
342: Right agentResult -> do
343: let output = Engine.resultFinalMessage agentResult
344: pure (Exit.ExitSuccess, output, totalCost)
345:
346: -- | Build the base prompt for the agent
347: buildBasePrompt :: TaskCore.Task -> Text -> FilePath -> Text
348: buildBasePrompt task ns repo =
349: "You are an autonomous Worker Agent.\n"
350: <> "Your goal is to implement the following task:\n\n"
351: <> formatTask task
352: <> "\n\nCRITICAL INSTRUCTIONS:\n"
353: <> "1. Read AGENTS.md first to understand the codebase conventions.\n"
354: <> "2. Complete ONE logical change (e.g., update schema + call sites + tests).\n"
355: <> "3. Run 'bild --test "
356: <> ns
357: <> "' ONCE after implementing.\n"
358: <> "4. **CRITICAL**: If tests pass, STOP IMMEDIATELY. Do not verify, do not review, do not trace logic, do not search for usages. Just stop.\n"
359: <> "5. If tests fail, fix the issue and run tests again.\n"
360: <> "6. If tests fail 3 times on the same issue, STOP - the task will be marked for human review.\n"
361: <> "7. Do NOT update task status or manage git - the worker handles that.\n"
362: <> "8. After tests pass, ANY further tool calls are wasted money. The worker will commit your changes.\n\n"
363: <> "AUTONOMOUS OPERATION (NO HUMAN IN LOOP):\n"
364: <> "- You are running autonomously without human intervention\n"
365: <> "- There is NO human to ask questions or get clarification from\n"
366: <> "- Make reasonable decisions based on the task description\n"
367: <> "- If something is truly ambiguous, implement the most straightforward interpretation\n"
368: <> "- Guardrails will stop you if you exceed cost/token budgets or make repeated mistakes\n\n"
369: <> "BUILD SYSTEM NOTES:\n"
370: <> "- 'bild --test "
371: <> ns
372: <> "' tests ALL dependencies transitively - run it ONCE, not per-file\n"
373: <> "- Do NOT run bild --test on individual files separately\n"
374: <> "- Once tests pass, STOP IMMEDIATELY - no verification, no double-checking, no 'one more look'\n"
375: <> "- Use 'lint --fix' for formatting issues (not hlint directly)\n\n"
376: <> "EFFICIENCY REQUIREMENTS:\n"
377: <> "- Do not repeat the same action multiple times\n"
378: <> "- Do not re-run passing tests\n"
379: <> "- Do not test files individually when namespace test covers them\n"
380: <> "- Aim to complete the task in under 50 tool calls\n\n"
381: <> "LARGE FILE HANDLING:\n"
382: <> "- When reading large files (>500 lines), use line ranges to read only relevant sections\n"
383: <> "- For edit_file, use minimal unique context - just enough lines to match uniquely\n"
384: <> "- If edit_file fails with 'old_str not found', re-read the exact lines you need to edit\n"
385: <> "- After 2-3 failed edits on the same file, STOP and reconsider your approach\n"
386: <> "- Very large files (>2000 lines) may need refactoring - note this for human review\n\n"
387: <> "Context:\n"
388: <> "- Working directory: "
389: <> Text.pack repo
390: <> "\n"
391: <> "- Namespace: "
392: <> ns
393: <> "\n"
394:
395: -- | Build progress context prompt
396: buildProgressPrompt :: Maybe Text -> Text
397: buildProgressPrompt Nothing = ""
398: buildProgressPrompt (Just progress) =
399: "\n\n## PROGRESS FROM PREVIOUS SESSIONS (from database)\n\n"
400: <> "This task has been worked on before. Here are the checkpoint notes:\n\n"
401: <> progress
402: <> "\n\n"
403: <> "IMPORTANT:\n"
404: <> "- Review these checkpoints to understand what's already done\n"
405: <> "- Do NOT repeat work that's already completed\n"
406: <> "- If the task appears complete, verify tests pass and exit\n\n"
407:
408: -- | Build retry context prompt
409: buildRetryPrompt :: Maybe TaskCore.RetryContext -> Text
410: buildRetryPrompt Nothing = ""
411: buildRetryPrompt (Just ctx) =
412: "\n\n## RETRY CONTEXT (IMPORTANT)\n\n"
413: <> "This task was previously attempted but failed. Attempt: "
414: <> tshow (TaskCore.retryAttempt ctx)
415: <> "/3\n"
416: <> "Reason: "
417: <> TaskCore.retryReason ctx
418: <> "\n\n"
419: <> ( if null (TaskCore.retryConflictFiles ctx)
420: then ""
421: else
422: "Conflicting files from previous attempt:\n"
423: <> Text.unlines (map (" - " <>) (TaskCore.retryConflictFiles ctx))
424: <> "\n"
425: )
426: <> "Original commit: "
427: <> TaskCore.retryOriginalCommit ctx
428: <> "\n\n"
429: <> maybe "" (\notes -> "## HUMAN NOTES/GUIDANCE\n\n" <> notes <> "\n\n") (TaskCore.retryNotes ctx)
430: <> "INSTRUCTIONS FOR RETRY:\n"
431: <> "- The codebase has changed since your last attempt\n"
432: <> "- Re-implement this task on top of the CURRENT codebase\n"
433: <> "- If there were merge conflicts, the conflicting files may have been modified by others\n"
434: <> "- Review the current state of those files before making changes\n"
435:
436: -- | Select model based on task complexity (1-5 scale)
437: -- Uses OpenRouter model identifiers for Claude models
438: selectModel :: TaskCore.Task -> Text
439: selectModel task = selectModelByComplexity (TaskCore.taskComplexity task)
440:
441: -- | Select model based on complexity level
442: selectModelByComplexity :: Maybe Int -> Text
443: selectModelByComplexity Nothing = "anthropic/claude-sonnet-4.5"
444: selectModelByComplexity (Just 1) = "anthropic/claude-haiku-4.5"
445: selectModelByComplexity (Just 2) = "anthropic/claude-haiku-4.5"
446: selectModelByComplexity (Just 3) = "anthropic/claude-sonnet-4.5"
447: selectModelByComplexity (Just 4) = "anthropic/claude-sonnet-4.5"
448: selectModelByComplexity (Just 5) = "anthropic/claude-opus-4.5"
449: selectModelByComplexity (Just _) = "anthropic/claude-sonnet-4.5"
450:
451: -- | Select cost guardrail based on complexity level (in cents)
452: -- Lower complexity = lower budget, higher complexity = more room for iteration
453: selectCostByComplexity :: Maybe Int -> Double
454: selectCostByComplexity Nothing = 200.0
455: selectCostByComplexity (Just 1) = 50.0
456: selectCostByComplexity (Just 2) = 100.0
457: selectCostByComplexity (Just 3) = 200.0
458: selectCostByComplexity (Just 4) = 400.0
459: selectCostByComplexity (Just 5) = 600.0
460: selectCostByComplexity (Just _) = 200.0
461:
462: formatTask :: TaskCore.Task -> Text
463: formatTask t =
464: "Task: "
465: <> TaskCore.taskId t
466: <> "\n"
467: <> "Title: "
468: <> TaskCore.taskTitle t
469: <> "\n"
470: <> "Type: "
471: <> Text.pack (show (TaskCore.taskType t))
472: <> "\n"
473: <> "Status: "
474: <> Text.pack (show (TaskCore.taskStatus t))
475: <> "\n"
476: <> "Priority: "
477: <> Text.pack (show (TaskCore.taskPriority t))
478: <> "\n"
479: <> maybe "" (\p -> "Parent: " <> p <> "\n") (TaskCore.taskParent t)
480: <> maybe "" (\ns -> "Namespace: " <> ns <> "\n") (TaskCore.taskNamespace t)
481: <> "Created: "
482: <> Text.pack (show (TaskCore.taskCreatedAt t))
483: <> "\n"
484: <> "Updated: "
485: <> Text.pack (show (TaskCore.taskUpdatedAt t))
486: <> "\n"
487: <> (if Text.null (TaskCore.taskDescription t) then "" else "Description:\n" <> TaskCore.taskDescription t <> "\n\n")
488: <> formatDeps (TaskCore.taskDependencies t)
489: <> formatComments (TaskCore.taskComments t)
490: where
491: formatDeps [] = ""
492: formatDeps deps = "\nDependencies:\n" <> Text.unlines (map formatDep deps)
493: formatDep dep = " - " <> TaskCore.depId dep <> " [" <> Text.pack (show (TaskCore.depType dep)) <> "]"
494: formatComments [] = ""
495: formatComments cs = "\nComments/Notes:\n" <> Text.unlines (map formatComment cs)
496: formatComment c = " [" <> Text.pack (show (TaskCore.commentCreatedAt c)) <> "] " <> TaskCore.commentText c
497:
498: formatCommitMessage :: TaskCore.Task -> Text -> Text
499: formatCommitMessage task agentOutput =
500: let tid = TaskCore.taskId task
501: subject = cleanSubject (TaskCore.taskTitle task)
502: body = cleanBody agentOutput
503: in if Text.null body
504: then subject <> "\n\nTask-Id: " <> tid
505: else subject <> "\n\n" <> body <> "\n\nTask-Id: " <> tid
506: where
507: cleanSubject s =
508: let trailingPunct = ['.', ':', '!', '?', ',', ';', ' ', '-']
509: stripped = Text.dropWhileEnd (`elem` trailingPunct) s
510: truncated = Text.take 72 stripped
511: noPunct = Text.dropWhileEnd (`elem` trailingPunct) truncated
512: capitalized = case Text.uncons noPunct of
513: Just (c, rest) -> Text.cons (toUpper c) rest
514: Nothing -> noPunct
515: in capitalized
516:
517: cleanBody :: Text -> Text
518: cleanBody output =
519: let stripped = Text.strip output
520: in if Text.null stripped
521: then ""
522: else
523: let lns = Text.lines stripped
524: cleaned = [Text.take 72 ln | ln <- lns]
525: in Text.intercalate "\n" cleaned
526:
527: -- | Get facts relevant to a task based on namespace/project
528: getRelevantFacts :: TaskCore.Task -> IO [TaskCore.Fact]
529: getRelevantFacts task = do
530: let namespace = fromMaybe "Omni" (TaskCore.taskNamespace task)
531: projectFacts <- Fact.getFactsByProject namespace
532: let sorted = List.sortBy (comparing (Down <. TaskCore.factConfidence)) projectFacts
533: pure (take 10 sorted)
534:
535: -- | Format facts for inclusion in the prompt
536: formatFacts :: [TaskCore.Fact] -> Text
537: formatFacts [] = ""
538: formatFacts facts =
539: Text.unlines
540: [ "\n\nKNOWLEDGE BASE FACTS:",
541: "(These are learned patterns/conventions from previous work)",
542: ""
543: ]
544: <> Text.unlines (map formatFact facts)
545:
546: -- | Format a single fact for the prompt
547: formatFact :: TaskCore.Fact -> Text
548: formatFact f =
549: "- "
550: <> TaskCore.factContent f
551: <> ( if null (TaskCore.factRelatedFiles f)
552: then ""
553: else " [" <> Text.intercalate ", " (TaskCore.factRelatedFiles f) <> "]"
554: )
Now I understand the flow. When runAgent returns Left err (which includes guardrail violations), the Worker currently treats it as an engine failure and leaves the task in InProgress. According to the task description and comment, when a guardrail fires, the task should instead be set to NeedsHelp status with a comment explaining what happened.
Let me check what the error messages look like from guardrails:
Omni/Agent/Engine.hs @721-738
721: findGuardrailViolation :: Guardrails -> Double -> Int -> Map.Map Text Int -> Int -> Int -> Maybe (GuardrailResult, Text)
722: findGuardrailViolation g cost tokens toolCallCounts testFailures editFailures =
723: case checkCostGuardrail g cost of
724: r@(GuardrailCostExceeded actual limit) ->
725: Just (r, "Guardrail: cost budget exceeded (" <> tshow actual <> "/" <> tshow limit <> " cents)")
726: _ -> case checkTokenGuardrail g tokens of
727: r@(GuardrailTokensExceeded actual limit) ->
728: Just (r, "Guardrail: token budget exceeded (" <> tshow actual <> "/" <> tshow limit <> " tokens)")
729: _ -> case checkDuplicateGuardrail g toolCallCounts of
730: r@(GuardrailDuplicateToolCalls tool count) ->
731: Just (r, "Guardrail: duplicate tool calls (" <> tool <> " called " <> tshow count <> " times)")
732: _ -> case checkTestFailureGuardrail g testFailures of
733: r@(GuardrailTestFailures count) ->
734: Just (r, "Guardrail: too many test failures (" <> tshow count <> ")")
735: _ -> case checkEditFailureGuardrail g editFailures of
736: r@(GuardrailEditFailures count) ->
737: Just (r, "Guardrail: too many edit_file failures (" <> tshow count <> " 'old_str not found' errors)")
738: _ -> Nothing
Good! So when a guardrail fires, runAgent returns Left errMsg where errMsg starts with "Guardrail: ".
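A minimal sketch of how that prefix could be used to recognise a guardrail stop and pull out the reason text. The helper name `guardrailReason` is hypothetical (it is not in the codebase), and the sketch uses the standard Prelude rather than the project's `Alpha` prelude:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import qualified Data.Text as Text

-- | Hypothetical helper: if an engine error message was produced by a
-- guardrail, return the message with the "Guardrail: " prefix removed.
-- Assumes the prefix is stable, as seen in findGuardrailViolation.
guardrailReason :: Text -> Maybe Text
guardrailReason = Text.stripPrefix "Guardrail: "

main :: IO ()
main = do
  -- A raw guardrail message, as built by findGuardrailViolation.
  print (guardrailReason "Guardrail: too many test failures (3)")
  -- Once the worker has wrapped the message, the prefix no longer matches.
  print (guardrailReason "Engine error: Guardrail: too many test failures (3)")
```

Note that runWithEngine prepends "Engine error: " to the message, so a check like this would have to run on the raw `Left err` value before that wrapping is applied.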
Now let me check the Worker code around line 340 where it handles the result:
Omni/Agent/Worker.hs @336-344
336: -- Run the agent
337: result <- Engine.runAgent engineCfg agentCfg userPrompt
338: totalCost <- readIORef totalCostRef
339:
340: case result of
341: Left err -> pure (Exit.ExitFailure 1, "Engine error: " <> err, totalCost)
342: Right agentResult -> do
343: let output = Engine.resultFinalMessage agentResult
344: pure (Exit.ExitSuccess, output, totalCost)
And then back in processTask:
Omni/Agent/Worker.hs @118-178
118: case exitCode of
119: Exit.ExitSuccess -> do
120: TaskCore.logActivity tid TaskCore.Reviewing Nothing
121: say "[worker] Running formatters..."
122: _ <- runFormatters repo
123:
124: -- Try to commit (this runs git hooks which may fail)
125: let commitMsg = formatCommitMessage task output
126: say "[worker] Attempting commit..."
127: commitResult <- tryCommit repo commitMsg
128:
129: case commitResult of
130: CommitFailed commitErr -> do
131: say ("[worker] Commit failed: " <> commitErr)
132:
133: -- Save failure context and reopen task for retry
134: maybeCtx <- TaskCore.getRetryContext tid
135: let attempt = maybe 1 (\c -> TaskCore.retryAttempt c + 1) maybeCtx
136:
137: if attempt > 3
138: then do
139: say "[worker] Task failed 3 times, needs human intervention"
140: TaskCore.logActivity tid TaskCore.Failed (Just (toMetadata [("reason", "max_retries_exceeded")]))
141: TaskCore.updateTaskStatusWithActor tid TaskCore.Open [] TaskCore.Junior
142: else do
143: let currentReason = "attempt " <> tshow attempt <> ": commit_failed: " <> commitErr
144: let accumulatedReason = case maybeCtx of
145: Nothing -> currentReason
146: Just ctx -> TaskCore.retryReason ctx <> "\n" <> currentReason
147: TaskCore.setRetryContext
148: TaskCore.RetryContext
149: { TaskCore.retryTaskId = tid,
150: TaskCore.retryOriginalCommit = "",
151: TaskCore.retryConflictFiles = [],
152: TaskCore.retryAttempt = attempt,
153: TaskCore.retryReason = accumulatedReason,
154: TaskCore.retryNotes = maybeCtx +> TaskCore.retryNotes
155: }
156: TaskCore.logActivity tid TaskCore.Retrying (Just (toMetadata [("attempt", tshow attempt)]))
157: TaskCore.updateTaskStatusWithActor tid TaskCore.Open [] TaskCore.Junior
158: say ("[worker] Task reopened (attempt " <> tshow attempt <> "/3)")
159: NoChanges -> do
160: -- No changes = task already implemented, mark as Done
161: say "[worker] No changes to commit - task already done"
162: TaskCore.clearRetryContext tid
163: TaskCore.logActivity tid TaskCore.Completed (Just (toMetadata [("result", "no_changes")]))
164: TaskCore.updateTaskStatusWithActor tid TaskCore.Done [] TaskCore.Junior
165: say ("[worker] ✓ Task " <> tid <> " -> Done (no changes)")
166: unless quiet <| AgentLog.update (\s -> s {AgentLog.statusTask = Nothing})
167: CommitSuccess -> do
168: -- Commit succeeded, set to Review
169: TaskCore.logActivity tid TaskCore.Completed (Just (toMetadata [("result", "committed")]))
170: TaskCore.updateTaskStatusWithActor tid TaskCore.Review [] TaskCore.Junior
171: say ("[worker] ✓ Task " <> tid <> " -> Review")
172: unless quiet <| AgentLog.update (\s -> s {AgentLog.statusTask = Nothing})
173: Exit.ExitFailure code -> do
174: say ("[worker] Engine failed with code " <> tshow code)
175: TaskCore.logActivity tid TaskCore.Failed (Just (toMetadata [("exit_code", tshow code)]))
176: -- Don't set back to Open here - leave in InProgress for debugging
177: say "[worker] Task left in InProgress (engine failure)"
178:
I can see the issue. When Exit.ExitFailure occurs (line 173), the task is left in InProgress. But we need to distinguish between:
1. Guardrail violations → should set to NeedsHelp
2. Other engine errors → should stay in InProgress (current behavior)
The solution is to:
1. Change runWithEngine to return a distinguishable result for guardrail violations
2. Update processTask to handle guardrail violations by setting status to NeedsHelp with a comment
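A minimal sketch of step 1, assuming the only way to recognise a guardrail failure is the "Guardrail: " prefix on the `Left` message seen earlier. The constructor names match the `EngineResult` type that appears later in this session; `classify` is a hypothetical helper, and `String` stands in for the project's `Text` so the example needs only `base`.

```haskell
import Data.List (isPrefixOf)

-- Three-way result: success, guardrail stop, or any other engine failure.
-- The Double is the accumulated cost, as in the worker code.
data EngineResult
  = EngineSuccess String Double
  | EngineGuardrailViolation String Double
  | EngineError String Double
  deriving (Eq, Show)

-- | Hypothetical helper: turn the engine's Either into an EngineResult.
-- Assumes guardrail errors always start with "Guardrail: ".
classify :: Double -> Either String String -> EngineResult
classify cost (Right output) = EngineSuccess output cost
classify cost (Left err)
  | "Guardrail: " `isPrefixOf` err = EngineGuardrailViolation err cost
  | otherwise = EngineError err cost

main :: IO ()
main = do
  print (classify 12.5 (Left "Guardrail: too many edit_file failures"))
  print (classify 0 (Left "OPENROUTER_API_KEY not set"))
  print (classify 3 (Right "done"))
```

Matching on a string prefix is fragile; having the engine return a typed guardrail value would be sturdier, but that would mean changing `runAgent` rather than only the worker.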
Let me check if there's a function to add comments to tasks:
addComment|insertComment
./Omni/Task.hs:288: updatedTask <- addComment tid message Human
./Omni/Task.hs:886: updatedTask <- addComment (taskId task) "This is a test comment" Human
./Omni/Task.hs:895: _ <- addComment (taskId task) "First comment" Junior
./Omni/Task.hs:896: updatedTask <- addComment (taskId task) "Second comment" Human
./Omni/Task.hs:907: _ <- addComment (taskId task) "Persisted comment" Junior
./Omni/Task/Core.hs:746:addComment :: Text -> Text -> CommentAuthor -> IO Task
./Omni/Task/Core.hs:747:addComment tid commentTextContent author =
./Omni/Jr.hs:235: _ <- TaskCore.addComment tid conflictComment TaskCore.Junior
./Omni/Jr.hs:422: _ <- TaskCore.addComment tid reviewComment TaskCore.Junior
./Omni/Jr.hs:437: _ <- TaskCore.addComment tid reviewComment TaskCore.Junior
./Omni/Jr.hs:503: _ <- TaskCore.addComment tid acceptComment TaskCore.Human
./Omni/Jr.hs:515: _ <- TaskCore.addComment tid rejectComment TaskCore.Human
./Omni/Jr.hs:623: _ <- TaskCore.addComment tid ("## Completion Summary\n\n" <> summary) TaskCore.Junior
./Omni/Jr.hs:756: _ <- TaskCore.addComment epicId summary TaskCore.Junior
./Omni/Jr/Web.hs:2979: _ <- liftIO (TaskCore.addComment tid commentText TaskCore.Human)
Perfect! There's TaskCore.addComment :: Text -> Text -> CommentAuthor -> IO Task. Now I need to modify the code. Let me create a clear plan:
1. Change runWithEngine return type to distinguish guardrail violations from other errors
2. Update processTask to check if error is a guardrail violation and handle it appropriately
3. Add a comment with the guardrail message and set status to NeedsHelp
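The plan above can be sketched as a pure decision function that the IO code in `processTask` would then act on. Everything here is illustrative: `Outcome` and `decide` are hypothetical names, the comment wording is invented, and the real code would call `TaskCore.addComment` and `TaskCore.updateTaskStatusWithActor` rather than return a value.

```haskell
-- Same shape as the EngineResult type used in the worker (String for Text).
data EngineResult
  = EngineSuccess String Double
  | EngineGuardrailViolation String Double
  | EngineError String Double
  deriving (Eq, Show)

-- | Hypothetical description of what the worker should do next.
data Outcome
  = ProceedToCommit String -- agent output, used for the commit message
  | StopForHelp String     -- comment to attach; status becomes NeedsHelp
  | LeaveInProgress String -- message to log; status is left unchanged
  deriving (Eq, Show)

-- | Map each engine result to exactly one follow-up action.
-- A guardrail violation never leads back to a retry.
decide :: EngineResult -> Outcome
decide (EngineSuccess output _) = ProceedToCommit output
decide (EngineGuardrailViolation msg _) = StopForHelp ("Agent stopped: " <> msg)
decide (EngineError msg _) = LeaveInProgress msg

main :: IO ()
main = do
  print (decide (EngineGuardrailViolation "Guardrail: duplicate tool calls" 42))
  print (decide (EngineError "connection reset" 1))
```

Keeping the decision separate from the side effects makes the "guardrail means stop" rule easy to unit test without a task database.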
Let me start implementing:
Omni/Agent/Worker.hs
Omni/Agent/Worker.hs
Now I need to update the caller in processTask:
Omni/Agent/Worker.hs
Now I need to handle the rest of the case branches:
Omni/Agent/Worker.hs @109-178
109: say "[worker] Starting engine..."
110: engineResult <- runWithEngine worker repo task
111:
112: endTime <- Data.Time.getCurrentTime
113:
114: -- Update the activity record with metrics (convert Double to Int by rounding)
115: let costCents = case engineResult of
116: EngineSuccess _ c -> c
117: EngineGuardrailViolation _ c -> c
118: EngineError _ c -> c
119: TaskCore.updateActivityMetrics activityId Nothing (Just endTime) (Just (round costCents)) Nothing
120:
121: case engineResult of
122: EngineSuccess output _ -> do
123: say "[worker] Agent completed successfully"
124: TaskCore.logActivity tid TaskCore.Reviewing Nothing
125: say "[worker] Running formatters..."
126: _ <- runFormatters repo
127:
128: -- Try to commit (this runs git hooks which may fail)
129: let commitMsg = formatCommitMessage task output
130: say "[worker] Attempting commit..."
131: commitResult <- tryCommit repo commitMsg
132:
133: case commitResult of
134: CommitFailed commitErr -> do
135: say ("[worker] Commit failed: " <> commitErr)
136:
137: -- Save failure context and reopen task for retry
138: maybeCtx <- TaskCore.getRetryContext tid
139: let attempt = maybe 1 (\c -> TaskCore.retryAttempt c + 1) maybeCtx
140:
141: if attempt > 3
142: then do
143: say "[worker] Task failed 3 times, needs human intervention"
144: TaskCore.logActivity tid TaskCore.Failed (Just (toMetadata [("reason", "max_retries_exceeded")]))
145: TaskCore.updateTaskStatusWithActor tid TaskCore.Open [] TaskCore.Junior
146: else do
147: let currentReason = "attempt " <> tshow attempt <> ": commit_failed: " <> commitErr
148: let accumulatedReason = case maybeCtx of
149: Nothing -> currentReason
150: Just ctx -> TaskCore.retryReason ctx <> "\n" <> currentReason
151: TaskCore.setRetryContext
152: TaskCore.RetryContext
153: { TaskCore.retryTaskId = tid,
154: TaskCore.retryOriginalCommit = "",
155: TaskCore.retryConflictFiles = [],
156: TaskCore.retryAttempt = attempt,
157: TaskCore.retryReason = accumulatedReason,
158: TaskCore.retryNotes = maybeCtx +> TaskCore.retryNotes
159: }
160: TaskCore.logActivity tid TaskCore.Retrying (Just (toMetadata [("attempt", tshow attempt)]))
161: TaskCore.updateTaskStatusWithActor tid TaskCore.Open [] TaskCore.Junior
162: say ("[worker] Task reopened (attempt " <> tshow attempt <> "/3)")
163: NoChanges -> do
164: -- No changes = task already implemented, mark as Done
165: say "[worker] No changes to commit - task already done"
166: TaskCore.clearRetryContext tid
167: TaskCore.logActivity tid TaskCore.Completed (Just (toMetadata [("result", "no_changes")]))
168: TaskCore.updateTaskStatusWithActor tid TaskCore.Done [] TaskCore.Junior
169: say ("[worker] ✓ Task " <> tid <> " -> Done (no changes)")
170: unless quiet <| AgentLog.update (\s -> s {AgentLog.statusTask = Nothing})
171: CommitSuccess -> do
172: -- Commit succeeded, set to Review
173: TaskCore.logActivity tid TaskCore.Completed (Just (toMetadata [("result", "committed")]))
174: TaskCore.updateTaskStatusWithActor tid TaskCore.Review [] TaskCore.Junior
175: say ("[worker] ✓ Task " <> tid <> " -> Review")
176: unless quiet <| AgentLog.update (\s -> s {AgentLog.statusTask = Nothing})
177: Exit.ExitFailure code -> do
178: say ("[worker] Engine failed with code " <> tshow code)
Omni/Agent/Worker.hs
Let me re-read the exact text:
Omni/Agent/Worker.hs @174-182
174: TaskCore.updateTaskStatusWithActor tid TaskCore.Review [] TaskCore.Junior
175: say ("[worker] ✓ Task " <> tid <> " -> Review")
176: unless quiet <| AgentLog.update (\s -> s {AgentLog.statusTask = Nothing})
177: Exit.ExitFailure code -> do
178: say ("[worker] Engine failed with code " <> tshow code)
179: TaskCore.logActivity tid TaskCore.Failed (Just (toMetadata [("exit_code", tshow code)]))
180: -- Don't set back to Open here - leave in InProgress for debugging
181: say "[worker] Task left in InProgress (engine failure)"
182:
Omni/Agent/Worker.hs
Let me read more context:
Omni/Agent/Worker.hs @171-182
171: CommitSuccess -> do
172: -- Commit succeeded, set to Review
173: TaskCore.logActivity tid TaskCore.Completed (Just (toMetadata [("result", "committed")]))
174: TaskCore.updateTaskStatusWithActor tid TaskCore.Review [] TaskCore.Junior
175: say ("[worker] ✓ Task " <> tid <> " -> Review")
176: unless quiet <| AgentLog.update (\s -> s {AgentLog.statusTask = Nothing})
177: Exit.ExitFailure code -> do
178: say ("[worker] Engine failed with code " <> tshow code)
179: TaskCore.logActivity tid TaskCore.Failed (Just (toMetadata [("exit_code", tshow code)]))
180: -- Don't set back to Open here - leave in InProgress for debugging
181: say "[worker] Task left in InProgress (engine failure)"
182:
I see - the issue is that Exit.ExitFailure is from the old code. Let me try with the exact string including escaping:
Omni/Agent/Worker.hs
Let me check the special characters more carefully:
sed -n '175,181p' Omni/Agent/Worker.hs | cat -A
say ("[worker] M-bM-^\M-^S Task " <> tid <> " -> Review")$
unless quiet <| AgentLog.update (\s -> s {AgentLog.statusTask = Nothing})$
Exit.ExitFailure code -> do$
say ("[worker] Engine failed with code " <> tshow code)$
TaskCore.logActivity tid TaskCore.Failed (Just (toMetadata [("exit_code", tshow code)]))$
-- Don't set back to Open here - leave in InProgress for debugging$
say "[worker] Task left in InProgress (engine failure)"$
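For reference, the `M-bM-^\M-^S` in the output above is how `cat -A` prints the three UTF-8 bytes of the ✓ check mark (U+2713) in the log string, which is presumably why an edit that typed the character differently would not match. A small sketch that reproduces the rendering; `utf8Bytes` and `showByte` are hypothetical helpers, and `showByte` is a simplified model that ignores tab and newline handling.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)

-- | UTF-8 encode a single character into its byte values.
utf8Bytes :: Char -> [Int]
utf8Bytes c
  | n < 0x80 = [n]
  | n < 0x800 = [0xC0 .|. shiftR n 6, cont n]
  | n < 0x10000 = [0xE0 .|. shiftR n 12, cont (shiftR n 6), cont n]
  | otherwise =
      [0xF0 .|. shiftR n 18, cont (shiftR n 12), cont (shiftR n 6), cont n]
  where
    n = ord c
    -- Continuation byte: 10xxxxxx carrying the low six bits.
    cont x = 0x80 .|. (x .&. 0x3F)

-- | Simplified model of how `cat -v` (implied by `cat -A`) prints one byte:
-- high-bit bytes get an "M-" prefix, control bytes use caret notation.
showByte :: Int -> String
showByte b
  | b >= 0x80 = "M-" ++ showByte (b - 0x80)
  | b == 0x7F = "^?"
  | b < 0x20 = ['^', chr (b + 0x40)]
  | otherwise = [chr b]

main :: IO ()
main = do
  let bytes = utf8Bytes '\x2713' -- the check mark in the log message
  print bytes                    -- [226,156,147]
  putStrLn (concatMap showByte bytes) -- M-bM-^\M-^S
```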
There's a special character. Let me just replace from Exit.ExitFailure onwards:
Omni/Agent/Worker.hs
Good! Now I need to also check if there's a Nothing -> ... case for the API key handling that I need to update:
Omni/Agent/Worker.hs @218-222
218: Exit.ExitFailure 1 -> do
219: -- There are changes, commit them
220: let commitCmd = (Process.proc "git" ["commit", "-m", Text.unpack msg]) {Process.cwd = Just repo}
221: (commitCode, _, commitErr) <- Process.readCreateProcessWithExitCode commitCmd ""
222: case commitCode of
Let me find the right place:
Omni/Agent/Worker.hs @228-235
228: = EngineSuccess Text Double -- output, cost
229: | EngineGuardrailViolation Text Double -- error message, cost
230: | EngineError Text Double -- error message, cost
231:
232: -- | Run task using native Engine
233: -- Returns engine result with output/error and cost
234: runWithEngine :: Core.Worker -> FilePath -> TaskCore.Task -> IO EngineResult
235: runWithEngine worker repo task = do
Omni/Agent/Worker.hs @235-242
235: runWithEngine worker repo task = do
236: -- Read API key from environment
237: maybeApiKey <- Env.lookupEnv "OPENROUTER_API_KEY"
238: case maybeApiKey of
239: Nothing -> pure (Exit.ExitFailure 1, "OPENROUTER_API_KEY not set", 0)
240: Just apiKey -> do
241: -- Check for retry context
242: maybeRetry <- TaskCore.getRetryContext (TaskCore.taskId task)
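Line 239 still returns the old `(ExitCode, Text, Double)` triple, which no longer fits the `IO EngineResult` signature on line 234. A minimal sketch of the likely fix, assuming a missing key should count as an ordinary engine error with zero cost; `requireApiKey` is a hypothetical helper and `String` stands in for `Text`.

```haskell
import System.Environment (lookupEnv)

-- Same shape as the EngineResult type on lines 228-230 (String for Text).
data EngineResult
  = EngineSuccess String Double
  | EngineGuardrailViolation String Double
  | EngineError String Double
  deriving (Eq, Show)

-- | Hypothetical helper: a missing key becomes an EngineError with zero cost,
-- not a guardrail violation, so the task stays InProgress.
requireApiKey :: Maybe String -> Either EngineResult String
requireApiKey Nothing = Left (EngineError "OPENROUTER_API_KEY not set" 0)
requireApiKey (Just key) = Right key

main :: IO ()
main = do
  maybeKey <- lookupEnv "OPENROUTER_API_KEY"
  case requireApiKey maybeKey of
    Left failure -> print failure
    Right _ -> putStrLn "API key found"
```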
{"contents":[208.02210000000002,200],"tag":"GuardrailCostExceeded"}