Add web search and page reading tools so jr can look up documentation and examples.
Search the web and return results with titles, URLs, and snippets.
data WebSearchArgs = WebSearchArgs
  { wsQuery :: Text -- Search query
  , wsMaxResults :: Maybe Int -- Max results (default: 5)
  }
  deriving (Show, Eq, Generic)
Implementation: Use DuckDuckGo HTML search (no API key needed):
curl -s "https://html.duckduckgo.com/html/?q=<query>" | extract results
Or use the ddgr command-line tool if available:
ddgr --json -n 5 "<query>"
Fallback: Use curl to fetch DuckDuckGo HTML and parse with simple regex/text processing.
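As a minimal sketch of how the query could be turned into the curl URL and the ddgr argument list shown above: the names percentEncode, searchUrl, ddgrArgs, and defaultMaxResults are hypothetical helpers, not part of any existing module, and the encoding assumes every byte outside the RFC 3986 unreserved set should be percent-escaped.

```haskell
import qualified Data.ByteString as BS
import Data.Char (chr, isAlphaNum, isAscii)
import Data.Maybe (fromMaybe)
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import Data.Word (Word8)
import Text.Printf (printf)

-- | Percent-encode the UTF-8 bytes of the query, leaving only the
-- RFC 3986 unreserved characters (letters, digits, "-._~") as-is.
percentEncode :: Text -> String
percentEncode = concatMap encodeByte . BS.unpack . TE.encodeUtf8
  where
    encodeByte :: Word8 -> String
    encodeByte w
      | isUnreserved c = [c]
      | otherwise      = printf "%%%02X" w
      where
        c = chr (fromIntegral w)
    isUnreserved c = isAscii c && (isAlphaNum c || c `elem` "-._~")

-- | Default result count, taken from the "default: 5" comment on wsMaxResults.
defaultMaxResults :: Int
defaultMaxResults = 5

-- | URL for the DuckDuckGo HTML endpoint used in the curl command.
searchUrl :: Text -> String
searchUrl q = "https://html.duckduckgo.com/html/?q=" ++ percentEncode q

-- | Argument list for: ddgr --json -n <count> "<query>"
-- (hypothetical helper; falls back to defaultMaxResults when no count is given).
ddgrArgs :: Text -> Maybe Int -> [String]
ddgrArgs q mMax =
  ["--json", "-n", show (fromMaybe defaultMaxResults mMax), T.unpack q]
```

Passing the arguments as a list rather than a shell string avoids quoting problems when the query contains spaces or quotes. For example, searchUrl (T.pack "haskell aeson") yields "https://html.duckduckgo.com/html/?q=haskell%20aeson".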
Return format:
{
"success": true,
"output": "1. Title - URL\n Snippet...\n\n2. Title - URL\n Snippet..."
}
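The "output" string above could be produced along the following lines. This is only a sketch: SearchResult and formatResults are hypothetical names, String is used for brevity where the argument records use Text, and the single-space indent before each snippet simply mirrors the sample output.

```haskell
import Data.List (intercalate)

-- | One parsed search hit (hypothetical type, not defined in the spec).
data SearchResult = SearchResult
  { srTitle   :: String
  , srUrl     :: String
  , srSnippet :: String
  }
  deriving (Show, Eq)

-- | Render results as "1. Title - URL\n Snippet", separated by blank lines.
formatResults :: [SearchResult] -> String
formatResults = intercalate "\n\n" . zipWith formatOne [1 :: Int ..]
  where
    formatOne i r =
      show i ++ ". " ++ srTitle r ++ " - " ++ srUrl r ++ "\n " ++ srSnippet r
```

An empty result list renders as the empty string, so the caller may want to substitute a short "no results" message in that case.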
Fetch a URL and convert to readable text.
data ReadWebPageArgs = ReadWebPageArgs
  { rwpUrl :: Text -- URL to fetch
  , rwpObjective :: Maybe Text -- Optional: focus extraction on this goal
  }
  deriving (Show, Eq, Generic)
Implementation:
1. Fetch URL with curl: curl -sL "<url>"
2. Convert HTML to text. Options:
   - pandoc -f html -t plain (available in NixOS)
   - lynx -dump (simpler)
   - w3m -dump
3. If objective is provided, truncate to ~8000 chars to avoid context overflow
4. Return the text content
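The steps above can be sketched with System.Process, assuming curl and pandoc are on PATH. The names maxChars, limitOutput, and readWebPage are hypothetical, String is used instead of Text for brevity, and the truncation follows step 3 literally (it applies only when an objective is given).

```haskell
import System.Process (readProcess)

-- | Character budget from step 3 (the spec says "~8000 chars").
maxChars :: Int
maxChars = 8000

-- | Step 3 as written: truncate only when an objective is provided.
limitOutput :: Maybe String -> String -> String
limitOutput (Just _) txt = take maxChars txt
limitOutput Nothing  txt = txt

-- | Steps 1, 2, and 4: fetch with curl, convert with pandoc, return text.
-- readProcess throws an exception if either command exits non-zero,
-- so a real tool handler would likely catch that and report failure.
readWebPage :: String -> Maybe String -> IO String
readWebPage url objective = do
  html <- readProcess "curl" ["-sL", url] ""
  text <- readProcess "pandoc" ["-f", "html", "-t", "plain"] html
  pure (limitOutput objective text)
```

Keeping limitOutput pure makes the truncation rule easy to test without network access. Swapping pandoc for lynx or w3m would only change the second readProcess call.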
{
"type": "object",
"properties": {
"query": { "type": "string", "description": "Search query" },
"max_results": { "type": "integer", "description": "Maximum results (default: 5)" }
},
"required": ["query"]
}
{
"type": "object",
"properties": {
"url": { "type": "string", "description": "URL to fetch and read" },
"objective": { "type": "string", "description": "Optional: focus on content relevant to this goal" }
},
"required": ["url"]
}
1. Test web_search schema is valid
2. Test read_web_page with a simple URL like "https://example.com"
Prefer pandoc for HTML-to-text conversion, as it handles most sites well.