Testing

Write and run tests for Haskell and Python code. Use when adding tests, verifying behavior, or debugging test failures.

Required Companion Skills

Before writing implementation for tested behavior, also load:

Process

  1. Identify the relevant namespace and the test entrypoint.
  2. Run bild --test <namespace> to build and execute tests.
  3. If tests fail, isolate a single case and reproduce locally.
  4. Fix the code or tests, then re-run the suite.
  5. Add new tests for regressions before finishing.

Running Tests

bild --test Omni/Task.hs    # Build and run tests
_/bin/task test             # Run tests on existing binary

Test Convention

All programs follow the same convention: if the first argument is test, run the test suite instead of the normal entrypoint.

import System.Environment (getArgs)

main :: IO ()
main = do
  args <- getArgs
  case args of
    ["test"] -> runTests
    _ -> normalMain
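
The same convention applies to Python programs. A minimal sketch (run_tests and normal_main are hypothetical stand-ins for the real entrypoints):

```python
import sys

def run_tests():
    # Hypothetical test entrypoint
    return "tests"

def normal_main():
    # Hypothetical normal entrypoint
    return "normal"

def main(argv=None):
    # Mirror of the Haskell convention: a first argument of "test"
    # runs the suite; anything else runs the program normally.
    argv = sys.argv[1:] if argv is None else argv
    if argv[:1] == ["test"]:
        return run_tests()
    return normal_main()
```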

Writing Tests (Haskell)

Use HUnit:

-- : dep HUnit

import Control.Monad (when)
import System.Exit (exitFailure)
import Test.HUnit

runTests :: IO ()
runTests = do
  results <- runTestTT $ TestList
    [ "parse valid input" ~: parseInput "foo" ~?= Right Foo
    , "parse empty fails" ~: parseInput "" ~?= Left "empty input"
    , "handles unicode" ~: parseInput "日本語" ~?= Right (Text "日本語")
    ]
  when (failures results > 0) exitFailure

Test helpers

{-# LANGUAGE ScopedTypeVariables #-}

import Control.Exception (SomeException, finally, try)
import System.Directory (removeFile)
import System.IO.Temp (emptySystemTempFile)
import Test.HUnit (assertFailure)

-- Assert that an action throws any exception
assertThrows :: IO a -> IO ()
assertThrows action = do
  result <- try action
  case result of
    Left (_ :: SomeException) -> pure ()
    Right _ -> assertFailure "Expected exception"

-- Run an action with a temporary file, removing the file afterwards
withTempFile :: (FilePath -> IO a) -> IO a
withTempFile action = do
  path <- emptySystemTempFile "test"
  action path `finally` removeFile path

Writing Tests (Python)

Use pytest:

# : dep pytest

import pytest

def test_parse_valid():
    assert parse_input("foo") == Foo()

def test_parse_empty_fails():
    with pytest.raises(ValueError):
        parse_input("")

def test_handles_unicode():
    assert parse_input("日本語") == Text("日本語")

Fixtures

@pytest.fixture
def temp_db():
    db = create_temp_database()
    yield db
    db.cleanup()

def test_with_db(temp_db):
    temp_db.insert({"key": "value"})
    assert temp_db.get("key") == "value"

What to Test

Do test

  - Public behavior: inputs, outputs, and error paths callers depend on
  - Edge cases: empty input, boundaries, unicode
  - Every bug you fix (keep the regression test)

Don't over-test

  - Trivial code with no logic (getters, constant returns)
  - Third-party library behavior
  - Implementation details that a refactor would invalidate

Test-Driven Debugging

When something breaks:

  1. Write a test that reproduces the bug
  2. Verify test fails
  3. Fix the code
  4. Verify test passes
  5. Keep the test (prevents regression)
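
A minimal pytest sketch of that cycle, using a hypothetical slugify bug as the example:

```python
# Step 1: a regression test reproducing a (hypothetical) reported bug --
# slugify used to drop non-ASCII words entirely. Written first, it fails
# against the buggy implementation (step 2).
def test_slugify_keeps_unicode():
    assert slugify("Hello 日本語") == "hello-日本語"

# Steps 3-4: the fixed implementation makes the test pass. The test stays
# in the suite (step 5) so the bug cannot silently return.
def slugify(title):
    return "-".join(title.lower().split())
```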

Test Organization

runTests :: IO ()
runTests = do
  results <- runTestTT $ TestList
    [ -- Group related tests
      TestLabel "Parsing" $ TestList
        [ "valid input" ~: ...
        , "invalid input" ~: ...
        ]
    , TestLabel "Serialization" $ TestList
        [ "round trip" ~: ...
        ]
    ]
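
pytest groups tests similarly using classes. A sketch with hypothetical parse/serialize helpers standing in for real code under test:

```python
# Hypothetical functions under test
def parse(s):
    return s.strip().split()

def serialize(tokens):
    return " ".join(tokens)

# Group related tests in classes, mirroring the TestLabel groups above
class TestParsing:
    def test_valid_input(self):
        assert parse(" a b ") == ["a", "b"]

    def test_empty_input(self):
        assert parse("") == []

class TestSerialization:
    def test_round_trip(self):
        assert parse(serialize(["a", "b"])) == ["a", "b"]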

Common Test Patterns

Property-based (Haskell)

-- : dep QuickCheck

import Test.QuickCheck

prop_roundTrip :: Text -> Bool
prop_roundTrip t = decode (encode t) == Just t

runTests :: IO ()
runTests = quickCheck prop_roundTrip
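
The same round-trip property can be checked in Python. This sketch hand-rolls a few sample inputs with json as a stand-in codec; a property-testing library such as Hypothesis would generate the inputs instead:

```python
import json

# Hypothetical codec under test; json round-trips strings losslessly
def encode(value):
    return json.dumps(value)

def decode(blob):
    return json.loads(blob)

def test_round_trip_property():
    # Hand-picked samples covering empty, ASCII, unicode, and escapes
    for sample in ["", "foo", "日本語", "line\nbreak", '"quoted"']:
        assert decode(encode(sample)) == sample
```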

Parameterized (Python)

@pytest.mark.parametrize("input,expected", [
    ("foo", Foo()),
    ("bar", Bar()),
    ("", None),
])
def test_parse(input, expected):
    assert parse(input) == expected

Mocking

from unittest.mock import patch

def test_api_call():
    with patch('module.requests.get') as mock_get:
        mock_get.return_value.json.return_value = {"key": "value"}
        result = fetch_data()
        assert result == {"key": "value"}

Debugging Test Failures

  1. Run single test: pytest test_file.py::test_name -v
  2. Add print statements
  3. Check test isolation (does it pass alone but fail with others?)
  4. Check for shared mutable state
  5. Check for timing/race conditions