Research: Bayesian calculus for prompt composition

t-398 · WorkTask · Created 3 months ago · Updated 3 months ago

Description


Develop a theoretical framework for how prompts compose as priors.

Core Question

If prompts are priors, what's the posterior of compose(prompt1, prompt2)?
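
One concrete reading, as an equation (a framing assumption, not settled by this task): if prompt1 fixes a prior p(θ | prompt1) over behaviors θ and prompt2 enters as evidence, then

p(θ | prompt1, prompt2) ∝ p(prompt2 | θ) · p(θ | prompt1)

The frameworks below differ in exactly this choice: prompt2 as evidence (update), as a second expert (product), or as an alternative hypothesis (mixture).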

Intuitions to Formalize

1. The system prompt sets a prior distribution over behaviors
2. Each message is a Bayesian update
3. Composing prompts should follow probability laws
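
A minimal Haskell sketch of these intuitions over a toy discrete behavior space (Behavior, Dist, and the update rule are illustrative assumptions, not part of the task):

import qualified Data.Map.Strict as M

type Behavior = String
type Dist = M.Map Behavior Double   -- distribution over behaviors

normalize :: Dist -> Dist
normalize d = M.map (/ s) d where s = sum (M.elems d)

-- Intuition 2: treat each message as a likelihood over behaviors and
-- apply Bayes' rule, posterior ∝ likelihood * prior.
update :: Dist -> Dist -> Dist
update prior msgLikelihood = normalize (M.intersectionWith (*) prior msgLikelihood)

-- Intuition 1: the system prompt sets the prior; a conversation is then a
-- left fold of updates. Intuition 3 asks which laws this fold obeys.
runConversation :: Dist -> [Dist] -> Dist
runConversation = foldl update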

Potential Frameworks

1. Hierarchical Bayes: prompt1 is hyperprior, prompt2 is prior

  • compose(system, skill) = hierarchical model

2. Product of Experts: each prompt is a constraint (frameworks 2 and 3 are sketched in code after this list)

  • compose(A, B) ~ p(x|A) * p(x|B) / Z

3. Mixture Models: prompts as mixture components

  • compose(A, B) = α*p(x|A) + (1-α)*p(x|B)

4. Information Geometry: prompts as points on manifold

  • compose = geodesic between prompts?
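
A self-contained sketch of frameworks 2 and 3 on the same toy behavior space (taskPrompt, stylePrompt, and the numbers are made up for illustration):

import qualified Data.Map.Strict as M

type Dist = M.Map String Double

normalize :: Dist -> Dist
normalize d = M.map (/ s) d where s = sum (M.elems d)

-- Framework 2, product of experts: p(x|A) * p(x|B) / Z. A behavior must
-- be plausible under *both* prompts to keep mass.
poe :: Dist -> Dist -> Dist
poe a b = normalize (M.intersectionWith (*) a b)

-- Framework 3, mixture: α*p(x|A) + (1-α)*p(x|B). A behavior plausible
-- under *either* prompt keeps mass.
mixture :: Double -> Dist -> Dist -> Dist
mixture alpha a b = M.unionWith (+) (M.map (alpha *) a) (M.map ((1 - alpha) *) b)

taskPrompt, stylePrompt :: Dist
taskPrompt  = M.fromList [("answer", 0.7), ("refuse", 0.1), ("chat", 0.2)]
stylePrompt = M.fromList [("answer", 0.4), ("refuse", 0.1), ("chat", 0.5)]

main :: IO ()
main = do
  print (poe taskPrompt stylePrompt)         -- sharpens shared mass on "answer"
  print (mixture 0.5 taskPrompt stylePrompt) -- averages the two

The qualitative difference this exposes: the product concentrates, the mixture hedges. Which one observed LLM behavior resembles is exactly research question 1 below.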

Research Questions

1. Which framework matches empirical behavior?
2. Can we derive composition laws that predict behavior? (see the associativity note after this list)
3. What's the 'type theory' of prompts? (if A : Task and B : Style, what's A <> B?)
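
On question 2, one law can be settled on paper before any experiments (a property of the candidate formulas above, not yet of LLM behavior): product of experts is associative, since both bracketings are proportional to p(x|A) · p(x|B) · p(x|C). A fixed-α mixture is not:

compose(compose(A, B), C) = α²·p(x|A) + α(1-α)·p(x|B) + (1-α)·p(x|C)
compose(A, compose(B, C)) = α·p(x|A)  + α(1-α)·p(x|B) + (1-α)²·p(x|C)

These agree only for α ∈ {0, 1} (or coinciding distributions), so measuring whether composition of real prompts is order-insensitive already discriminates between frameworks 2 and 3.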

Validation

  • Derive predictions from theory
  • Test predictions empirically
  • Iterate

Connection to Existing Work

  • 'Bayesian Geometry of Transformer Attention' - foundation
  • Hierarchical Bayesian modeling - established theory
  • Information geometry of neural networks - active research

Notes

This is foundational research. High risk, high reward. If successful, it enables principled prompt engineering instead of trial and error.

Timeline (3)

🔄 [human] Open → InProgress · 3 months ago
💬 [human] · 3 months ago

Connection to Prompt IR (from t-477 design session)

The Prompt IR design includes explicit support for Bayesian composition via CompositionMode:

data CompositionMode
  = Hierarchical    -- Hyperprior (system prompt, base instructions)
  | Constraint      -- Product-of-experts (must satisfy)
  | Additive        -- Mixture (adds info, can be dropped)
  | Contextual      -- Bayesian update (observation shifting posterior)

Mapping to your frameworks:

  • Hierarchical → Hierarchical Bayes (system prompt is hyperprior)
  • Constraint → Product of Experts (each section is a constraint, compose = multiply)
  • Additive → Mixture Models (α*p(x|A) + (1-α)*p(x|B))
  • Contextual → Sequential Bayesian updates

Composition operation:

compose :: PromptIR -> PromptIR -> PromptIR
compose a b = PromptIR
  { pirSections = mergeSections (pirSections a) (pirSections b)
  , ...
  }
  where
    -- Hierarchical sections from 'a' come first (hyperpriors)
    -- Constraint sections are AND'd (product of experts)
    -- Additive sections are collected (mixture)
    -- Contextual sections are sequenced (Bayesian updates)
    mergeSections = ...

This gives us a principled compose that respects the probabilistic semantics of each section type.
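
A minimal self-contained sketch of one possible mergeSections, assuming each section carries its CompositionMode (Section, secMode, and secBody are hypothetical stand-ins for the actual t-477 types):

data CompositionMode = Hierarchical | Constraint | Additive | Contextual
  deriving (Eq, Show)

data Section = Section { secMode :: CompositionMode, secBody :: String }
  deriving (Show)

-- Realizes the ordering described in the comments above: hyperpriors first
-- ('a' before 'b' within each group), then constraints, then additive
-- sections, then contextual updates in arrival order. The probabilistic
-- operations themselves (multiply, mix, update) would live in whatever
-- interprets the merged IR.
mergeSections :: [Section] -> [Section] -> [Section]
mergeSections a b =
  concatMap ofMode [Hierarchical, Constraint, Additive, Contextual]
  where
    ofMode m = filter ((== m) . secMode) (a ++ b)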

Research questions this enables:

1. Empirically test which CompositionMode matches observed LLM behavior
2. Derive composition laws: compose (compose A B) C == compose A (compose B C)? (a property-test sketch follows)
3. What's the "type theory"? If A : Hierarchical and B : Additive, what's compose A B?
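
For question 2, associativity of a candidate compose can be property-tested directly. A sketch with QuickCheck for the product-of-experts case, over unnormalized weight vectors (Dist, genDist, and the tolerance are illustrative choices):

import Test.QuickCheck

type Dist = [Double]  -- weights over a fixed finite behavior space

normalize :: Dist -> Dist
normalize xs = map (/ s) xs where s = sum xs

-- Product of experts: pointwise product, renormalized.
poe :: Dist -> Dist -> Dist
poe a b = normalize (zipWith (*) a b)

approxEq :: Dist -> Dist -> Bool
approxEq a b = and (zipWith (\x y -> abs (x - y) < 1e-9) a b)

-- Strictly positive weights avoid dividing by zero after products.
genDist :: Gen Dist
genDist = vectorOf 4 (choose (0.1, 1.0))

prop_poeAssoc :: Property
prop_poeAssoc =
  forAll genDist $ \a -> forAll genDist $ \b -> forAll genDist $ \c ->
    poe (poe a b) c `approxEq` poe a (poe b c)

main :: IO ()
main = quickCheck prop_poeAssoc  -- expected to pass; the fixed-α mixture analogue should fail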