Probabilistic Routing
LLMs are inherently stochastic. Even with Cognitive Type Safety (infer Struct), an LLM might return an answer that structurally conforms to the required schema but is factually or contextually uncertain. Turn solves this problem at the language level via Probabilistic Routing.
In Turn, inference is not just about obtaining a typed result. It is also about managing the model's confidence. The confidence operator allows you to access the underlying probability of any variable and guarantee a deterministic fallback path if the provider's reported confidence falls below your specified threshold.
The confidence Operator
The confidence operator exposes an explicit confidence boundary. It requires the LLM provider to report a float in [0.0, 1.0] representing its log-probability certainty for the response. If that value fails to clear your threshold, you can use standard control flow to abort the primary assignment and fall back safely.
```
struct Sentiment {
    score: Num,
    reasoning: Str
};

// Execute inference
let analysis = infer Sentiment {
    "Categorize the following sentiment: 'It was okay, I guess.'";
};

// Route execution if the model is less than 80% confident
if confidence analysis < 0.80 {
    // Deterministic fallback
    return Sentiment {
        score: 0.5,
        reasoning: "Automated Fallback: Model uncertainty triggered."
    };
}

call("echo", "Final Reasoning: " + analysis.reasoning);
```

TIP
The fallback block is completely deterministic. It executes natively in the Turn VM (zero network latency) and enforces structural type safety: the block must produce a value that strictly conforms to the structural type expected by the originating infer statement.
Handling Total Provider Failures
Probabilistic Routing isn't just for low-confidence scores; you can use it to build a safety net for the entire stochastic operation. The conditional block is triggered under any of the following failure scenarios:
- Provider Confidence Failure: The returned score is explicitly below your scalar threshold (e.g. < 0.80).
- Schema Coercion Failure: After exhausting all automatic retry loops, the LLM continues to hallucinate incorrect JSON shapes.
- Provider Outage: The upstream LLM API goes down, rate-limits the request, or returns a non-200 HTTP code.
Instead of writing verbose try/catch or match blocks for network I/O, Turn treats all non-deterministic failure states interchangeably with low-confidence responses via the same `if confidence < threshold` check.
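A single guard can therefore absorb all three failure scenarios. Below is a minimal sketch, assuming the same syntax as the examples above; the `Summary` struct and the prompt are illustrative, not part of any standard library:

```
struct Summary { text: Str };

let summary = infer Summary {
    "Summarize the attached incident report in two sentences.";
};

// One branch covers a low score, exhausted schema retries,
// and a provider outage alike: all surface as low confidence.
if confidence summary < 0.80 {
    return Summary { text: "Summary unavailable: deferred to human review." };
}

call("echo", summary.text);
```

Whether the provider timed out or merely hedged, the caller sees exactly one of two outcomes: a validated `Summary` or the deterministic fallback.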
Untyped Fallbacks
The confidence operator also applies seamlessly to untyped infer calls (raw Str results).
```
// If the model cannot generate a suitable string with 90% confidence
let joke = infer Str {
    "Tell me a highly original joke about WebAssembly.";
};

if confidence joke < 0.90 {
    return "I couldn't think of a good Wasm joke. I guess it got lost in the binary translation.";
}

call("echo", joke);
```

Why Language-Level Support?
In traditional frameworks, handling confidence branching requires manual orchestration:
- Asking the LLM to output a `"confidence_score"` field inside its JSON response.
- Writing manual `if/else` logic after parsing.
- Dealing with the edge case where the LLM forgets the score field or hallucinates a `"high"` string instead of a float.
Turn handles routing mechanically inside the Wasm Driver boundary. The host extracts top-level token probabilities and forces the branch natively. The contract is strict: either you get the guaranteed type above your confidence threshold, or you execute the exact fallback logic you dictated. There are no silent failures.
Confidence as a Reflection Trigger
Low confidence does not have to mean giving up. The more powerful pattern is using it as a trigger to run a second, deeper inference pass before acting. This is how Turn implements agent self-reflection without any prompt engineering hacks.
The key insight: instead of falling back to a hardcoded value when confidence is low, feed the first draft back into context and ask the model to critique and improve it.
```
struct Draft { content: Str };
struct Critique { flaws: List, improved_version: Str };
struct Final { content: Str, confidence_gained: Num };

let topic = "Write a concise risk summary for a Series B fintech investment.";

// Pass 1: Generate initial draft
let draft = infer Draft { topic; };

// Check confidence. If the model was uncertain, force a reflection pass.
if confidence draft < 0.75 {
    // Feed the draft into context so the model can see its own output
    context.append("Initial draft: " + draft.content);

    // Pass 2: Structured self-critique
    let critique = infer Critique {
        "Review the draft above. Identify weaknesses and produce an improved version.";
    };

    // Pass 3: Produce the final response informed by the critique
    context.append("Critique and improvements: " + critique.improved_version);
    let final_draft = infer Final { "Write the definitive version based on the critique."; };
    return final_draft.content;
}

// High confidence path: use the original draft directly
return draft.content;
```

This pattern has three properties that prompt engineering cannot match:
- The reflection is typed. The `Critique` struct forces the model to produce a list of discrete flaws and a concrete improved version. It cannot produce vague, unstructured introspection.
- The reflection is conditional. It only runs when confidence actually warrants it, not on every invocation. In a batch of 1,000 documents, only the uncertain ones pay the cost of a second pass.
- The reflection is auditable. Every variable is a named, typed value in the agent's memory. You can `remember("last_critique", critique)` and inspect it later with `turn inspect` to understand exactly why an agent took a second pass.
NOTE
For high-stakes applications, the Critic can be a separate concurrent actor with its own context and system prompt. See Concurrency and Actors for the full peer-reflection pattern using spawn_link.