Inference Providers
When you write infer in a Turn program, the VM does not construct provider-specific HTTP requests itself. Instead, it delegates request construction and response parsing to a Wasm inference driver: a sandboxed WebAssembly plugin that knows how to talk to a specific LLM provider.
This page explains how the system works, how to configure it, and how to write your own.
The Architecture
Turn's inference pipeline is built on a strict security boundary: the Turn VM is the only component that can access the network. The Wasm driver is purely computational.
Turn VM (Host)
│
│ (1) Passes Turn Inference Request JSON to Wasm module
▼
Wasm Driver (sandboxed)
│
│ (2) Returns HTTP Config JSON: URL, headers (with $env: templates), body
│ The driver CANNOT access the network or filesystem
▼
Turn VM (Host)
│
│ (3) Substitutes $env:OPENAI_API_KEY → real value from process environment
│ (4) Executes the HTTPS call via reqwest
│ (5) Passes raw HTTP response JSON back to Wasm module
▼
Wasm Driver (sandboxed)
│
│ (6) Parses HTTP response → structured Turn result JSON
▼
Turn VM (Host)
│
└──▶ result bound to the infer expression
Why this matters:
- A Wasm driver cannot read your SSH keys, scan your disk, or exfiltrate your API keys. The Wasm sandbox prevents all system calls.
- Credentials are never in driver code. The driver writes $env:OPENAI_API_KEY as a template string. The Host substitutes the real value before making the HTTP call.
- A single .wasm file runs everywhere. macOS, Linux, and Windows are all supported wherever the Turn VM runs. No native binary distribution per platform.
- Microsecond cold starts. Wasm modules initialize in under 100μs vs. 10–50ms for an OS subprocess.
A Wasm inference driver is a pure transformation function: JSON string → JSON string. It has no host imports. It cannot access the network, filesystem, environment variables, or system clock directly.
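Concretely, the first pass receives a request resembling the sketch below. The exact envelope is defined by the Turn VM; the field names here are inferred from the reference driver later on this page, which reads only params.prompt and params.schema, so treat this as illustrative rather than normative:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "params": {
    "prompt": "Summarize the following support ticket in one sentence.",
    "schema": {
      "type": "object",
      "properties": { "summary": { "type": "string" } }
    }
  }
}
```

The driver's only job is to rewrite this into an HTTP config for its provider; it never sees a network socket.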
Configuring a Provider
Set TURN_LLM_PROVIDER to specify the inference provider to use:
export TURN_LLM_PROVIDER=openai
export OPENAI_API_KEY=sk-...
turn run my_agent.tn
The provider path is resolved once at VM startup. All infer calls in the program use the same provider.
Official Providers
All official drivers are compiled to wasm32-unknown-unknown and available in the Turn repository under providers/:
Standard OpenAI
Connects to api.openai.com. Uses OpenAI's structured outputs (JSON Schema mode) for Cognitive Type Safety.
export TURN_LLM_PROVIDER=openai
export OPENAI_API_KEY=sk-...
export OPENAI_MODEL=gpt-4o # optional, default: gpt-4o
Azure OpenAI
Connects to your Azure OpenAI deployment endpoint.
export TURN_LLM_PROVIDER=azure_openai
export AZURE_OPENAI_ENDPOINT=https://my-resource.openai.azure.com
export AZURE_OPENAI_API_KEY=...
export AZURE_OPENAI_DEPLOYMENT=gpt-4o
Azure AI Foundry Anthropic
Connects to Anthropic's Claude via Azure AI Foundry (not the direct Anthropic API).
export TURN_LLM_PROVIDER=azure_anthropic
export AZURE_ANTHROPIC_ENDPOINT=https://my-foundry-resource.azure.com
export AZURE_ANTHROPIC_API_KEY=...
Anthropic
Connects directly to api.anthropic.com. Uses Anthropic's Messages API with structured system prompts for Cognitive Type Safety.
export TURN_LLM_PROVIDER=anthropic
export ANTHROPIC_API_KEY=sk-ant-...
export ANTHROPIC_MODEL=claude-3-5-sonnet-20241022 # optional, default: claude-3-5-sonnet-20241022
Google Gemini
Connects to generativelanguage.googleapis.com. Uses Gemini's generateContent API with system_instruction support for structured prompts.
export TURN_LLM_PROVIDER=gemini
export GEMINI_API_KEY=...
export GEMINI_MODEL=gemini-1.5-pro # optional, default: gemini-1.5-pro
xAI Grok
Connects to api.x.ai using the OpenAI-compatible chat completions endpoint.
export TURN_LLM_PROVIDER=grok
export XAI_API_KEY=...
export GROK_MODEL=grok-3 # optional, default: grok-3
Ollama
Connects to a local Ollama server via /api/chat. No API key required. Ideal for local development and air-gapped deployments.
export TURN_LLM_PROVIDER=ollama
export OLLAMA_MODEL=llama3 # optional, default: llama3
export OLLAMA_HOST=http://localhost:11434 # optional, default: http://localhost:11434
turn run my_agent.tn
TIP
Ollama supports a wide range of open-weight models. Run ollama pull llama3 (or any model) before starting your Turn program.
The $env: Template Syntax
Wasm drivers use $env:VARIABLE_NAME placeholders in their HTTP config output. The Turn Host resolves these before executing the request:
{
"url": "https://api.openai.com/v1/chat/completions",
"method": "POST",
"headers": {
"Authorization": "Bearer $env:OPENAI_API_KEY",
"Content-Type": "application/json"
},
"body": { "model": "$env:OPENAI_MODEL", "messages": [...] }
}
After substitution, the HTTP request the Host sends uses your real credentials, but the .wasm file itself never contains or reads them.
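The substitution pass is easy to sketch. The function below is a hypothetical stand-in for Turn's actual host logic: it expands $env:NAME placeholders in a string, assuming names consist only of ASCII uppercase letters, digits, and underscores, and expands unset variables to the empty string:

```rust
use std::env;

// Hypothetical sketch of the host-side substitution pass (not Turn's real
// implementation). Expands every `$env:NAME` occurrence in `input`, where
// NAME is a run of ASCII uppercase letters, digits, and underscores.
fn substitute_env(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(idx) = rest.find("$env:") {
        // Copy everything before the placeholder verbatim.
        out.push_str(&rest[..idx]);
        let after = &rest[idx + "$env:".len()..];
        // The variable name ends at the first character outside [A-Z0-9_].
        let end = after
            .find(|c: char| !(c.is_ascii_uppercase() || c.is_ascii_digit() || c == '_'))
            .unwrap_or(after.len());
        let name = &after[..end];
        // Unset variables expand to the empty string in this sketch.
        out.push_str(&env::var(name).unwrap_or_default());
        rest = &after[end..];
    }
    out.push_str(rest);
    out
}
```

Running this over every string value in the driver's HTTP config (URL, headers, and body) yields the final request, so secrets exist only in the host process, never in the sandbox.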
Writing Your Own Provider
A Turn inference driver is a Rust cdylib compiled to wasm32-unknown-unknown. It must export exactly three C-ABI functions:
use serde_json::{json, Value};
// Memory management - the Turn host calls this to allocate space for JSON strings
#[no_mangle]
pub extern "C" fn alloc(len: u32) -> u32 {
let mut buf: Vec<u8> = Vec::with_capacity(len as usize);
let ptr = buf.as_mut_ptr();
std::mem::forget(buf);
ptr as usize as u32
}
// Pass 1: Turn Request → HTTP Config
// Input: JSON string (Turn Inference Request)
// Output: JSON string (HTTP Request Config with $env: templates)
// Returns: packed u64 = (ptr << 32) | len
#[no_mangle]
pub unsafe extern "C" fn transform_request(ptr: u32, len: u32) -> u64 {
let input = read_string(ptr, len);
let req: Value = serde_json::from_str(&input).unwrap();
let prompt = req["params"]["prompt"].as_str().unwrap_or("");
let schema = &req["params"]["schema"];
let body = json!({
"model": "$env:MY_MODEL",
"messages": [{"role": "user", "content": prompt}],
"response_format": { "type": "json_object", "schema": schema }
});
let config = json!({
"url": "https://my-llm-provider.com/v1/completions",
"method": "POST",
"headers": { "Authorization": "Bearer $env:MY_API_KEY" },
"body": body
});
pack_string(config.to_string())
}
// Pass 2: HTTP Response → Turn Result
// Input: JSON string (HTTP response: { status, headers, body })
// Output: JSON string (JSON-RPC result: { jsonrpc, id, result } or { error })
#[no_mangle]
pub unsafe extern "C" fn transform_response(ptr: u32, len: u32) -> u64 {
let input = read_string(ptr, len);
let http_res: Value = serde_json::from_str(&input).unwrap();
if http_res["status"].as_u64().unwrap_or(0) != 200 {
return pack_string(json!({
"jsonrpc": "2.0", "id": 1,
"error": format!("HTTP {}: {}", http_res["status"], http_res["body"])
}).to_string());
}
// Parse provider-specific response format
let response: Value = serde_json::from_str(
http_res["body"].as_str().unwrap_or("{}")
).unwrap_or(json!({}));
let content = response["choices"][0]["message"]["content"].as_str().unwrap_or("{}");
let result: Value = serde_json::from_str(content).unwrap_or(json!(content));
pack_string(json!({ "jsonrpc": "2.0", "id": 1, "result": result }).to_string())
}
// ── Helpers ──────────────────────────────────────────────────────────────────
unsafe fn read_string(ptr: u32, len: u32) -> String {
let buf = Vec::from_raw_parts(ptr as *mut u8, len as usize, len as usize);
String::from_utf8_lossy(&buf).into_owned()
}
fn pack_string(s: String) -> u64 {
let len = s.len() as u64;
let mut buf = s.into_bytes();
let ptr = buf.as_mut_ptr() as u64;
std::mem::forget(buf);
(ptr << 32) | len
}
Build it:
# Install the Wasm target if you haven't already
rustup target add wasm32-unknown-unknown
# Build
cargo build --target wasm32-unknown-unknown --release
# The driver is at:
ls target/wasm32-unknown-unknown/release/my_provider.wasm
# Use it
export TURN_LLM_PROVIDER=custom
TIP
Because providers are pure JSON transformers, they can target any HTTP API: a local Ollama server, Llama.cpp, a private inference cluster, or a custom gateway. The Wasm model means the Turn community can build and distribute drivers for any provider without touching the core VM.
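For instance, a driver targeting a local Llama.cpp server through its OpenAI-compatible endpoint might emit a config like the following (the URL, port, and model variable name are illustrative assumptions, not an official Turn provider):

```json
{
  "url": "http://localhost:8080/v1/chat/completions",
  "method": "POST",
  "headers": { "Content-Type": "application/json" },
  "body": {
    "model": "$env:LLAMACPP_MODEL",
    "messages": [{ "role": "user", "content": "..." }]
  }
}
```

Because no API key is involved, the config needs no $env: credential templates at all; the host simply forwards the request to the local server.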