Memory & Semantic Recall
Turn gives every agent a persistent, semantic memory — a key-value store backed by an HNSW vector index that decays over time like human memory. You write to it with remember, read from it with recall, and the VM enriches your infer calls automatically.
The Problem with Ad-Hoc Memory
In most agent systems, "memory" means either:
- A messages list — unbounded, context-polluting, no retrieval strategy
- A vector database — a separate service to maintain, query, and synchronize
Turn makes memory a first-class component of the process state, stored alongside the agent's environment, context window, and mailbox. It is durable, semantic, and automatic.
remember — Writing to Memory
```
remember("key", value);
```

Stores a value in the agent's persistent memory under the given key. Memory persists across turns, across suspend boundaries, and across process restarts — it is part of the durable agent state.
```
// Store any Turn value
remember("user_name", "Alice");
remember("preferred_style", "concise, bullet-point format");
remember("context_snapshot", current_summary);

// Memory values are indexed semantically via HNSW.
// When you remember a Str, the VM generates an embedding automatically.
remember("last_conversation_topic", "pricing strategy for enterprise tier");
```

When you remember a string value, the Turn VM:
- Stores the key-value pair in the durable WAL
- Generates an embedding vector for the value automatically (via the configured embedding endpoint)
- Inserts the vector into the HNSW semantic index
This embedding happens transparently. You write remember() — the VM handles the intelligence.
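The rule above applies specifically to Str values. A minimal sketch of the distinction — assuming, as the comment above implies, that non-Str values are stored durably but skip the embedding step:

```
// A Str value: written to the WAL and embedded into the HNSW index,
// so it can surface via Semantic Auto-Recall
remember("meeting_note", "Quarterly review moved to Friday");

// A non-string value (assumed behavior): stored durably and readable
// with recall("retry_count"), but no embedding is generated for it
remember("retry_count", 3);
```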
recall — Reading from Memory
```
let value = recall("key");
```

Retrieves the value stored under the given key. Returns null if the key does not exist.
```
let name = recall("user_name");
call("echo", "Hello, " + name);
```
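Because recall returns null for keys that were never written, reads that may precede the first remember are worth guarding. A sketch — the if/else syntax here is assumed, not confirmed by this page:

```
// recall returns null for missing keys (stated above);
// the conditional syntax below is an assumption for illustration
let name = recall("user_name");
if (name == null) {
    // hypothetical fallback when no name has been remembered yet
    name = "there";
}
call("echo", "Hello, " + name);
```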
```
let style = recall("preferred_style");
let result = infer Summary {
    "Summarize this report in " + style + ": " + report_text;
};
```

Semantic Auto-Recall
The most powerful feature of Turn's memory system is Semantic Auto-Recall: before every infer call, the VM automatically queries the HNSW index for the 3 most semantically similar memories to the current prompt.
These memories are prepended to the context payload — so the model receives relevant history without you writing any retrieval logic.
```
// Session 1: an agent learns about a user
remember("alice_contract", "Alice signed a 2-year enterprise contract in Nov 2024");
remember("alice_preference", "Alice prefers email communication over calls");
remember("alice_issue", "Alice escalated a billing dispute in Jan 2025");

// ... time passes, process restarts, world moves on ...

// Session 7: a new turn, brand new context
let result = infer Recommendation {
    "Generate a check-in message for Alice about her account renewal.";
};

// The VM automatically recalled the 3 most relevant memories before calling infer:
//   - "Alice signed a 2-year enterprise contract in Nov 2024"
//   - "Alice prefers email communication over calls"
//   - "Alice escalated a billing dispute in Jan 2025"
// The model knows everything it needs — you wrote zero retrieval code.
```

Every infer call is preceded by a semantic search over all memories in the agent's HNSW index. The top-k results (default: 3) are injected into the system context before the LLM prompt. This costs 50 token-budget units per infer call.
Ebbinghaus Temporal Decay
Turn's memory system implements a variant of the Ebbinghaus Forgetting Curve: memories that have not been accessed recently have lower recall priority in the HNSW search rankings.
This prevents stale, irrelevant memories from polluting every inference call indefinitely. Memories that are frequently accessed remain highly ranked. Memories that have not been accessed are gradually de-prioritized.
The effect is that long-running agents don't accumulate "memory debt" — their recall stays semantically focused on what's actually relevant to recent work.
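For intuition, the classic Ebbinghaus curve models retention as exponential decay in the time since last access. Turn's exact decay function is internal to the VM and not specified on this page, but the standard form it varies is:

```
R(t) = e^{-t/S}
```

where R(t) is recall strength, t is the time since the memory was last accessed, and S is a stability term that grows with access frequency — so frequently touched memories decay slowly, and untouched ones fade from the top of the rankings.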
Memory vs. Context
It's important to understand the distinction between Turn's two forms of working knowledge:
| | Memory | Context |
|---|---|---|
| Lifetime | Permanent (persists across restarts) | Per-turn (cleared on completion) |
| Capacity | Unbounded | Token-budgeted |
| Access | Semantic similarity search | Ordered stack (FIFO with priority) |
| Auto-enrichment | Yes — top-k injected per infer | Yes — always included |
| Written with | remember() | context.append() |
| Read with | recall("key") | Automatic |
Use memory for facts that should persist across conversations, sessions, and process restarts. Use context for the working scratchpad of the current turn.
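A minimal sketch of that split, using context.append from the table above — the exact argument shape of context.append is assumed here:

```
// Durable fact: survives this turn, restarts, and future sessions
remember("alice_plan", "Alice is on the enterprise tier");

// Working scratchpad: only matters for the current turn,
// cleared on completion (assumed single-Str argument)
context.append("Draft so far: " + draft);

let result = infer Reply {
    "Continue the check-in draft for Alice.";
};
```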
Next Steps
- Context Window — How the token-budgeted priority stack works
- The infer Primitive — How memory feeds your inference calls
- Runtime Model — Where memory fits in the full agent state