Chapter 5: Threads, Sessions, and Durable State
Reading Contract: Use this chapter to understand durable work. Track thread identity, queues, history, rollout, and session ownership as separate answers to state, progress, and resumability.

Source boundary: named files, types, functions, tests, schemas, and request or event shapes are verified source only where this chapter links to the pinned Codex commit or its Source Map. Broader architecture terms such as “runtime”, “owner”, “projection”, and “contract” are surrounding contract inference from those visible anchors, not claims about OpenAI service internals.
Chapter 4 ended at the protocol boundary: clients can submit operations, receive events, and exchange model items without knowing how the runtime works. This chapter steps behind that boundary and shows the durable machinery that turns those messages into a live agent thread.
Problem: an agent must be resumable, forkable, interruptible, and observable while still feeling like one continuous conversation.
Thesis: Codex achieves that by separating the live runtime handle from the durable thread record.
Mental model: a thread is the long-lived work ledger; a session is the running process currently serving it.
The Runtime Stack
Codex is not organized around a single “chat object.” The runtime is a stack of handles, each narrowing the authority of the layer above it.
ThreadManager owns the set of live threads and the shared services needed to create new sessions. CodexThread is the stable external handle clients use after a thread is created or resumed. Codex is intentionally narrow: it exposes a submission path and an event path. Session is the large interior object that holds resolved configuration, persistent identity, active turn state, mailbox state, realtime state, goals, review state, and service handles.
This separation matters because clients should not be able to accidentally reach into the scheduler. They submit an operation. The session decides whether that operation starts a new task, steers an active one, records pending input, updates durable metadata, or shuts the thread down.
The First Event Is a Contract
The first client-visible event in a newly opened runtime is the session setup event. It carries the thread identity, session identity, working directory, model choice, provider identifier, approval policy, permission profile, initial messages, and other resolved settings.
That ordering is deliberate. Some startup work can continue after the session is visible: MCP servers may initialize, plugins may report delayed capability, and prewarm work may run in the background. The setup event gives every client a deterministic anchor before those later events arrive.
// Pseudocode - illustrative pattern.
function open_runtime_thread(request):
resolved = resolve_configuration(request)
live_store = open_thread_persistence(resolved.thread_identity)
session = build_session(resolved, live_store)
emit_event(SessionConfigured(resolved.public_metadata))
record_initial_history_if_resuming(session)
start_background_capability_initialization(session)
start_submission_loop(session)
return client_thread_handle(session)
The important property is not the exact implementation order. The important property is that clients can treat setup as the beginning of the event stream and can render a resumed thread without waiting for every optional capability to finish loading.
Three Histories
Most mistakes in agent architecture begin by treating “history” as one thing. Codex keeps three histories with different purposes.
| History | Primary owner | What it answers |
|---|---|---|
| Model-visible context | ContextManager | What should the next model request see? |
| Rollout JSONL | thread persistence | What happened, in replay order, so this thread can resume or fork? |
| SQLite projections | state layer | What can be queried efficiently without replaying every line? |
The model-visible context is normalized into model items. It is not a UI transcript and not a raw log. It contains the items that should participate in future inference, filtered by model modality and adjusted by compaction, context injection, tool observations, and pending input.
The rollout file is the replay spine. It records durable runtime facts in order: session metadata, turn context snapshots, model-visible items, event records, compaction records, rollback markers, and other replayable items. It is the source of truth for reconstructing a thread.
The SQLite state is a projection layer. It stores thread metadata, list/search indexes, logs, memory jobs, graph edges, and other query-friendly state. A projection can be repaired or rebuilt from the durable record when the design allows it. The rollout is the ledger; the database is the index and operational state.
Recording One Item May Touch All Three
When a turn records a new item, the runtime has to decide which surfaces should learn about it. A tool result might be model-visible, replayable, useful for client rendering, relevant to analytics, and relevant to trace diagnostics. Those are related, but they are not identical.
// Pseudocode - illustrative pattern.
function record_runtime_item(turn, item):
if item.belongs_in_model_context:
context_manager.append(item.as_model_item)
if item.belongs_in_replay:
rollout.append(item.as_rollout_item)
if item.updates_queryable_state:
sqlite_projection.apply(item.as_projection_delta)
if item.should_be_seen_by_clients:
event_stream.emit(item.as_event)
This design protects replay. A client can disappear and rejoin. A database row can be stale and refreshed. A model request can be rebuilt from normalized context instead of screen text. Durable state remains the common language among those surfaces.
Start, Resume, Fork, Roll Back
Thread operations are variations on the same principle: load or choose a replay prefix, build a session over it, then append future items in the same vocabulary.
Resume is not a string reload. It reconstructs model history and session metadata from durable items. Fork is not a copy of a UI transcript. It selects a coherent prefix and opens a new thread whose future diverges from the source. Rollback is not merely deleting visible messages. It rebuilds the active model context from the surviving durable record, taking compaction and incomplete turns into account.
This is why turn context snapshots are durable. They let a later session know which working directory, permissions, model settings, network policy, memory mode, and environment choices surrounded a turn. Without that context, replay would be text without execution semantics.
Session State Is Layered
The live session holds several kinds of state that should not be collapsed.
| Layer | Lifetime | Examples |
|---|---|---|
| Configuration | session lifetime with controlled updates | model, provider, permissions, working directory, service tier |
| Mutable session state | session lifetime | token usage, current metadata, connector selection, rate limits |
| Active turn state | one in-flight task | cancellation token, tool futures, pending approvals, streamed items |
| Mailbox and pending input | across scheduling boundaries | messages that arrive while another turn is running |
| Durable store handles | thread lifetime | rollout writer, thread metadata, state database handle |
The layering lets Codex answer a subtle question: “Did this operation change the durable thread, the current live turn, the next turn, or only a client view?” Those are different commitments. A runtime that cannot separate them will eventually lose user input, duplicate tool results, or make resume incoherent.
Metadata Is Extracted, Not Invented
Thread lists and search views should not require full replay on every request. Codex therefore maintains metadata projections: title or preview fields, model provider, memory mode, archive state, working directory, git information, created and updated positions, and pagination anchors.
The projection is useful because it is queryable. It is safe because the durable rollout still exists behind it. When a listing row is missing or stale, the system can scan the corresponding rollout head or replay metadata-bearing items to repair the index. That is the durable-state pattern: optimize reads with projections, but keep the event record authoritative.
Apply This
- Keep the live-handle stack explicit, so clients submit operations rather than mutating scheduler internals.
- Separate thread identity from session execution, because one durable thread may have many runtime lifetimes.
- Emit a deterministic setup event before optional startup work, so every client has the same stream anchor.
- Maintain separate model, replay, and query histories instead of forcing one transcript to serve all purposes.
- Implement resume, fork, and rollback over the same replay vocabulary used by normal turns.
Closing
Chapter 5 turns the protocol from Chapter 4 into a living runtime. The thread is durable; the session is active; the turn is scheduled work inside that session. Chapter 6 follows one scheduled turn through the loop where Codex builds context, samples the model, executes tools, handles interruptions, and decides whether the agent is done.
Source Map
| Concept | Source anchor |
|---|---|
| Thread manager boundary | codex-rs/core/src/thread_manager.rs |
| Client-facing thread handle | codex-rs/core/src/codex_thread.rs |
| Queue-pair runtime facade | codex-rs/core/src/session/mod.rs |
| Model-visible history | codex-rs/core/src/context_manager/history.rs |
| Accepted prompt recording | codex-rs/core/src/session/mod.rs |