Resume, Rollback, Fork, and Replay

Reading Contract: Use this chapter to read resume, rollback, fork, and replay as reconstruction problems. Track the durable facts that let Codex rebuild prompt state after time or branching.

Chapter 6 explained compaction as a checkpoint protocol. That protocol only pays off if the system can later reconstruct effective context. Codex has to resume old threads, roll back recent user turns, fork work for child agents, and replay enough history for clients. The runtime cannot trust an in-memory vector after process exit or branching. It must rebuild from rollout evidence.

The reconstruction code is one of the clearest examples of Codex’s context discipline. It scans rollout items from newest to oldest to find the latest surviving replacement-history checkpoint and resume metadata, then replays the surviving suffix forward. Rollback markers are applied while scanning, so the rebuilt history reflects effective state rather than raw event order.

By the end of this chapter, you should understand why durable context is not the same as durable transcript.

This chapter maps to RolloutReconstruction, reverse reconstruction, legacy compaction handling, user-turn rollout positions, and fork-turn positions.

Rollout reconstruction scans backward through checkpoints, rollback markers, agent envelopes, user turns, and assistant turns. Resume and fork are projection problems: the raw rollout stays append-only while reconstruction chooses the newest surviving checkpoint, applies rollback markers, and respects agent-triggered boundaries.

Reconstruction Has Three Outputs

Reconstruction returns three things:

Output	Why it matters
Rebuilt history	The model-visible ledger for future turns.
Previous turn settings	The metadata needed to decide model/realtime diffs on resume.
Reference context item	The baseline for settings diffs, or an explicit cleared state.

The last two outputs are easy to miss. Resume is not just “load messages.” It must also recover the context baseline used by the diff system from Chapter 4. If compaction cleared that baseline, resume must preserve the clearing.
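To make the three outputs concrete, here is a minimal sketch of a result type. The class names, field names, and the `usable_baseline` helper are assumptions made for illustration, not Codex's actual types. The point is that a cleared baseline is a recorded fact, distinct from a value that happens to be present.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass(frozen=True)
class ReferenceContext:
    """Baseline for settings diffs, or an explicit cleared state (hypothetical shape)."""
    value: Optional[dict[str, Any]] = None
    cleared: bool = False

    def usable_baseline(self) -> Optional[dict[str, Any]]:
        # A cleared baseline stays cleared, even if a value is also present.
        return None if self.cleared else self.value


@dataclass
class Reconstruction:
    """The three outputs of reconstruction, bundled together (hypothetical shape)."""
    history: list[Any] = field(default_factory=list)        # model-visible ledger
    previous_turn_settings: Optional[dict[str, Any]] = None  # for diffs on resume
    reference_context: ReferenceContext = field(default_factory=ReferenceContext)


resumed = Reconstruction(
    history=["user: fix the build", "assistant: done"],
    previous_turn_settings={"model": "example-model"},
    reference_context=ReferenceContext(value={"cwd": "/work"}, cleared=True),
)
```

In this sketch, a caller that only loaded `history` would lose both the settings and the fact that the baseline was cleared.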

The reverse scan is efficient because old rollout items become irrelevant once a newer surviving replacement-history checkpoint and needed metadata are known.

Reading the Rollout Layout

A rollout is an append-only log of structured items. A typical section contains initial context, user turns, assistant turns, tool observations, checkpoints, rollback markers, and fork boundaries.

The scan walks right to left and short-circuits at the latest surviving checkpoint. In this example that is CP between u3 and u4, so older items (i_c, u1, a1, t1, u2, a2) become irrelevant for the rebuilt history. The rollback marker RB is interpreted while scanning.
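The short-circuit can be sketched on a separate, made-up rollout. The item kinds and labels below are illustrative assumptions, and the sketch ignores rollback markers, so every checkpoint counts as surviving.

```python
from typing import Optional

# Hypothetical rollout, oldest item first. Each item is (kind, label).
rollout = [
    ("context", "initial context"),
    ("user", "u1"),
    ("assistant", "a1"),
    ("checkpoint", "CP"),
    ("user", "u2"),
    ("assistant", "a2"),
]


def latest_checkpoint_index(items) -> Optional[int]:
    """Walk from the newest item to the oldest and stop at the first checkpoint."""
    for index in range(len(items) - 1, -1, -1):
        if items[index][0] == "checkpoint":
            return index
    return None


def live_suffix(items):
    """Return the checkpoint and everything after it, or all items if there is none."""
    index = latest_checkpoint_index(items)
    return list(items) if index is None else items[index:]
```

Here `live_suffix(rollout)` starts at `CP`; the three items before it never influence the rebuilt history.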

Reverse Scan Algorithm

The algorithm is small but careful. The pseudocode below preserves the actual shape:

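// Pseudocode -- reverse reconstruction.
rebuiltSuffix      = []
previousSettings   = None
referenceContext   = None
referenceCleared   = false
sawCheckpoint      = false
turnsToSkip        = 0

for item in reversed(rollout):
    if item.isRollbackMarker():
        // "Drop the newest N user turns": skip the next N
        // user-turn segments the reverse scan meets.
        turnsToSkip += turnsIn(item.scope)
        continue

    if turnsToSkip > 0:
        if item.isUserTurnStart(): turnsToSkip -= 1
        continue

    if not sawCheckpoint and item.isReplacementCheckpoint():
        rebuiltSuffix = item.replacementBase + rebuiltSuffix
        sawCheckpoint = true
    else if item.isPreviousSettings():
        previousSettings = previousSettings or item.value
    else if item.isReferenceContext():
        referenceContext = referenceContext or item.value
        if item.cleared: referenceCleared = true
    else if not sawCheckpoint:
        rebuiltSuffix.prepend(item)

    // Stop once the base and both pieces of metadata are known.
    if sawCheckpoint and previousSettings
       and (referenceContext or referenceCleared):
        break

if referenceCleared:
    referenceContext = None

return rebuiltSuffix, previousSettings, referenceContext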

Notice three properties. First, the loop terminates as soon as it has the information it needs. Second, the cleared flag wins over an earlier baseline even if the baseline appears later in reverse order. Third, rollback markers are applied while building, not after, so the suffix does not have to be edited twice.
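The first two properties can be checked with a small runnable sketch. It assumes a hypothetical `Item` record with a `kind` tag, and it leaves rollback handling out to stay short. It is an illustration of the shape described above, not the Codex implementation.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Item:
    kind: str               # "checkpoint", "settings", "reference", or "message"
    value: Any = None
    cleared: bool = False   # only meaningful when kind == "reference"


def reconstruct(rollout: list[Item]):
    newest_first: list[Any] = []     # suffix items, collected in reverse order
    base: list[Any] = []             # replacement history from the checkpoint
    previous_settings: Optional[Any] = None
    reference: Optional[Any] = None
    reference_cleared = False
    saw_checkpoint = False

    for item in reversed(rollout):
        if item.kind == "checkpoint":
            if not saw_checkpoint:
                base = list(item.value)
                saw_checkpoint = True
        elif item.kind == "settings":
            if previous_settings is None:
                previous_settings = item.value
        elif item.kind == "reference":
            if item.cleared:
                reference_cleared = True
            elif reference is None:
                reference = item.value
        elif not saw_checkpoint:
            newest_first.append(item.value)

        # Early exit: the base and both pieces of metadata are known.
        if (saw_checkpoint and previous_settings is not None
                and (reference is not None or reference_cleared)):
            break

    if reference_cleared:
        reference = None  # the clearing wins over any baseline that was read
    return base + newest_first[::-1], previous_settings, reference


rollout = [
    Item("message", "u1"),
    Item("message", "a1"),
    Item("reference", {"cwd": "/old"}),
    Item("settings", {"model": "m1"}),
    Item("checkpoint", ["summary of u1/a1"]),
    Item("reference", cleared=True),
    Item("message", "u2"),
    Item("message", "a2"),
]
history, settings, reference = reconstruct(rollout)
```

In this example the scan stops at the `settings` item. The older `reference` item and the two oldest messages are never read, and the cleared baseline is preserved as `None`.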

Rollback Changes the Meaning of the Past

Rollback markers do not delete raw rollout records. They change which user-turn segments count in effective history. While scanning in reverse, Codex interprets “drop the newest N user turns” as “skip the next N finalized user-turn segments.” That lets reconstruction keep raw evidence while rebuilding the state the user asked for.
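A minimal sketch of that reinterpretation follows. It assumes a simplified rollout of `(kind, payload)` pairs, where a rollback marker's payload is the number of user turns to drop and every turn starts with a `"user"` item. These shapes are assumptions for illustration only.

```python
def effective_history(rollout):
    """Project an append-only rollout (oldest first) into effective history."""
    kept_newest_first = []
    turns_to_skip = 0

    for kind, payload in reversed(rollout):
        if kind == "rollback":
            # Everything older than the marker is affected: skip N turns.
            turns_to_skip += payload
            continue
        if turns_to_skip > 0:
            # Still inside a rolled-back turn; the user item closes it.
            if kind == "user":
                turns_to_skip -= 1
            continue
        kept_newest_first.append((kind, payload))

    return kept_newest_first[::-1]


raw = [
    ("user", "u1"), ("assistant", "a1"),
    ("user", "u2"), ("assistant", "a2"),
    ("rollback", 1),
    ("user", "u3"), ("assistant", "a3"),
]
projected = effective_history(raw)
```

The raw list still contains `u2` and `a2`; only the projection omits them.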

The same logic appears in rollout truncation helpers. User-message positions are tracked while applying rollback markers. Fork-turn positions include real user messages and assistant inter-agent envelopes that trigger turns; rollback removes stale suffixes based on instruction-turn boundaries.

This is a deliberate design choice: rollback is an event in the ledger, not a destructive edit to the log.

Naive approach	Codex approach
Mutate the log: delete rolled-back items.	Append a marker and apply it during reconstruction.
Resume reads exactly what is on disk.	Resume reads the rollout and projects it.
Every audit shows the surviving state, not why it survived.	Every audit can replay the rollback decision.
Two clients can disagree if one read before the deletion and the other after.	Both clients see the same append-only log.

The Codex approach pays a small replay cost in exchange for full auditability.

Fork Boundaries Are Not Only Human Messages

Multi-agent work complicates context boundaries. A child agent may start from an assistant inter-agent envelope rather than a normal user message. Codex’s fork turn logic treats certain assistant envelopes as boundaries when they trigger a turn. That preserves the semantic unit of delegated work.

// Pseudocode -- illustrates effective fork truncation.
for item in rollout:
    if item.isRollbackMarker():
        removeRolledBackInstructionTurns()
    if item.isRealUserMessage()
       or item.isTriggeringAgentEnvelope():
        rememberForkBoundary(item.position)
return suffixStartingAtNthBoundaryFromEnd()

The pattern matters outside multi-agent systems too. If your runtime has more than one way to start work, your context truncation must understand all of them.

The boundary rule is concise:

AGENT_ENVELOPE is an assistant message that triggered a child agent turn. It is treated as a boundary because the suffix starting there is its own unit of work. A user message is also a boundary, but it is not the only kind.
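The boundary rule and the fork pseudocode above can be sketched as runnable code. The sketch assumes `(kind, payload)` items, an `"agent_envelope"` kind that stands only for envelopes that triggered a turn, and rollback payloads that count instruction turns of either kind. All of these are illustrative assumptions.

```python
def fork_suffix(rollout, n):
    """Return effective items starting at the n-th boundary from the end."""
    effective = []    # items that survive rollback, oldest first
    boundaries = []   # indices into `effective` where an instruction turn starts

    for kind, payload in rollout:
        if kind == "rollback":
            # Drop the newest instruction turns and forget their boundaries.
            for _ in range(min(payload, len(boundaries))):
                del effective[boundaries.pop():]
            continue
        if kind in ("user", "agent_envelope"):
            boundaries.append(len(effective))
        effective.append((kind, payload))

    if n <= 0:
        return []
    start = boundaries[-n] if n <= len(boundaries) else 0
    return effective[start:]


rollout = [
    ("user", "u1"), ("assistant", "a1"),
    ("agent_envelope", "e1"), ("assistant", "a2"),
    ("user", "u2"), ("assistant", "a3"),
    ("rollback", 1),
]
child_context = fork_suffix(rollout, 1)
```

After the rollback removes the `u2` turn, the newest boundary is the envelope `e1`, so the fork starts there rather than at a human message.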

Legacy Compaction

The reconstruction code still handles legacy compaction records without replacement history. It rebuilds compacted history from user messages and the stored compaction message, clears the reference baseline, and accepts a less ideal prompt shape. That backward-compatibility path is instructive: the newer checkpoint protocol exists because summary-only compaction is not enough.

This is also why the source distinguishes durable rollout evidence from live history. The live history can improve over time, while rollout replay preserves compatibility with older records.

The legacy decision rule is small but worth naming:

The branch never silently produces an inferior prompt; the shape difference is explicit, recorded, and visible to telemetry.
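One way to picture the legacy branch is a function that chooses the base history and reports which shape it produced. The record fields (`replacement_history`, `summary`) and the returned shape labels are hypothetical names used only for this sketch.

```python
def base_from_compaction(record, prior_items):
    """Return (base_history, clear_reference_baseline, shape_label)."""
    replacement = record.get("replacement_history")
    if replacement is not None:
        # Newer protocol: the checkpoint carries its own replacement history.
        return list(replacement), False, "checkpoint"

    # Legacy record: rebuild from user messages plus the stored summary,
    # and clear the reference baseline because it can no longer be trusted.
    user_messages = [item for item in prior_items if item[0] == "user"]
    return user_messages + [("summary", record["summary"])], True, "legacy"


prior = [("user", "u1"), ("assistant", "a1"), ("user", "u2")]
legacy_base, legacy_clear, legacy_shape = base_from_compaction(
    {"summary": "work so far"}, prior
)
```

Returning the shape label alongside the history keeps the difference explicit for whatever logging or telemetry the caller has.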

Apply This

Replay From Evidence. Rebuild context from append-only rollout facts. Treat live memory as a cache, and avoid resume paths that trust stale in-memory state.
Reverse Checkpoint Search. Scan backward to find the newest surviving base. Apply the pattern to event-sourced systems, and do not replay the entire log when a checkpoint can bound the work.
Rollback Marker. Record rollback as an event. Apply markers during reconstruction, and avoid destructive log edits that erase auditability.
Semantic Boundaries. Define user, agent, and fork turn boundaries explicitly. Apply the boundary to every source of work, and reject truncation that only understands human messages.
Legacy Bridge. Keep compatibility paths, but clear unsafe baselines. Prefer correctness over perfect prompt shape, and do not treat old records like new checkpoints.