中文

Part 4

Expose Context Without Losing Control

Clients may render context, but the runtime must remain the source of truth.

Client-Facing Context

Reading Contract: Use this chapter to separate context ownership from context display. Track what TUI, realtime, app-server, token usage, and trace can expose without becoming the source of truth.

Chapter 7 showed that the runtime can reconstruct effective context from rollout evidence. The final layer is exposure. Users and clients need to see context state: token usage, compaction warnings, realtime mode, thread history, and replayed usage when attaching to an existing thread. Codex exposes those facts through the TUI, app-server notifications, realtime context modules, rollout trace, and telemetry. The key rule remains the same: clients render context; the runtime owns it.

This separation prevents a UI from becoming an alternate context manager. The TUI can show remaining context. The app-server can replay token usage. A trace can explain compaction. But the live history ledger and turn envelope stay in core.

By the end of this chapter, you should understand the client surfaces as projections of runtime-owned context state.

Surfaces and Their Owners

Codex exposes context through several surfaces. The table is short but worth internalising:

SurfaceWhat it showsRuntime ownerFailure mode if owner is wrong
TUI token barInput/cached/output tokens, remaining ratio.ContextManager.token_info + display baseline.UI invents its own budget number.
App-server token notificationLive token usage events for connected clients.Token usage events from sampling.Clients infer state from text rendering.
App-server replayRestored token usage update on reattach.Rollout evidence + reattach scope.Replay duplicates events to other subscribers.
Realtime fragmentRealtime mode start/end as model context.TurnContext.modes + settings update.Mode appears in transport but not in prompt.
Rollout traceInstalled checkpoint, compaction reason.Compaction install events.Trace blends compact-input and post-compact prompt.

The right column is the part that matters. Each surface has a precise way it fails when the runtime is not the source of truth.

Token Usage Is a Context Surface

The TUI token usage model separates input, cached input, output, reasoning output, and total tokens. It also computes a remaining percentage against a baseline-adjusted context window. That baseline is a display choice, not the runtime’s only budget rule. The core runtime still uses model info and token usage to trigger compaction.

The same source fact feeds multiple surfaces. That is what keeps the UI honest. The pattern is “fan-out from one fact” rather than “let each consumer compute its own number.” If the TUI bar disagrees with compaction’s threshold, that is a bug, not just a display preference.

App-Server Replay

When a client attaches to an existing thread, the app-server can send a restored token usage update to that connection. The code treats this as lifecycle replay, not a fresh model event. It avoids duplicating persisted usage records and avoids surprising other subscribers with historical updates.

Attribution is careful. If the latest persisted token count has an explicit turn id that still exists in the rebuilt thread, Codex uses it. If turn ids changed during reconstruction, it falls back to the active turn position recorded when the token count appeared.

// Pseudocode -- illustrates replay attribution.
owner = findTurnActiveWhenLatestTokenCountWasPersisted(rollout)
if rebuiltThread.hasTurn(owner.id):
    notify(connection, owner.id, usage)
else:
    notify(connection, rebuiltThread.turnAt(owner.position), usage)

The pattern is subtle: client replay is connection-scoped because it explains history to a new observer; it is not a new runtime event.

This is why the replay event is delivered only to the attaching client. Other subscribers already saw the original event when it was live; sending it again would look like a new model action and confuse their state.

Realtime Is Context, Not Just Transport

Realtime state appears in settings update logic. Starting or ending realtime can emit model-visible guidance. That is correct because realtime changes how the model should interact, not merely how bytes move. If a voice or realtime client changes the interaction contract, the model needs context about that mode.

This reinforces Chapter 2’s envelope idea: client metadata and realtime flags belong in the turn context because they alter the runtime contract.

A small comparison clarifies the move:

TreatmentResult
Realtime as transport flag only.Model keeps sending long-form text while audio client expects turn-by-turn replies.
Realtime as context fragment.Model receives explicit guidance: shorter turns, expect interruption, leave silence.

The correct treatment crosses a layer boundary. A piece of client metadata becomes prompt-visible state because it changes meaning at the model level, not just at the wire level.

Trace Gives Compaction Evidence

Remote compaction records an installed checkpoint payload containing input history and replacement history. That trace boundary is different from the later inference request. The distinction lets reducers and debuggers represent exactly what happened: the provider compacted one history, Codex installed another live history, and future sampling used the updated prompt projection.

Good observability does not just count tokens. It preserves semantic boundaries.

Semantic trace boundary separating compact input, checkpoint installation, future sampling, trace reducers, runtime ledger, and client surfaces
The trace is valuable because it does not collapse compact input, checkpoint installation, and later sampling into one vague event; each phase remains auditable.

Three phases live in the same trace stream: compact, install, sample. A naive trace would collapse them into one “compaction event” and lose the ability to audit each phase separately.

Apply This

  1. Runtime-Owned Display. Let clients render context facts, but do not let them own those facts. Derive UI state from runtime events, and keep UI-only context from replacing model-visible state.
  2. Connection-Scoped Replay. Replay historical context facts only to the attaching observer. Use the pattern for resumed clients, and mark replay events so they do not look like new live events.
  3. Attribution Fallback. Attribute restored usage by id first and position second. Apply the rule to rebuilt timelines, and keep regenerated ids from breaking UI state.
  4. Mode as Context. Treat interaction modes as model-visible context when they change behavior. Diff mode state, and do not hide behavior-changing transport flags from the prompt.
  5. Semantic Trace Boundary. Trace context rewrites as install events. Separate compaction input from later sampling input, and avoid observability that collapses distinct phases into one blob.