Chapter 17: MCP: External Tools Without Runtime Entanglement

Reading Contract: Use this chapter to answer one question: how can an external MCP server become available inside a Codex turn without becoming part of the Codex runtime? Track four owners: server configuration, connection lifecycle, model-visible tool shape, and provenance-based routing. Afterward, you should be able to explain why the model sees a sanitized tool name while execution still uses the raw server/tool pair.

Figure: MCP trust plane separating server provenance, sanitized tool names, discovery, routing, elicitation, and structured observations. MCP extends capability without runtime entanglement by separating server provenance, transport, discovery, normalized specs, and routing.

Source boundary: named files, types, functions, schemas, request shapes, and event shapes count as verified against source only where this chapter links to the pinned Codex commit or to this chapter’s Source Map. The claim that MCP is a “trust plane” is an inference drawn from the contracts around those visible owners. This chapter does not claim to know provider internals or private hosted-app policy.

Chapter 16 ended Part IV by treating the terminal UI as a client over the same runtime contract used by other clients. This chapter moves one step outward. If clients can observe and steer threads without owning the runtime, external tools must also be able to join a turn without becoming trusted runtime code.

You are here: clients already observe threads through typed events and send decisions back through controlled requests.

Problem: external tools need to be available to the model, but server transport, authentication, naming, and failures cannot leak into the core turn loop.

Mental model: MCP is a four-stage boundary: define effective servers, start clients, expose normalized tool specs, then route calls by stored provenance.

Calling MCP “just more tools” erases the important part. A built-in shell tool has a handler inside Codex. An MCP tool is discovered from a server that may run over stdio, HTTP, an executor-backed process, or an in-process adapter. The model should not choose among those transports. The turn loop should not treat that external server as if it were compiled into the runtime.

The architecture therefore splits one apparent capability into several owned records:

Layer	Owner	What must stay stable
Server configuration	`codex-mcp` config and effective server merge	transport, OAuth state, sandbox policy, built-ins, and plugin attribution
Connection lifecycle	`McpConnectionManager`	startup status, managed clients, cancellation, and elicitation state
Model-visible projection	tool normalization and core exposure	safe names, deduplicated namespaces, direct/deferred tool lists
Execution routing	provenance in `ToolInfo`	raw server name, raw tool name, shaped observation, and error boundary

The invariant is simple: Codex may show the model a normalized function, but the model-visible string never becomes the source of routing authority.

1. MCP Creates A Provenance Boundary

MCP has three identities that must not be collapsed.

Identity	Who uses it	Why it exists
Raw server identity	MCP client and connection manager	Keeps the real server, transport, and tool namespace addressable.
Model-visible name	model prompt and tool schema	Gives the model a safe identifier that fits tool-call naming limits.
Provenance record	runtime router and audit trail	Maps the model-visible call back to the owning server and raw tool.

The separation is not cosmetic. External servers can choose colliding names, use characters awkward for tool calling, or expose tools whose origin matters for approval, sandboxing, and user explanation. Codex therefore treats naming as an adaptation step.

The source makes this boundary visible in two places. McpConfig stores long-lived MCP runtime settings such as OAuth state location, approval policy, sandbox executable, configured servers, built-ins, and plugin capability summaries. ToolInfo then keeps both sides of the name split: server_name and the raw tool record for protocol calls, plus callable_namespace and callable_name for the model-facing declaration.

At the shape level, a discovered tool record starts like this:

{
  "server_name": "github",
  "tool": {
    "name": "create_pull_request",
    "input_schema": { "type": "object" }
  },
  "callable_namespace": "GitHub",
  "callable_name": "create_pull_request",
  "connector_id": "github"
}

After normalization, the model may see a shorter or hashed callable_namespace / callable_name, especially when raw identities collide or exceed the API length limit. The raw tool.name remains available for execution. normalize_tools_for_model() is the verified source for that split: it sanitizes model-visible parts, adds hash suffixes when needed, sorts by raw identity, and writes the final callable fields without discarding raw metadata.
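The normalization steps described above can be sketched in a few lines. This is a minimal illustration, not the body of normalize_tools_for_model(): the ToolRecord struct, the 64-character limit, the allowed character set, and the use of the standard library's DefaultHasher for the suffix are all assumptions made for the example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical length limit for model-visible names (assumed for this sketch).
const MAX_NAME_LEN: usize = 64;

/// Simplified stand-in for a discovered tool record.
#[derive(Debug, Clone)]
pub struct ToolRecord {
    pub server_name: String,   // raw identity, used for dispatch
    pub raw_tool_name: String, // raw identity, used for dispatch
    pub callable_name: String, // model-visible, written by normalization
}

/// Replace every character outside [A-Za-z0-9_-] with '_'.
fn sanitize(raw: &str) -> String {
    raw.chars()
        .map(|c| if c.is_ascii_alphanumeric() || c == '_' || c == '-' { c } else { '_' })
        .collect()
}

/// Eight hex digits derived from the raw identity, used as a suffix.
fn short_hash(server: &str, tool: &str) -> String {
    let mut hasher = DefaultHasher::new();
    (server, tool).hash(&mut hasher);
    format!("{:08x}", hasher.finish() as u32)
}

pub fn normalize(mut tools: Vec<ToolRecord>) -> Vec<ToolRecord> {
    // Sort by raw identity so the result does not depend on discovery order.
    tools.sort_by(|a, b| {
        (&a.server_name, &a.raw_tool_name).cmp(&(&b.server_name, &b.raw_tool_name))
    });
    let mut taken: HashSet<String> = HashSet::new();
    for tool in tools.iter_mut() {
        let mut name = format!("{}__{}", sanitize(&tool.server_name), sanitize(&tool.raw_tool_name));
        // Add a hash suffix only when the name is too long or already taken.
        if name.len() > MAX_NAME_LEN || taken.contains(&name) {
            let suffix = short_hash(&tool.server_name, &tool.raw_tool_name);
            name.truncate(MAX_NAME_LEN - suffix.len() - 1); // safe: name is ASCII
            name = format!("{}_{}", name, suffix);
        }
        taken.insert(name.clone());
        tool.callable_name = name; // raw fields are never overwritten
    }
    tools
}
```

In this sketch, two tools named a.b and a_b on the same server sanitize to the same string, so the second one in raw sort order receives the suffix while both keep their raw names for execution.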

1.1 Discovery Is Not Dispatch

Discovery produces the model-facing catalog. Dispatch uses the remembered origin. The lifecycle is:

effective server
  -> managed MCP client
  -> listed raw tools
  -> normalized ToolInfo records
  -> model-visible tool declarations
  -> tool call resolved back to (server, raw tool)
  -> structured CallToolResult returned to Codex

McpConnectionManager::list_all_tools() aggregates listed tools from managed clients and normalizes them at the manager return boundary. McpConnectionManager::call_tool() takes server and tool separately, checks the per-server tool filter, calls the raw MCP tool name, and converts MCP content into Codex’s protocol CallToolResult.

That shape prevents a common bug: reconstructing provenance by parsing a public tool name. A name such as github__create_pull_request may be useful to the model, but the real route is the stored server/tool pair.
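A small sketch makes the difference between lookup and parsing concrete. The Router and Provenance types here are hypothetical stand-ins, not Codex types; the point is only that resolution reads a stored record and never splits the public string.

```rust
use std::collections::HashMap;

/// Hypothetical record of where a model-visible tool really lives.
#[derive(Debug, Clone, PartialEq)]
pub struct Provenance {
    pub server_name: String,
    pub raw_tool_name: String,
}

/// Hypothetical routing table keyed by model-visible name.
#[derive(Default)]
pub struct Router {
    by_callable: HashMap<String, Provenance>,
}

impl Router {
    /// Remember the origin at discovery time.
    pub fn register(&mut self, callable: &str, server: &str, raw_tool: &str) {
        self.by_callable.insert(
            callable.to_string(),
            Provenance {
                server_name: server.to_string(),
                raw_tool_name: raw_tool.to_string(),
            },
        );
    }

    /// Resolve by lookup only; the callable string is treated as opaque.
    pub fn resolve(&self, callable: &str) -> Result<&Provenance, String> {
        self.by_callable
            .get(callable)
            .ok_or_else(|| format!("unknown tool: {}", callable))
    }
}
```

If a server were named my__server and its tool do__thing, splitting the public name on the first separator would produce the server my, which does not exist. The stored record returns the correct pair regardless of how the public name was built.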

1.2 Direct And Deferred Exposure Protect Context

Tool discovery can create too many declarations for a single prompt. Codex therefore separates the complete MCP catalog from the set inserted directly into the model request. build_mcp_tool_exposure() returns direct_tools and optional deferred_tools. The direct set can keep explicitly enabled hosted app tools visible while the broader MCP catalog moves behind search/deferred loading.

Pressure	Simpler approach that fails	Codex mechanism	Invariant protected
Many discovered tools	dump every schema into every request	direct/deferred exposure	prompt budget and stable tool surface
Hosted app access differs by account	expose every directory tool	filter by connector IDs and app-tool enablement	only accessible hosted tools become direct tools
Tool names collide	trust raw names	sanitize and hash callable parts	model-visible names remain unique
Raw metadata is needed for execution	overwrite raw names with public names	keep `ToolInfo.tool.name` and `server_name`	dispatch keeps provenance

This is why MCP discovery is a runtime projection, not a registry mutation. The complete capability set can exist behind the boundary while only a safe subset enters the model-visible view.
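The direct/deferred split can be illustrated with a simple partition. The chapter does not state the criteria build_mcp_tool_exposure() applies, so the threshold, the pinned flag, and the type names below are assumptions chosen only to show the shape of the result.

```rust
/// Simplified stand-in for a normalized tool.
#[derive(Debug, Clone, PartialEq)]
pub struct Tool {
    pub callable_name: String,
    /// Hypothetical flag, e.g. an explicitly enabled hosted app tool.
    pub pinned: bool,
}

/// Mirrors the shape described in the text: a direct set plus an optional deferred set.
pub struct Exposure {
    pub direct_tools: Vec<Tool>,
    pub deferred_tools: Option<Vec<Tool>>,
}

/// Assumed policy: expose everything while the catalog is small; otherwise
/// keep pinned tools direct and move the rest behind deferred loading.
pub fn build_exposure(catalog: Vec<Tool>, direct_limit: usize) -> Exposure {
    if catalog.len() <= direct_limit {
        return Exposure { direct_tools: catalog, deferred_tools: None };
    }
    let (direct, deferred): (Vec<Tool>, Vec<Tool>) =
        catalog.into_iter().partition(|tool| tool.pinned);
    Exposure { direct_tools: direct, deferred_tools: Some(deferred) }
}
```

The catalog is never mutated to hide tools. The function only decides which part of it enters the current request, which is what makes exposure a projection.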

Figure: MCP and client tools passing through canonical names, provenance, direct spec, deferred tool search, unavailable placeholders, and route-back handler lookup. Tool provenance keeps discovery, model-visible naming, deferred exposure, and route-back handler lookup separate.

2. Server Lifecycle Belongs Behind The Boundary

The connection manager is deliberately more than a map of functions. Its source comment says it owns running clients, startup status events, server origin metadata, aggregated tools/resources/templates, tool routing, and the public manager API used by core. The fields on McpConnectionManager match that statement: managed clients, server metadata, hosted app enablement, elicitation requests, and a startup cancellation token.

Startup is designed to report status rather than block the whole runtime indefinitely. During McpConnectionManager::new(), each enabled server emits Starting, resolves to Ready, Cancelled, or Failed, and then contributes to a startup-complete summary. Required server failures can later be queried through required_startup_failures().

That lifecycle protects an availability invariant:

optional server fails
  -> startup status records failure
  -> unavailable tools are absent or stale
  -> ordinary turn processing can continue

required server fails
  -> required_startup_failures reports a concrete blocker
  -> caller can stop or ask the user to fix auth/config
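The two flows above reduce to one rule: only a failed required server blocks the caller. The sketch below models that rule with hypothetical types. The status names follow the text, but the struct layout, the function signatures, and the choice to treat only Failed as a blocker are assumptions.

```rust
/// Status names follow the chapter; the payload on Failed is assumed.
#[derive(Debug, Clone, PartialEq)]
pub enum StartupStatus {
    Starting,
    Ready,
    Cancelled,
    Failed(String),
}

/// Hypothetical per-server startup record.
pub struct ServerStartup {
    pub name: String,
    pub required: bool,
    pub status: StartupStatus,
}

/// Collect (server, error) pairs for required servers that failed.
/// Optional failures are recorded in status but never returned here.
pub fn required_failures(servers: &[ServerStartup]) -> Vec<(String, String)> {
    servers
        .iter()
        .filter(|server| server.required)
        .filter_map(|server| match &server.status {
            StartupStatus::Failed(err) => Some((server.name.clone(), err.clone())),
            _ => None,
        })
        .collect()
}

/// Ordinary turn processing may continue when no required server failed.
pub fn can_continue(servers: &[ServerStartup]) -> bool {
    required_failures(servers).is_empty()
}
```

Keeping the optional failure in the status record, rather than dropping it, is what lets a client show the failure without the runtime treating it as fatal.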

2.1 Elicitation Is A Runtime Request, Not Hidden Prompt Text

Some MCP operations need more user or client input. The connection manager holds an ElicitationRequestManager, lets callers update approval policy and permission profile, and exposes resolve_elicitation(). That makes elicitation a structured runtime path. It is not a license for an external server to silently invent missing user decisions or smuggle instructions into the model prompt.
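One way to picture a structured elicitation path is a table of pending requests that can only be closed by an explicit outcome. The types below are hypothetical and far simpler than ElicitationRequestManager. The accept, decline, and cancel outcomes follow the MCP elicitation result actions, while the id scheme and return types are assumptions.

```rust
use std::collections::HashMap;

/// Outcomes modeled on the MCP elicitation actions.
#[derive(Debug, Clone, PartialEq)]
pub enum ElicitationOutcome {
    Accept(String), // content supplied by the user or client
    Decline,
    Cancel,
}

/// Hypothetical table of open elicitation requests.
#[derive(Default)]
pub struct PendingElicitations {
    next_id: u64,
    pending: HashMap<u64, (String, String)>, // id -> (server, message)
}

impl PendingElicitations {
    /// Record a request from a server and hand back an id for the client.
    pub fn request(&mut self, server: &str, message: &str) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        self.pending.insert(id, (server.to_string(), message.to_string()));
        id
    }

    /// Close a request. Only an explicit Accept yields data; nothing is invented.
    pub fn resolve(&mut self, id: u64, outcome: ElicitationOutcome) -> Result<Option<String>, String> {
        self.pending
            .remove(&id)
            .ok_or_else(|| format!("no pending elicitation with id {}", id))?;
        match outcome {
            ElicitationOutcome::Accept(content) => Ok(Some(content)),
            ElicitationOutcome::Decline | ElicitationOutcome::Cancel => Ok(None),
        }
    }
}
```

A cancelled request resolves to no data, and resolving the same id twice is an error, so a server cannot reuse a closed request to obtain a decision the user never made.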

2.2 Resources And Templates Stay Read/List Operations

MCP servers can expose resources and resource templates as well as tools. Codex keeps those request families separate. list_all_resources() and list_all_resource_templates() aggregate by server and warn on per-server failure. Per-server read_resource() preserves the server name in the error context.

That distinction matters because a resource can become context, a tool can create a side effect, and a template can become a parameterized read. Sharing clients is fine; sharing semantics would be a boundary error.
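The aggregate-and-warn behavior for resource listing can be sketched as follows. The function name, the use of plain strings for resources and errors, and the warning text are assumptions; the sketch shows only that one server's failure becomes a warning while other servers' results are kept.

```rust
use std::collections::BTreeMap;

/// Merge per-server listing results. A failed server contributes a warning
/// that names the server; it does not discard the other servers' resources.
pub fn aggregate_resources(
    per_server: Vec<(String, Result<Vec<String>, String>)>,
) -> (BTreeMap<String, Vec<String>>, Vec<String>) {
    let mut resources = BTreeMap::new();
    let mut warnings = Vec::new();
    for (server, result) in per_server {
        match result {
            Ok(list) => {
                resources.insert(server, list);
            }
            Err(err) => {
                warnings.push(format!("failed to list resources for `{}`: {}", server, err));
            }
        }
    }
    (resources, warnings)
}
```

Keying the result by server name keeps provenance attached to every resource, so a later read can be sent to the server that listed it.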

3. Hosted App Tools Converge Late

Hosted app tools can look like MCP tools after exposure, but their source of truth is not the same as a user-configured MCP server. They depend on connector IDs, directory metadata, account access, and hosted app enablement. Core exposure makes that late convergence visible: hosted app MCP tools are filtered through connector lists and codex_app_tool_is_enabled() before becoming direct tools.

The safe rule is:

MCP server tool      -> server provenance proves where to call
hosted app tool      -> connector/access metadata proves whether to expose
model-visible tool   -> one tool boundary, after the previous checks

A hosted app can be known in a directory but unusable by the current account. An MCP server can be configured but failed at startup. Both cases should be visible to clients as capability state, not hidden as model confusion.
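The late-convergence filter can be sketched as two independent checks that must both pass before a hosted tool becomes direct. The HostedTool struct and its enabled field are hypothetical. In Codex the enablement check is codex_app_tool_is_enabled(), whose inputs this chapter does not describe.

```rust
use std::collections::HashSet;

/// Hypothetical hosted app tool record.
pub struct HostedTool {
    pub callable_name: String,
    pub connector_id: Option<String>,
    /// Stand-in for the result of an app-tool enablement check.
    pub enabled: bool,
}

/// Keep only tools that are enabled and whose connector the account can access.
/// A tool without a connector id is never exposed by this sketch.
pub fn direct_hosted_tools<'a>(
    tools: &'a [HostedTool],
    accessible_connectors: &HashSet<String>,
) -> Vec<&'a HostedTool> {
    tools
        .iter()
        .filter(|tool| tool.enabled)
        .filter(|tool| match &tool.connector_id {
            Some(id) => accessible_connectors.contains(id),
            None => false,
        })
        .collect()
}
```

Being listed in a directory is not part of either check, which matches the rule that a known app is not the same as a usable one.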

4. Outbound Codex MCP Is A Bridge, Not A Second Runtime

Codex also has an outbound direction: expose Codex itself as an MCP server. The source is intentionally narrow. MessageProcessor::new() creates a ThreadManager with SessionSource::Mcp. handle_list_tools() exposes two tools, codex and codex-reply. handle_call_tool() dispatches only those names and returns an error for unknown tools.

At the shape level, the outbound bridge looks like this:

{
  "tools/list": ["codex", "codex-reply"],
  "tools/call": {
    "codex": "start a Codex session from validated arguments",
    "codex-reply": "send a reply to an existing thread_id"
  },
  "notifications/cancelled": "submit Interrupt to the mapped Codex thread"
}

The outbound bridge is not a mirror image of inbound MCP. Inbound MCP lets Codex consume an ecosystem of tools. Outbound MCP lets an external MCP client start or resume Codex work and receive results through a narrow tool surface. Exporting every thread, turn, approval, and rollout concept as arbitrary MCP capabilities would blur the native product contract.
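A closed dispatch over the two tool names shows how narrow the bridge is. This sketch is not handle_call_tool(): the BridgeAction type and the prompt parameter are assumptions, and real argument validation is omitted. Only the tool names and the thread_id requirement come from the text above.

```rust
/// Hypothetical result of dispatching an outbound tool call.
#[derive(Debug, PartialEq)]
pub enum BridgeAction {
    StartSession { prompt: String },
    Reply { thread_id: String, prompt: String },
}

/// Accept exactly two tool names; everything else is an error.
pub fn dispatch(tool: &str, thread_id: Option<&str>, prompt: &str) -> Result<BridgeAction, String> {
    match tool {
        "codex" => Ok(BridgeAction::StartSession { prompt: prompt.to_string() }),
        "codex-reply" => {
            // A reply only makes sense against an existing thread.
            let id = thread_id.ok_or_else(|| "codex-reply requires thread_id".to_string())?;
            Ok(BridgeAction::Reply { thread_id: id.to_string(), prompt: prompt.to_string() })
        }
        other => Err(format!("unknown tool: {}", other)),
    }
}
```

Because the match is closed, adding a third operation requires a deliberate code change rather than a new string from an external client.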

5. Failure Conditions Define The Boundary

MCP failures are extension failures before they are agent failures. Preserving the category gives the caller the right recovery path.

Failure	Verified source surface	Runtime meaning	Recovery
startup timeout/auth error	startup status and `mcp_init_error_display()`	server did not become available	show config/auth guidance and continue or block if required
disabled tool	`ToolFilter::allows()` and `call_tool()` check	tool exists but is not allowed for this server	do not dispatch; return a structured error
stale hosted app cache	app-tool cache refresh path	catalog may be old, server health not proven	hard refresh or mark capability stale
resource pagination failure	per-server resource aggregation warning	one server failed a read/list family	keep other servers’ resources
elicitation cancelled	elicitation manager resolution path	required input was not provided	stop that operation without inventing data
unknown outbound tool	outbound `handle_call_tool()`	external client asked for a non-contract operation	return MCP tool error

Figure: Tool router converting function, custom, local shell, tool search, and MCP response items into payload kinds before handler matching. The router gate is where a model-visible call becomes a classified payload before any runtime handler executes it.

The failure table is also the trust table. Anything that cannot be proven by config, connection status, connector access, or provenance should not be promoted to model-visible capability.
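The failure table can be read as a total function from failure category to recovery path. The enums below are hypothetical names for the table's rows and recovery column, not Codex types. Encoding the table as an exhaustive match means a new failure category cannot be added without choosing its recovery.

```rust
/// One variant per row of the failure table (names are illustrative).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum McpFailure {
    StartupFailed { required: bool },
    DisabledTool,
    StaleHostedAppCache,
    ResourceListFailed,
    ElicitationCancelled,
    UnknownOutboundTool,
}

/// Recovery paths paraphrased from the table's last column.
#[derive(Debug, PartialEq)]
pub enum Recovery {
    BlockAndShowGuidance,
    ContinueWithGuidance,
    ReturnStructuredError,
    RefreshOrMarkStale,
    KeepOtherServers,
    StopOperation,
}

/// Exhaustive mapping: the compiler rejects an unhandled failure category.
pub fn recovery_for(failure: McpFailure) -> Recovery {
    match failure {
        McpFailure::StartupFailed { required: true } => Recovery::BlockAndShowGuidance,
        McpFailure::StartupFailed { required: false } => Recovery::ContinueWithGuidance,
        McpFailure::DisabledTool => Recovery::ReturnStructuredError,
        McpFailure::StaleHostedAppCache => Recovery::RefreshOrMarkStale,
        McpFailure::ResourceListFailed => Recovery::KeepOtherServers,
        McpFailure::ElicitationCancelled => Recovery::StopOperation,
        McpFailure::UnknownOutboundTool => Recovery::ReturnStructuredError,
    }
}
```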

Trace Ledger

Question	Chapter 17 answer
Where is the user request now?	It can cross from the model-visible tool boundary into an external MCP server, or from an external MCP client into a narrow Codex MCP bridge.
What carries it?	`McpConfig`, effective server metadata, managed clients, `ToolInfo`, direct/deferred exposure, raw `(server, tool)` routing, resource/template requests, and `CallToolResult`.
Who owns the next decision?	The model chooses a visible tool; Codex resolves provenance and policy; the MCP server performs the operation; Codex shapes the observation.
What must remain invariant?	Model-visible names may be rewritten, but raw server/tool provenance, account access, approval policy, and error category must remain intact.
What can fail here?	startup, OAuth, listing, stale cache, disabled tools, name collision, resource access, elicitation, unknown outbound tools, or tool-call failure.

Apply This

Split identity before dispatch. Use this when external tools can collide or be renamed. Keep raw server identity, raw tool name, model-visible name, and provenance as separate fields. Pitfall: parsing a public tool name to decide where to route.
Treat discovery as projection. Use this when a catalog is larger than one prompt. Produce a full capability catalog, then choose direct versus deferred exposure. Pitfall: equating “known” with “inserted into the current model request.”
Make server lifecycle observable. Use this when optional extensions can fail. Emit startup status and required-server failures separately. Pitfall: letting one flaky optional server collapse the runtime.
Keep reads distinct from side effects. Use this when a protocol has tools, resources, and templates. Share clients, not semantics. Pitfall: turning every resource into a callable function.
Expose outbound bridges narrowly. Use this when another protocol wants to drive Codex. Map only stable operations such as start/reply/cancel. Pitfall: mirroring the entire native runtime as MCP tools.

What Comes Next

MCP is one extension plane, but not the only one. Chapter 18 moves up a layer to the packages and metadata that decide which skills, plugin contributions, connectors, and typed prompt fragments exist before MCP routing ever begins.

Source Map

Concept	Source anchor
MCP configuration	`codex-rs/codex-mcp/src/mcp/mod.rs`
Connection manager	`codex-rs/codex-mcp/src/connection_manager.rs`
Tool metadata and normalization	`codex-rs/codex-mcp/src/tools.rs`
Core direct/deferred tool exposure	`codex-rs/core/src/mcp_tool_exposure.rs`
Resource and template routing	`codex-rs/codex-mcp/src/connection_manager.rs`
Outbound Codex MCP server	`codex-rs/mcp-server/src/message_processor.rs`