WebSocket API: GET /agents/connect
Persistent, agent-initiated WebSocket used by connected-mode agents to
receive dispatches from the Svantic mesh. This is the only transport
for connected-mode agents; see
Agent Connectivity for the
conceptual background.
OpenAPI does not support WebSocket endpoints, so this reference lives
as a Markdown page. The wire format is fully versioned under the
subprotocol identifier svantic.v1 and is governed by the internal
spec (platform/docs/specs/ws_transport.md,
engineer-facing).
Endpoint
- Scheme: `wss://` only. `ws://` is rejected.
- Query parameters:
  - `instance_id` (required): the instance that registered with `deployment_mode: connected`. Registration happens first over HTTPS; only after `POST /agents/register` returns a `connect_url` does the WebSocket upgrade succeed.
- Subprotocol: client must offer `svantic.v1` in `Sec-WebSocket-Protocol`. The server echoes it back in the `101` response. Clients that offer no compatible subprotocol are rejected with `400 UNSUPPORTED_SUBPROTOCOL`.
Authentication
Upgrade requests carry a tenant-scoped JWT in the standard bearer header (`Authorization: Bearer <jwt>`), the same scheme as `POST /agents/register`. Cookie authentication
is not supported. The JWT’s tenant_id must match the tenant that
owns the instance_id.
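The endpoint and authentication rules can be checked on the client before dialing. The following is a minimal sketch, not SDK code: `buildUpgradeRequest` and `UpgradeRequest` are hypothetical names, and it assumes the `connect_url` returned by registration already carries the `instance_id` query parameter.

```typescript
// Hypothetical helper for illustration; not part of @svantic/sdk.
const SUBPROTOCOL = "svantic.v1";

interface UpgradeRequest {
  url: string;
  protocols: string[];
  headers: Record<string, string>;
}

function buildUpgradeRequest(connectUrl: string, jwt: string): UpgradeRequest {
  const url = new URL(connectUrl);
  // Only wss:// is accepted; the server rejects ws://.
  if (url.protocol !== "wss:") {
    throw new Error(`expected wss:// scheme, got ${url.protocol}//`);
  }
  // instance_id is required; absent or empty maps to MISSING_INSTANCE_ID.
  if (!url.searchParams.get("instance_id")) {
    throw new Error("connect URL is missing instance_id");
  }
  return {
    url: url.toString(),
    // Offered in Sec-WebSocket-Protocol; the server echoes it back.
    protocols: [SUBPROTOCOL],
    // Tenant-scoped JWT in the standard bearer header.
    headers: { Authorization: `Bearer ${jwt}` },
  };
}
```

With the Node `ws` package, these values would map onto `new WebSocket(url, protocols, { headers })`. Browser `WebSocket` clients cannot set custom request headers, so this sketch targets server-side agents.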
Handshake errors
The server validates the request, in order, before allocating a socket. Each failure returns a plain HTTP response (no `101`
upgrade) with a JSON body under application/json.
| HTTP status | Code | When |
|---|---|---|
| 400 | `MISSING_INSTANCE_ID` | Query parameter `instance_id` is absent or empty. |
| 400 | `UNSUPPORTED_SUBPROTOCOL` | Client did not offer `svantic.v1`. |
| 401 | `UNAUTHORIZED` | JWT is missing, malformed, signature-invalid, or expired. |
| 403 | `TENANT_MISMATCH` | JWT’s tenant does not own the requested `instance_id`. |
| 404 | `INSTANCE_NOT_FOUND` | No registered instance with that `instance_id`. |
| 409 | `DEPLOYMENT_MODE_MISMATCH` | Instance exists but was registered as `hosted`. Re-register as `connected` on a new `instance_id`. |
Lifecycle
After `101 Switching Protocols` the client must send a `hello`
frame as its first text frame. The server replies with welcome;
the connection is “live” only after the client observes welcome.
The mesh will not push dispatch frames before the client reaches
the ready state.
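The handshake described above can be modeled as a small state machine. This is a sketch under assumptions: only `ready` is a state name taken from this page, while `upgraded`, `awaiting_welcome`, and the `nextState` function are hypothetical names chosen for illustration.

```typescript
// Only "ready" comes from the protocol text; the other state names are illustrative.
type ConnState = "upgraded" | "awaiting_welcome" | "ready";

type LifecycleEvent =
  | { kind: "sent"; frameType: string }
  | { kind: "received"; frameType: string };

function nextState(state: ConnState, event: LifecycleEvent): ConnState {
  if (state === "upgraded") {
    // The first text frame after the 101 response must be hello.
    if (event.kind === "sent" && event.frameType === "hello") {
      return "awaiting_welcome";
    }
    throw new Error(`hello must come first, got ${event.kind} ${event.frameType}`);
  }
  if (state === "awaiting_welcome") {
    // The connection is live only once welcome has been observed.
    if (event.kind === "received" && event.frameType === "welcome") {
      return "ready";
    }
    // The mesh should not push dispatch frames before ready.
    if (event.kind === "received" && event.frameType === "dispatch") {
      throw new Error("dispatch received before welcome");
    }
    return state;
  }
  // In ready, every frame type is allowed; the state does not change here.
  return state;
}
```

A client could feed every sent and received frame through `nextState` and treat a thrown error as a protocol violation worth logging.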
Frame envelope
Every frame is a UTF-8 JSON text frame. Binary frames are reserved for future use; if the server receives one it closes the socket with code `1003 Unsupported Data`.
- `v`: protocol version literal. Always `1` for `svantic.v1`. A breaking change ships under a new subprotocol identifier.
- `type`: see the frame catalog below.
- `id`: sender-assigned UUID v7. Used for correlation; response frames set `in_reply_to` to the request’s `id`.
- `ts`: sender wall-clock in ISO 8601 / RFC 3339.
- `in_reply_to`: required on response frames (`dispatch_result`, `dispatch_chunk`, `dispatch_ack`, `tool_result`, `pong`); otherwise omitted or `null`.
- `trace_id`, `parent_span_id`: optional envelope-level W3C trace context for per-frame routing telemetry. Note that trace context for your business logic arrives inside the `dispatch` payload, at `payload.session_context.propagation_headers` (W3C `traceparent` + `baggage`).
- `payload`: type-specific, documented per frame below.
A frame that fails schema validation is answered with an `error` frame (code: `BAD_FRAME`) and the server closes the socket with code `1002 Protocol Error`.
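For teams writing their own client, the envelope fields listed above translate into a type and a small parser. This is a minimal sketch, not the authoritative schema: the `Envelope` interface and `parseEnvelope` function are hypothetical names, `payload` is treated as optional because per-frame payload shapes are not reproduced here, and the checks cover only the rules stated in this section.

```typescript
// Illustrative envelope type derived from the field list; not the official schema.
interface Envelope {
  v: 1;
  type: string;
  id: string; // sender-assigned UUID v7
  ts: string; // ISO 8601 / RFC 3339
  in_reply_to?: string | null;
  trace_id?: string;
  parent_span_id?: string;
  payload?: unknown;
}

// Frame types that must carry in_reply_to.
const RESPONSE_TYPES = new Set([
  "dispatch_result",
  "dispatch_chunk",
  "dispatch_ack",
  "tool_result",
  "pong",
]);

function parseEnvelope(text: string): Envelope {
  const raw: unknown = JSON.parse(text); // throws SyntaxError on invalid JSON
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    throw new Error("frame is not a JSON object");
  }
  const frame = raw as Record<string, unknown>;
  if (frame["v"] !== 1) {
    throw new Error("v must be 1 for svantic.v1");
  }
  for (const key of ["type", "id", "ts"]) {
    if (typeof frame[key] !== "string") {
      throw new Error(`${key} must be a string`);
    }
  }
  const type = frame["type"] as string;
  if (RESPONSE_TYPES.has(type) && typeof frame["in_reply_to"] !== "string") {
    throw new Error(`${type} requires in_reply_to`);
  }
  // Unknown extra fields are deliberately left in place and ignored.
  return frame as unknown as Envelope;
}
```

Validating outbound frames with the same function before sending is one way to avoid triggering `BAD_FRAME` on the server.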
Frame catalog
hello — agent → mesh
First frame after upgrade. Announces the agent and optionally asks
to resume.
welcome — mesh → agent
Acknowledges hello. Transitions the client to ready.
When `resumed: true`, `replayed_dispatches` lists the `dispatch.id`
values the server is re-delivering on this reconnect.
dispatch — mesh → agent
A single A2A task to execute. payload is byte-identical to the
body the mesh would POST to a hosted agent for the same operation —
so the same handler code works on both transports.
dispatch_ack — agent → mesh (optional)
Optional “received and working” signal. If the server does not
receive a terminal response (result or error) before
payload.deadline_ms, the dispatch times out regardless of whether
an ack was sent.
dispatch_chunk — agent → mesh
A streaming output chunk. Multiple chunks, in order, may precede a
dispatch_result. Receivers must preserve chunk ordering per
in_reply_to group.
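The ordering rule can be illustrated with a small receiver-side collector. This is a sketch with hypothetical names (`ChunkCollector`, `addChunk`, `finish`); it treats chunk payloads as opaque values and relies on the fact that a single WebSocket connection delivers messages in the order they were sent.

```typescript
// Hypothetical receiver-side helper; chunk payloads are treated as opaque.
class ChunkCollector {
  // One ordered list of chunk payloads per in_reply_to value.
  private readonly groups = new Map<string, unknown[]>();

  // Append a chunk to its group in arrival order.
  addChunk(inReplyTo: string, payload: unknown): void {
    const group = this.groups.get(inReplyTo) ?? [];
    group.push(payload);
    this.groups.set(inReplyTo, group);
  }

  // Called when the terminal dispatch_result arrives: returns the ordered
  // chunks for that dispatch and forgets the group.
  finish(inReplyTo: string): unknown[] {
    const group = this.groups.get(inReplyTo) ?? [];
    this.groups.delete(inReplyTo);
    return group;
  }
}
```

Chunks for different dispatches may interleave on the wire; keying by `in_reply_to` keeps each dispatch's sequence intact regardless.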
dispatch_result — agent → mesh
Terminal success for a dispatch.
tool_call / tool_result
When one agent invokes another agent’s tool, the request is routed
through the mesh. tool_call has the same payload shape as a hosted
tool invocation; tool_result is the reply.
heartbeat — agent → mesh
Self-reported presence, load, and health. Carries the same
HeartbeatPayload as the HTTP POST /agents/heartbeat endpoint —
hosted agents heartbeat over HTTP, connected agents heartbeat over
this frame, but the payload bytes are identical and the gateway
stores them in the same row.
Agents send this frame every 30 s (the `HEARTBEAT_INTERVAL_MS` constant),
or immediately on any status change.
ping / pong
Application-level keepalive. Independent of the WebSocket protocol’s
own ping/pong — both run in parallel to catch different failure
modes.
error
Out-of-band error. Carries a stable machine-readable code the client
can branch on; does not close the socket on its own unless the
server decides it must.
close_request
Graceful shutdown request. The sender promises no new dispatch
frames; in-flight dispatches continue up to payload.grace_seconds
(default 30 s). After the grace window, the sender closes the socket
with code 1000 Normal Closure.
Error frame catalog
payload.code values are stable. New codes may be added; existing
codes never change meaning.
| Code | Direction | Meaning |
|---|---|---|
| `BAD_FRAME` | server → client | Frame failed schema validation. The server closes the connection with `1002` immediately after. |
| `AUTH_EXPIRED` | server → client | JWT expired mid-connection. Client must reconnect with a fresh token. |
| `AGENT_DISCONNECTED` | server → client | The dispatch targeted an instance whose socket is gone. The caller receives the same error. |
| `RATE_LIMITED` | server → client | Frame rate or concurrent dispatches exceeded. `detail.retry_after_ms` is an advisory backoff hint. |
| `INTERNAL` | either | Unrecoverable internal error. Should be rare; always safe to reconnect. |
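Because the codes are stable, a client can branch on them directly. The sketch below shows one possible mapping from code to client reaction. The `ErrorPayload` shape (with `detail` nested under the payload), the `Action` type, the `actionFor` function, and the 1000 ms fallback delay are all assumptions made for illustration, not part of the wire format.

```typescript
// Assumed payload shape: { code, detail? }. Only `code` and
// `detail.retry_after_ms` are named in this reference.
interface ErrorPayload {
  code: string;
  detail?: { retry_after_ms?: number };
}

// Hypothetical client-side reactions.
type Action =
  | { kind: "reconnect"; refreshToken: boolean }
  | { kind: "wait"; delayMs: number }
  | { kind: "report" }
  | { kind: "ignore" };

function actionFor(payload: ErrorPayload): Action {
  switch (payload.code) {
    case "AUTH_EXPIRED":
      // Must reconnect with a fresh token.
      return { kind: "reconnect", refreshToken: true };
    case "RATE_LIMITED":
      // retry_after_ms is advisory; the 1000 ms fallback is an assumption.
      return { kind: "wait", delayMs: payload.detail?.retry_after_ms ?? 1000 };
    case "INTERNAL":
      // Always safe to reconnect.
      return { kind: "reconnect", refreshToken: false };
    case "BAD_FRAME":
    case "AGENT_DISCONNECTED":
      // Surface to the caller or operator; blind retries would not help.
      return { kind: "report" };
    default:
      // New codes may be added, so unknown codes must not crash the client.
      return { kind: "ignore" };
  }
}
```

The `default` branch matters for forward compatibility: a client that throws on an unrecognized code would break whenever the catalog grows.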
Heartbeats
| Mechanism | Cadence | Initiator | Purpose |
|---|---|---|---|
| WebSocket protocol ping/pong | 30 s | server | TCP-level keepalive. |
| App-level `ping` / `pong` | 30 s, 15 s offset | server | Detects wedged agent event loops. |
| `heartbeat` | 30 s or on change | agent | Self-reported presence, load, and health. |
A connection that fails these liveness checks is closed with code `1001 Going Away`.
Clients should reconnect; the SDK handles this automatically.
Reconnect semantics
- Bounded exponential backoff: 1 s, 2 s, 4 s, 8 s, 16 s, 30 s (cap), with ±25% random jitter applied to each step.
- On every successful reconnect the backoff resets.
- If the client carries a `resume_token` in `hello`, the server attempts to re-bind pending dispatches:
  - Match → `welcome.resumed = true`, `welcome.replayed_dispatches` lists the re-delivered IDs.
  - No match → `welcome.resumed = false`. Any dispatches that were in flight at the time of the disconnect have already failed on the original owner pod with `AGENT_DISCONNECTED`, and their callers have received the error.
- Resume is pod-local. A reconnect that lands on a different pod (normal under scaling) cannot resume and starts clean.
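The backoff schedule above can be computed rather than tabulated. This is a minimal sketch assuming the jitter is applied after the 30 s cap, which this page does not state explicitly; `reconnectDelayMs` is a hypothetical name, and the `random` parameter exists only so the function can be tested deterministically.

```typescript
const BASE_DELAY_MS = 1000;
const MAX_DELAY_MS = 30000;
const JITTER_FRACTION = 0.25;

// attempt is 0 for the first retry after a disconnect and resets to 0
// after every successful reconnect.
function reconnectDelayMs(
  attempt: number,
  random: () => number = Math.random,
): number {
  const step = Math.max(0, Math.floor(attempt));
  // 1 s, 2 s, 4 s, 8 s, 16 s, then capped at 30 s.
  const base = Math.min(BASE_DELAY_MS * 2 ** step, MAX_DELAY_MS);
  // random() in [0, 1) maps to a multiplier in [0.75, 1.25).
  const multiplier = 1 + (random() * 2 - 1) * JITTER_FRACTION;
  return Math.round(base * multiplier);
}
```

Jitter spreads reconnects out when many agents lose their sockets at once, for example during a pod restart, so they do not all redial in the same instant.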
Compatibility
- `v: 1` is fixed for the life of `svantic.v1`. Breaking changes ship as a new subprotocol (`svantic.v2`); both will be supported during the transition.
- New optional fields may be added to existing payloads without a version bump. Receivers must ignore unknown fields.
- New frame types may be added without a version bump. Receivers that do not recognize a frame type reply with `error`/`BAD_FRAME` but must not close the connection; the sender downgrades.
SDK support
If you’re writing your agent with `@svantic/sdk`, none of the above
is your concern day-to-day: set `deployment_mode: 'connected'` at
registration and the SDK dials, authenticates, reconnects, and
resumes for you. Your capability handlers receive the same
`CapabilitySessionContext` they would on a hosted deployment, with
`parent_trace_id`, `parent_span_id`, and `baggage` already parsed off
the incoming `traceparent` and `baggage` headers.
This reference exists for:
- Teams writing their own client (e.g. a non-TypeScript agent).
- Debugging: reading `ws` logs from the SDK and matching them to protocol states.
- Compliance reviews that need the wire format documented externally.
