WebSocket API: GET /agents/connect
Persistent, agent-initiated WebSocket used by connected-mode agents to
receive dispatches from the Svantic mesh. This is the only transport
for connected-mode agents; see
Agent Connectivity for the
conceptual background.
OpenAPI does not support WebSocket endpoints, so this reference lives
as a Markdown page. The wire format is fully versioned under the
subprotocol identifier svantic.v1 and is governed by the internal
spec (platform/docs/specs/ws_transport.md,
engineer-facing).
Endpoint
wss://api.svantic.com/agents/connect?instance_id=<instance_id>
- Scheme:
wss:// only. ws:// is rejected.
- Query parameters:
instance_id (required) — the instance that registered with
deployment_mode: connected. Registration happens first over
HTTPS; only after POST /agents/register returns a connect_url
does the WebSocket upgrade succeed.
- Subprotocol: client must offer
svantic.v1 in
Sec-WebSocket-Protocol. The server echoes it back in the 101
response. Clients that offer no compatible subprotocol are rejected
with 400 UNSUPPORTED_SUBPROTOCOL.
Authentication
Upgrade requests carry a tenant-scoped JWT in the standard bearer
header:
Authorization: Bearer <jwt>
This is the same JWT used for POST /agents/register. Cookie authentication
is not supported. The JWT’s tenant_id must match the tenant that
owns the instance_id.
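The endpoint and authentication rules above can be gathered into one small helper. This is a minimal sketch, not the SDK's implementation: `buildConnectRequest` and the `ConnectRequest` shape are hypothetical names, and how the headers and subprotocol list are passed on depends on the WebSocket client library you use. In practice a client would probably dial the connect_url returned by POST /agents/register instead of rebuilding the URL.

```typescript
// Hypothetical container for everything the upgrade request needs.
interface ConnectRequest {
  url: string;
  protocols: string[];
  headers: Record<string, string>;
}

// Builds the upgrade parameters described in the Endpoint and
// Authentication sections. The function name is an assumption.
function buildConnectRequest(instanceId: string, jwt: string): ConnectRequest {
  if (instanceId.trim() === "") {
    // The server would answer 400 MISSING_INSTANCE_ID; fail early instead.
    throw new Error("instance_id is required");
  }
  const url = new URL("wss://api.svantic.com/agents/connect");
  url.searchParams.set("instance_id", instanceId);
  return {
    url: url.toString(),
    // Offered in Sec-WebSocket-Protocol; the server echoes it in the 101.
    protocols: ["svantic.v1"],
    headers: { Authorization: `Bearer ${jwt}` },
  };
}
```

Browser WebSocket clients cannot set an Authorization header, so this sketch assumes a server-side runtime.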
Handshake errors
The server validates four things, in order, before allocating a
socket. Each failure returns a plain HTTP response — no 101
upgrade — with a JSON body under application/json.
| HTTP status | Code | When |
|---|---|---|
| 400 | MISSING_INSTANCE_ID | Query parameter instance_id is absent or empty. |
| 400 | UNSUPPORTED_SUBPROTOCOL | Client did not offer svantic.v1. |
| 401 | UNAUTHORIZED | JWT is missing, malformed, signature-invalid, or expired. |
| 403 | TENANT_MISMATCH | JWT’s tenant does not own the requested instance_id. |
| 404 | INSTANCE_NOT_FOUND | No registered instance with that instance_id. |
| 409 | DEPLOYMENT_MODE_MISMATCH | Instance exists but was registered as hosted. Re-register as connected on a new instance_id. |
Example failure body:
```json
{
  "error": "DEPLOYMENT_MODE_MISMATCH",
  "message": "Instance navigator-prod-01 is registered as 'hosted'; WebSocket upgrade requires 'connected'.",
  "status": 409
}
```
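A custom client will usually want to turn a failed upgrade into a typed value it can branch on. The sketch below parses the JSON failure body shown above and accepts only the codes listed in the table; `parseHandshakeError` and the type names are hypothetical, and the body is assumed to contain exactly the three fields from the example.

```typescript
// Codes copied from the handshake error table.
const HANDSHAKE_CODES = [
  "MISSING_INSTANCE_ID",
  "UNSUPPORTED_SUBPROTOCOL",
  "UNAUTHORIZED",
  "TENANT_MISMATCH",
  "INSTANCE_NOT_FOUND",
  "DEPLOYMENT_MODE_MISMATCH",
] as const;

type HandshakeCode = (typeof HANDSHAKE_CODES)[number];

interface HandshakeError {
  error: HandshakeCode;
  message: string;
  status: number;
}

// Returns null when the body is not a recognizable handshake error.
function parseHandshakeError(body: string): HandshakeError | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(body);
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const { error, message, status } = parsed as Record<string, unknown>;
  if (typeof error !== "string" || typeof message !== "string" || typeof status !== "number") {
    return null;
  }
  if (!(HANDSHAKE_CODES as readonly string[]).includes(error)) return null;
  return { error: error as HandshakeCode, message, status };
}
```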
Lifecycle
After 101 Switching Protocols the client must send a hello
frame as its first text frame. The server replies with welcome;
the connection is “live” only after the client observes welcome.
The mesh will not push dispatch frames before the client reaches
the ready state.
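The hello/welcome ordering can be modeled as a tiny client-side state machine. This is a sketch under assumptions: only the ready state is named by this page, so `upgraded` and `awaiting_welcome` are hypothetical labels, and treating an early dispatch as an error is one possible client policy.

```typescript
// "ready" comes from this page; the other two state names are assumptions.
type ConnState = "upgraded" | "awaiting_welcome" | "ready";

type ConnEvent =
  | { kind: "sent"; frameType: string }
  | { kind: "received"; frameType: string };

function nextState(state: ConnState, event: ConnEvent): ConnState {
  switch (state) {
    case "upgraded":
      if (event.kind === "sent") {
        // hello must be the first text frame the client sends.
        if (event.frameType === "hello") return "awaiting_welcome";
        throw new Error("hello must be the first frame after the upgrade");
      }
      return state;
    case "awaiting_welcome":
      if (event.kind === "received" && event.frameType === "welcome") return "ready";
      if (event.kind === "received" && event.frameType === "dispatch") {
        // The mesh promises not to dispatch before the client is ready.
        throw new Error("dispatch received before welcome");
      }
      return state;
    case "ready":
      return state;
  }
}
```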
Frame envelope
Every frame is a UTF-8 JSON text frame. Binary frames are reserved
for future use; if the server receives one it closes the socket with
code 1003 Unsupported Data.
```json
{
  "v": 1,
  "type": "dispatch",
  "id": "0199e3b5-7d8c-7a10-9a1c-ff65e2b3c0de",
  "ts": "2026-04-17T13:41:22.814Z",
  "in_reply_to": null,
  "trace_id": null,
  "parent_span_id": null,
  "payload": { }
}
```
- v: protocol version literal. Always 1 for svantic.v1. A
breaking change ships under a new subprotocol identifier.
- type: see the frame catalog below.
- id: sender-assigned UUID v7. Used for correlation; response
frames set in_reply_to to the request’s id.
- ts: sender wall-clock time in ISO 8601 / RFC 3339 format.
- in_reply_to: required on response frames (dispatch_result,
dispatch_chunk, dispatch_ack, tool_result, pong); otherwise
omitted or null.
- trace_id, parent_span_id: optional envelope-level W3C trace
context for per-frame routing telemetry. Trace context
for your business logic arrives inside the dispatch payload, at
payload.session_context.propagation_headers (W3C traceparent +
baggage).
- payload: type-specific, documented per frame below.
Frames that fail schema validation receive an error frame (code: BAD_FRAME) and the server closes the socket with code 1002 Protocol Error.
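For a hand-written client, the envelope rules can be approximated with a small pre-flight check before sending. This is a minimal sketch: `envelopeProblems` is a hypothetical helper, and the authoritative schema lives in the internal spec, so the server's validation may be stricter than what is shown here (for example, this check does not verify that id is a UUID v7).

```typescript
// Frame types that must carry in_reply_to, per the field list above.
const RESPONSE_TYPES = new Set([
  "dispatch_result",
  "dispatch_chunk",
  "dispatch_ack",
  "tool_result",
  "pong",
]);

// Returns a list of problems; an empty list means the envelope looks valid.
function envelopeProblems(value: unknown): string[] {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return ["frame is not a JSON object"];
  }
  const { v, type, id, ts, payload, in_reply_to } = value as Record<string, unknown>;
  const problems: string[] = [];
  if (v !== 1) problems.push("v must be 1");
  if (typeof type !== "string" || type === "") problems.push("type must be a non-empty string");
  if (typeof id !== "string" || id === "") problems.push("id must be a non-empty string");
  if (typeof ts !== "string" || Number.isNaN(Date.parse(ts))) {
    problems.push("ts must be a parseable timestamp");
  }
  if (typeof payload !== "object" || payload === null || Array.isArray(payload)) {
    problems.push("payload must be an object");
  }
  if (typeof type === "string" && RESPONSE_TYPES.has(type) && typeof in_reply_to !== "string") {
    problems.push("in_reply_to is required on response frames");
  }
  return problems;
}
```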
Frame catalog
hello — agent → mesh
First frame after upgrade. Announces the agent and optionally asks
to resume.
```json
{
  "v": 1,
  "type": "hello",
  "id": "…",
  "ts": "…",
  "payload": {
    "instance_id": "navigator-prod-01",
    "agent_type": "navigator",
    "agent_version": "1.4.2",
    "sdk_version": "@svantic/sdk@0.12.0",
    "agent_card": { },
    "resume_token": null
  }
}
```
welcome — mesh → agent
Acknowledges hello. Transitions the client to ready.
```json
{
  "v": 1,
  "type": "welcome",
  "id": "…",
  "ts": "…",
  "in_reply_to": "<hello.id>",
  "payload": {
    "resumed": false,
    "server_time": "2026-04-17T13:41:23.001Z",
    "replayed_dispatches": []
  }
}
```
When resumed: true, replayed_dispatches lists the dispatch.id
values the server is re-delivering on this reconnect.
dispatch — mesh → agent
A single A2A task to execute. payload is byte-identical to the
body the mesh would POST to a hosted agent for the same operation —
so the same handler code works on both transports.
```json
{
  "type": "dispatch",
  "payload": {
    "skill_id": "lookup_ticket",
    "args": { "ticket_id": 42 },
    "session_context": {
      "session_id": "sess-abc",
      "tenant_id": "tenant-1",
      "propagation_headers": {
        "traceparent": "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01",
        "baggage": "svantic.session_id=sess-abc,svantic.tenant_id=tenant-1,svantic.invocation_id=deadbeefcafef00d"
      }
    },
    "deadline_ms": 1713361323000
  }
}
```
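A client that does not use the SDK has to unpack propagation_headers itself. The sketch below splits a W3C traceparent value into its four fields and reads baggage members into a map. Both function names are hypothetical, and the parsing is deliberately simplified: it does not reject all-zero IDs and does not percent-decode baggage values.

```typescript
interface TraceParent {
  version: string;
  traceId: string;
  parentId: string;
  flags: string;
}

// Parses "version-traceid-parentid-flags" (lowercase hex); null if malformed.
function parseTraceparent(header: string): TraceParent | null {
  const value = header.trim();
  if (!/^[0-9a-f]{2}-[0-9a-f]{32}-[0-9a-f]{16}-[0-9a-f]{2}$/.test(value)) {
    return null;
  }
  const [version = "", traceId = "", parentId = "", flags = ""] = value.split("-");
  return { version, traceId, parentId, flags };
}

// Reads "k1=v1,k2=v2" into a map, dropping any ";property" suffixes.
function parseBaggage(header: string): Map<string, string> {
  const entries = new Map<string, string>();
  for (const member of header.split(",")) {
    const [pair = ""] = member.split(";");
    const eq = pair.indexOf("=");
    if (eq <= 0) continue; // skip members with no key or no "="
    entries.set(pair.slice(0, eq).trim(), pair.slice(eq + 1).trim());
  }
  return entries;
}
```

Applied to the example payload above, `parseTraceparent` yields the trace ID `0af7651916cd43dd8448eb211c80319c`, and `parseBaggage(...).get("svantic.session_id")` yields `sess-abc`.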
dispatch_ack — agent → mesh (optional)
Optional “received and working” signal. If the server does not
receive a terminal response (result or error) before
payload.deadline_ms, the dispatch times out regardless of whether
an ack was sent.
dispatch_chunk — agent → mesh
A streaming output chunk. Multiple chunks, in order, may precede a
dispatch_result. Receivers must preserve chunk ordering per
in_reply_to group.
```json
{
  "type": "dispatch_chunk",
  "in_reply_to": "<dispatch.id>",
  "payload": { "delta": "…" }
}
```
dispatch_result — agent → mesh
Terminal success for a dispatch.
```json
{
  "type": "dispatch_result",
  "in_reply_to": "<dispatch.id>",
  "payload": { "result": { } }
}
```
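The chunk-then-result pattern can be sketched as a function that produces the reply frames for one dispatch in send order. `buildReplyFrames` is a hypothetical helper; the ID generator is injected because the envelope calls for UUID v7 and this sketch does not assume any particular UUID library.

```typescript
interface ReplyFrame {
  v: 1;
  type: "dispatch_chunk" | "dispatch_result";
  id: string;
  ts: string;
  in_reply_to: string;
  payload: Record<string, unknown>;
}

// Returns zero or more dispatch_chunk frames followed by exactly one
// dispatch_result, all correlated to the same dispatch id.
function buildReplyFrames(
  dispatchId: string,
  deltas: readonly string[],
  result: Record<string, unknown>,
  newId: () => string, // caller supplies a UUID v7 generator (assumption)
  now: () => Date = () => new Date(),
): ReplyFrame[] {
  const envelope = (
    type: ReplyFrame["type"],
    payload: Record<string, unknown>,
  ): ReplyFrame => ({
    v: 1,
    type,
    id: newId(),
    ts: now().toISOString(),
    in_reply_to: dispatchId,
    payload,
  });
  return [
    ...deltas.map((delta) => envelope("dispatch_chunk", { delta })),
    envelope("dispatch_result", { result }),
  ];
}
```

Sending the returned array front to back on a single socket keeps the chunks in order within their in_reply_to group.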
tool_call / tool_result
When one agent invokes another agent’s tool, the request is routed
through the mesh. tool_call has the same payload shape as a hosted
tool invocation; tool_result is the reply.
heartbeat — agent → mesh
Self-reported presence, load, and health. Carries the same
HeartbeatPayload as the HTTP POST /agents/heartbeat endpoint —
hosted agents heartbeat over HTTP, connected agents heartbeat over
this frame, but the payload bytes are identical and the gateway
stores them in the same row.
```json
{
  "type": "heartbeat",
  "payload": {
    "status": "available",
    "current_sessions": 3,
    "max_concurrent_sessions": 16,
    "consecutive_failures": 0
  }
}
```
Cadence: every 30 s (the shared HEARTBEAT_INTERVAL_MS constant),
or immediately on any status change.
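The cadence rule (every 30 s, or immediately on a status change) reduces to a single predicate. This is a sketch: `heartbeatDue` is a hypothetical name, and the constant is redeclared locally with the 30 s value stated above rather than imported from the SDK.

```typescript
// Mirrors the shared HEARTBEAT_INTERVAL_MS value stated on this page.
const HEARTBEAT_INTERVAL_MS = 30_000;

// Decides whether the agent should send a heartbeat frame now.
function heartbeatDue(
  lastSentAtMs: number | null, // null means nothing has been sent yet
  lastStatus: string | null,
  currentStatus: string,
  nowMs: number,
): boolean {
  if (lastSentAtMs === null) return true;
  if (currentStatus !== lastStatus) return true; // report changes immediately
  return nowMs - lastSentAtMs >= HEARTBEAT_INTERVAL_MS;
}
```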
ping / pong
Application-level keepalive. Independent of the WebSocket protocol’s
own ping/pong — both run in parallel to catch different failure
modes.
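Because pong is a response frame, it must echo the ping's id in in_reply_to. The sketch below builds that reply; `makePong` is a hypothetical helper, and the empty payload is an assumption, since this page does not document a ping or pong payload.

```typescript
interface PongFrame {
  v: 1;
  type: "pong";
  id: string;
  ts: string;
  in_reply_to: string;
  payload: Record<string, unknown>;
}

// Builds the pong for a received app-level ping.
function makePong(ping: { id: string }, newId: string, now: Date = new Date()): PongFrame {
  return {
    v: 1,
    type: "pong",
    id: newId, // caller supplies a fresh UUID v7
    ts: now.toISOString(),
    in_reply_to: ping.id,
    payload: {}, // assumed empty; not specified on this page
  };
}
```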
error
Out-of-band error. Carries a stable machine-readable code the client
can branch on; does not close the socket on its own unless the
server decides it must.
```json
{
  "type": "error",
  "in_reply_to": null,
  "payload": {
    "code": "RATE_LIMITED",
    "message": "Frame rate exceeded.",
    "detail": { "retry_after_ms": 1000 }
  }
}
```
close_request
Graceful shutdown request. The sender promises no new dispatch
frames; in-flight dispatches continue up to payload.grace_seconds
(default 30 s). After the grace window, the sender closes the socket
with code 1000 Normal Closure.
Error frame catalog
payload.code values are stable. New codes may be added; existing
codes never change meaning.
| Code | Direction | Meaning |
|---|---|---|
| BAD_FRAME | server → client | Frame failed schema validation. The server closes the connection with 1002 immediately after. |
| AUTH_EXPIRED | server → client | JWT expired mid-connection. Client must reconnect with a fresh token. |
| AGENT_DISCONNECTED | server → client | The dispatch targeted an instance whose socket is gone. The caller receives the same error. |
| RATE_LIMITED | server → client | Frame rate or concurrent dispatches exceeded. detail.retry_after_ms is an advisory backoff hint. |
| INTERNAL | either | Unrecoverable internal error. Should be rare; always safe to reconnect. |
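Since the codes are stable, a client can map each one to a reaction. The mapping below is one reading of the table, not prescribed behavior: `actionFor` and the `ClientAction` variants are hypothetical, and the 1000 ms fallback used when retry_after_ms is absent is an assumption.

```typescript
type ClientAction =
  | { kind: "expect_close" } // server will close with 1002
  | { kind: "reconnect"; freshToken: boolean }
  | { kind: "backoff"; delayMs: number }
  | { kind: "fail_call" } // surface the error to the caller
  | { kind: "log_only" }; // unknown code: new codes may be added

function actionFor(code: string, detail: { retry_after_ms?: number } = {}): ClientAction {
  switch (code) {
    case "BAD_FRAME":
      return { kind: "expect_close" };
    case "AUTH_EXPIRED":
      return { kind: "reconnect", freshToken: true };
    case "AGENT_DISCONNECTED":
      return { kind: "fail_call" };
    case "RATE_LIMITED":
      // retry_after_ms is advisory; the fallback value is an assumption.
      return { kind: "backoff", delayMs: detail.retry_after_ms ?? 1000 };
    case "INTERNAL":
      return { kind: "reconnect", freshToken: false };
    default:
      return { kind: "log_only" };
  }
}
```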
Heartbeats
| Mechanism | Cadence | Initiator | Purpose |
|---|---|---|---|
| WebSocket protocol ping/pong | 30 s | server | TCP-level keepalive. |
| App-level ping / pong | 30 s, 15 s offset | server | Detects wedged agent event loops. |
| heartbeat | 30 s or on change | agent | Self-reported presence, load, and health. |
Dead-peer timeout. Three missed app-level pings (~90 seconds)
cause the server to close the socket with code 1001 Going Away.
Clients should reconnect; the SDK handles this automatically.
Reconnect semantics
- Bounded exponential backoff: 1 s, 2 s, 4 s, 8 s, 16 s, 30 s (cap),
with ±25% random jitter applied to each step.
- On every successful reconnect the backoff resets.
- If the client carries a resume_token in hello, the server
  attempts to re-bind pending dispatches:
  - Match → welcome.resumed = true, and
    welcome.replayed_dispatches lists the re-delivered IDs.
  - No match → welcome.resumed = false. Any dispatches that
    were in flight at the time of the disconnect have already failed
    on the original owner pod with AGENT_DISCONNECTED, and their
    callers have received the error.
- Resume is pod-local. A reconnect that lands on a different pod
(normal under scaling) cannot resume and starts clean.
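The backoff schedule above can be written as a pure function of the attempt number. This is a sketch, not the SDK's code: `backoffDelayMs` is a hypothetical name, and applying the jitter after the 30 s cap (so a capped delay can land slightly above 30 s) is an assumption about how "each step" is jittered.

```typescript
const BASE_DELAY_MS = 1_000;
const MAX_DELAY_MS = 30_000;
const JITTER_RATIO = 0.25;

// attempt is 0 for the first retry; reset it to 0 after a successful reconnect.
// random must return a value in [0, 1), like Math.random.
function backoffDelayMs(attempt: number, random: () => number = Math.random): number {
  // 1 s, 2 s, 4 s, 8 s, 16 s, then capped at 30 s.
  const base = Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS);
  // Scale by a factor in [0.75, 1.25) to apply the ±25% jitter.
  const factor = 1 + (random() * 2 - 1) * JITTER_RATIO;
  return Math.round(base * factor);
}
```

Passing a fixed `random` makes the schedule deterministic for tests; with `() => 0.5` the jitter factor is exactly 1 and the function returns the unjittered steps.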
Compatibility
- v: 1 is fixed for the life of svantic.v1. Breaking changes
ship as a new subprotocol (svantic.v2); both will be supported
during the transition.
- New optional fields may be added to existing payloads without
a version bump. Receivers must ignore unknown fields.
- New frame types may be added without a version bump. A receiver
that does not recognize a frame type replies with error / BAD_FRAME
but must not close the connection; the sender then downgrades.
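A forward-compatible receiver can be sketched as a router that dispatches known frame types and answers unknown ones without closing. `routeFrame` and its return shape are hypothetical, the error message text is illustrative, and the reply omits the v, id, and ts envelope fields in the same way the abbreviated examples on this page do.

```typescript
type FrameHandler = (payload: Record<string, unknown>) => void;

interface RouteOutcome {
  reply: { type: "error"; in_reply_to: string; payload: { code: string; message: string } } | null;
  close: boolean;
}

function routeFrame(
  frame: { id: string; type: string; payload: Record<string, unknown> },
  handlers: ReadonlyMap<string, FrameHandler>,
): RouteOutcome {
  const handler = handlers.get(frame.type);
  if (handler !== undefined) {
    // Handlers read only the fields they know; extra fields are ignored.
    handler(frame.payload);
    return { reply: null, close: false };
  }
  // Unknown frame type: report it, but keep the connection open.
  return {
    reply: {
      type: "error",
      in_reply_to: frame.id,
      payload: { code: "BAD_FRAME", message: `Unknown frame type: ${frame.type}` },
    },
    close: false,
  };
}
```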
SDK support
If you’re writing your agent with @svantic/sdk, none of the above
is your concern day-to-day — set deployment_mode: 'connected' at
registration and the SDK dials, authenticates, reconnects, and
resumes for you. Your capability handlers receive the same
CapabilitySessionContext they would on a hosted deployment, with
parent_trace_id, parent_span_id, and baggage already parsed from
the incoming traceparent and baggage headers.
This reference exists for:
- Teams writing their own client (e.g. a non-TypeScript agent).
- Debugging: reading
ws logs from the SDK and matching them to
protocol states.
- Compliance reviews that need the wire format documented
externally.