WebSocket API: GET /agents/connect

Persistent, agent-initiated WebSocket used by connected-mode agents to receive dispatches from the Svantic mesh. This is the only transport for connected-mode agents; see Agent Connectivity for the conceptual background. OpenAPI does not support WebSocket endpoints, so this reference lives as a Markdown page. The wire format is fully versioned under the subprotocol identifier svantic.v1 and is governed by the internal spec (platform/docs/specs/ws_transport.md, engineer-facing).

Endpoint

wss://api.svantic.com/agents/connect?instance_id=<instance_id>
  • Scheme: wss:// only. ws:// is rejected.
  • Query parameters:
    • instance_id (required) — the instance that registered with deployment_mode: connected. Registration happens first over HTTPS; only after POST /agents/register returns a connect_url does the WebSocket upgrade succeed.
  • Subprotocol: client must offer svantic.v1 in Sec-WebSocket-Protocol. The server echoes it back in the 101 response. Clients that offer no compatible subprotocol are rejected with 400 UNSUPPORTED_SUBPROTOCOL.

Authentication

Upgrade requests carry a tenant-scoped JWT in the standard bearer header:
Authorization: Bearer <jwt>
This is the same JWT used for POST /agents/register. Cookie authentication is not supported. The JWT’s tenant_id must match the tenant that owns the instance_id.
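The endpoint and authentication rules above reduce to three inputs for a hand-written client: the URL, the offered subprotocol, and the bearer header. The following is a minimal sketch rather than SDK code; `buildUpgradeRequest` and `UpgradeRequest` are hypothetical names, and how the headers are passed depends on your WebSocket library.

```typescript
// Hypothetical helper: collects what a custom client needs for the upgrade.
// Only the URL, subprotocol name, and header format come from this reference.
interface UpgradeRequest {
  url: string;
  protocols: string[];
  headers: Record<string, string>;
}

function buildUpgradeRequest(instanceId: string, jwt: string): UpgradeRequest {
  if (instanceId.length === 0) {
    // The server would answer 400 MISSING_INSTANCE_ID; fail early instead.
    throw new Error("instance_id is required");
  }
  return {
    // wss:// only; ws:// is rejected by the server.
    url: "wss://api.svantic.com/agents/connect?instance_id=" + encodeURIComponent(instanceId),
    // Sent as Sec-WebSocket-Protocol; the server echoes it in the 101 response.
    protocols: ["svantic.v1"],
    // Same tenant-scoped JWT that was used for POST /agents/register.
    headers: { Authorization: `Bearer ${jwt}` },
  };
}
```

Registration over HTTPS must already have succeeded for the instance before these values are used to dial.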

Handshake errors

The server runs the checks below, in order, before allocating a socket. Each failure returns a plain HTTP response (no 101 upgrade) with an application/json body.
| HTTP status | Code | When |
| --- | --- | --- |
| 400 | MISSING_INSTANCE_ID | Query parameter instance_id is absent or empty. |
| 400 | UNSUPPORTED_SUBPROTOCOL | Client did not offer svantic.v1. |
| 401 | UNAUTHORIZED | JWT is missing, malformed, signature-invalid, or expired. |
| 403 | TENANT_MISMATCH | JWT’s tenant does not own the requested instance_id. |
| 404 | INSTANCE_NOT_FOUND | No registered instance with that instance_id. |
| 409 | DEPLOYMENT_MODE_MISMATCH | Instance exists but was registered as hosted. Re-register as connected on a new instance_id. |
Example failure body:
{
  "error": "DEPLOYMENT_MODE_MISMATCH",
  "message": "Instance navigator-prod-01 is registered as 'hosted'; WebSocket upgrade requires 'connected'.",
  "status": 409
}
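A custom client can read the failure body before deciding what to do next. The sketch below parses the documented shape; `parseHandshakeError` and `nextStepFor` are hypothetical names, and the suggested next steps are an assumption derived from the table, not behavior specified by the server.

```typescript
// Shape of the JSON failure body shown above.
interface HandshakeError {
  error: string;
  message: string;
  status: number;
}

// Returns null when the text is not the documented shape.
function parseHandshakeError(body: string): HandshakeError | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(body);
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const o = parsed as Record<string, unknown>;
  const error = o["error"];
  const message = o["message"];
  const status = o["status"];
  if (typeof error !== "string" || typeof message !== "string" || typeof status !== "number") {
    return null;
  }
  return { error, message, status };
}

// Assumption: only UNAUTHORIZED is worth retrying (with a fresh JWT);
// every other code points at configuration that a retry will not fix.
function nextStepFor(err: HandshakeError): "refresh_token" | "fix_configuration" {
  return err.error === "UNAUTHORIZED" ? "refresh_token" : "fix_configuration";
}
```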

Lifecycle

After 101 Switching Protocols the client must send a hello frame as its first text frame. The server replies with welcome; the connection is “live” (the client is in the ready state) only after the client observes welcome. The mesh does not push dispatch frames before then.
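The ordering rule above can be modeled as a small state machine on the client. This is an illustrative sketch: only the ready state is named by this reference, so the other state and event names are hypothetical.

```typescript
// Hypothetical client-side states; only "ready" is named by this reference.
type ConnState = "upgraded" | "hello_sent" | "ready" | "failed";
type ConnEvent = "send_hello" | "recv_welcome" | "recv_dispatch";

function nextState(state: ConnState, event: ConnEvent): ConnState {
  // hello must be the first text frame after the 101 response.
  if (state === "upgraded" && event === "send_hello") return "hello_sent";
  // The connection is live only once welcome has been observed.
  if (state === "hello_sent" && event === "recv_welcome") return "ready";
  // Dispatches are only expected in the ready state.
  if (state === "ready" && event === "recv_dispatch") return "ready";
  // Anything else is out of order, e.g. a dispatch before welcome.
  return "failed";
}
```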

Frame envelope

Every frame is a UTF-8 JSON text frame. Binary frames are reserved for future use; if the server receives one it closes the socket with code 1003 Unsupported Data.
{
  "v": 1,
  "type": "dispatch",
  "id": "0199e3b5-7d8c-7a10-9a1c-ff65e2b3c0de",
  "ts": "2026-04-17T13:41:22.814Z",
  "in_reply_to": null,
  "trace_id": null,
  "parent_span_id": null,
  "payload": { }
}
  • v — protocol version literal. Always 1 for svantic.v1. A breaking change ships under a new subprotocol identifier.
  • type — see the frame catalog below.
  • id — sender-assigned UUID v7. Used for correlation; response frames set in_reply_to to the request’s id.
  • ts — sender wall-clock in ISO 8601 / RFC 3339.
  • in_reply_to — required on response frames (dispatch_result, dispatch_chunk, dispatch_ack, tool_result, pong); otherwise omitted or null.
  • trace_id, parent_span_id — optional envelope-level W3C trace context for per-frame routing telemetry. Note that trace context for your business logic arrives inside the dispatch payload, at payload.session_context.propagation_headers (W3C traceparent + baggage).
  • payload — type-specific, documented per frame below.
Frames that fail schema validation receive an error frame (code: BAD_FRAME) and the server closes the socket with code 1002 Protocol Error.
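To make the envelope rules concrete, here is a minimal sketch of a client-side check that covers only the fields described above. The real schema lives in the internal spec and is likely stricter; `Frame`, `RESPONSE_TYPES`, and `parseFrame` are hypothetical names.

```typescript
// Frame types that must carry in_reply_to, per the field list above.
const RESPONSE_TYPES: string[] = [
  "dispatch_result", "dispatch_chunk", "dispatch_ack", "tool_result", "pong",
];

interface Frame {
  v: 1;
  type: string;
  id: string;
  ts: string;
  in_reply_to?: string | null;
  trace_id?: string | null;
  parent_span_id?: string | null;
  payload: Record<string, unknown>;
}

// Returns the parsed frame, or a string explaining why it would be rejected.
function parseFrame(text: string): Frame | string {
  let raw: unknown;
  try {
    raw = JSON.parse(text);
  } catch {
    return "not valid JSON";
  }
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    return "frame must be a JSON object";
  }
  const f = raw as Record<string, unknown>;
  if (f["v"] !== 1) return "v must be 1 for svantic.v1";
  const type = f["type"];
  if (typeof type !== "string") return "type must be a string";
  if (typeof f["id"] !== "string") return "id must be a string";
  if (typeof f["ts"] !== "string") return "ts must be a string";
  const payload = f["payload"];
  if (typeof payload !== "object" || payload === null) return "payload must be an object";
  if (RESPONSE_TYPES.indexOf(type) >= 0 && typeof f["in_reply_to"] !== "string") {
    return "in_reply_to is required on response frames";
  }
  // Unknown extra fields are deliberately left in place and ignored.
  return f as unknown as Frame;
}
```

Validating outbound frames with the same function before sending is a cheap way to avoid a BAD_FRAME error and the 1002 close that follows it.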

Frame catalog

hello — agent → mesh

First frame after upgrade. Announces the agent and optionally asks to resume.
{
  "v": 1,
  "type": "hello",
  "id": "…",
  "ts": "…",
  "payload": {
    "instance_id": "navigator-prod-01",
    "agent_type": "navigator",
    "agent_version": "1.4.2",
    "sdk_version": "@svantic/sdk@0.12.0",
    "agent_card": { },
    "resume_token": null
  }
}

welcome — mesh → agent

Acknowledges hello. Transitions the client to ready.
{
  "v": 1,
  "type": "welcome",
  "id": "…",
  "ts": "…",
  "in_reply_to": "<hello.id>",
  "payload": {
    "resumed": false,
    "server_time": "2026-04-17T13:41:23.001Z",
    "replayed_dispatches": []
  }
}
When resumed: true, replayed_dispatches lists the dispatch.id values the server is re-delivering on this reconnect.

dispatch — mesh → agent

A single A2A task to execute. payload is byte-identical to the body the mesh would POST to a hosted agent for the same operation — so the same handler code works on both transports.
{
  "type": "dispatch",
  "payload": {
    "skill_id": "lookup_ticket",
    "args": { "ticket_id": 42 },
    "session_context": {
      "session_id": "sess-abc",
      "tenant_id": "tenant-1",
      "propagation_headers": {
        "traceparent": "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01",
        "baggage": "svantic.session_id=sess-abc,svantic.tenant_id=tenant-1,svantic.invocation_id=deadbeefcafef00d"
      }
    },
    "deadline_ms": 1713361323000
  }
}
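Clients that do not use the SDK have to read the trace context out of payload.session_context.propagation_headers themselves. The sketch below parses the two W3C headers shown in the example; `parseTraceparent` and `parseBaggage` are hypothetical helper names, and the parser only handles the four-field traceparent layout used by version 00.

```typescript
interface TraceParent {
  version: string;
  traceId: string;
  parentSpanId: string;
  sampled: boolean;
}

// Parses "version-traceid-parentid-flags" (lowercase hex, W3C Trace Context).
function parseTraceparent(header: string): TraceParent | null {
  const m = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(header.trim());
  if (m === null) return null;
  const version = m[1] ?? "";
  const traceId = m[2] ?? "";
  const parentSpanId = m[3] ?? "";
  const flags = m[4] ?? "";
  // Version ff and all-zero IDs are invalid per the W3C specification.
  if (version === "ff" || /^0+$/.test(traceId) || /^0+$/.test(parentSpanId)) return null;
  return { version, traceId, parentSpanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

// Parses "k1=v1,k2=v2"; optional ";property" suffixes are dropped.
// decodeURIComponent throws on malformed percent-encoding; callers may want to catch that.
function parseBaggage(header: string): Record<string, string> {
  const out: Record<string, string> = {};
  for (const member of header.split(",")) {
    const kv = member.split(";")[0] ?? "";
    const eq = kv.indexOf("=");
    if (eq <= 0) continue; // skip empty or keyless members
    out[kv.slice(0, eq).trim()] = decodeURIComponent(kv.slice(eq + 1).trim());
  }
  return out;
}
```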

dispatch_ack — agent → mesh (optional)

Optional “received and working” signal. If the server does not receive a terminal response (result or error) before payload.deadline_ms, the dispatch times out regardless of whether an ack was sent.

dispatch_chunk — agent → mesh

A streaming output chunk. Multiple chunks, in order, may precede a dispatch_result. Receivers must preserve chunk ordering per in_reply_to group.
{
  "type": "dispatch_chunk",
  "in_reply_to": "<dispatch.id>",
  "payload": { "delta": "…" }
}

dispatch_result — agent → mesh

Terminal success for a dispatch.
{
  "type": "dispatch_result",
  "in_reply_to": "<dispatch.id>",
  "payload": { "result": { } }
}
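A streamed reply is therefore a run of dispatch_chunk frames followed by one dispatch_result, all pointing at the same dispatch. The sketch below builds that sequence in order. `replyFrames` is a hypothetical helper, and `newId` stands in for a UUID v7 generator that the caller must supply (none is shown here).

```typescript
interface OutFrame {
  v: 1;
  type: string;
  id: string;
  ts: string;
  in_reply_to: string;
  payload: Record<string, unknown>;
}

// Builds the ordered frames for one dispatch: zero or more chunks, then the result.
// newId: caller-supplied UUID v7 generator (assumption; not defined by this sketch).
function replyFrames(
  dispatchId: string,
  deltas: string[],
  result: Record<string, unknown>,
  newId: () => string,
  now: () => Date = () => new Date(),
): OutFrame[] {
  const make = (type: string, payload: Record<string, unknown>): OutFrame => ({
    v: 1,
    type,
    id: newId(),
    ts: now().toISOString(),
    in_reply_to: dispatchId, // correlates every frame with the original dispatch
    payload,
  });
  // Array order is send order, which preserves chunk ordering per dispatch.
  const frames = deltas.map((delta) => make("dispatch_chunk", { delta }));
  frames.push(make("dispatch_result", { result }));
  return frames;
}
```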

tool_call / tool_result

When one agent invokes another agent’s tool, the request is routed through the mesh. tool_call has the same payload shape as a hosted tool invocation; tool_result is the reply.

heartbeat — agent → mesh

Self-reported presence, load, and health. Carries the same HeartbeatPayload as the HTTP POST /agents/heartbeat endpoint — hosted agents heartbeat over HTTP, connected agents heartbeat over this frame, but the payload bytes are identical and the gateway stores them in the same row.
{
  "type": "heartbeat",
  "payload": {
    "status": "available",
    "current_sessions": 3,
    "max_concurrent_sessions": 16,
    "consecutive_failures": 0
  }
}
Cadence: every 30 s (the shared HEARTBEAT_INTERVAL_MS constant), or immediately on any status change.
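The cadence rule can be expressed as a single predicate. This is a sketch under stated assumptions: the interface lists only the fields visible in the example (the real HeartbeatPayload may carry more), and `heartbeatDue` is a hypothetical name.

```typescript
// Mirrors the shared constant named above: 30 s.
const HEARTBEAT_INTERVAL_MS = 30000;

// Only the fields shown in the example frame; the real type may have more.
interface HeartbeatPayload {
  status: string;
  current_sessions: number;
  max_concurrent_sessions: number;
  consecutive_failures: number;
}

// True when a heartbeat frame should be sent now: either the status changed
// since the last heartbeat, or a full interval has elapsed.
function heartbeatDue(
  lastSentAtMs: number,
  lastSent: HeartbeatPayload,
  nowMs: number,
  current: HeartbeatPayload,
): boolean {
  return current.status !== lastSent.status || nowMs - lastSentAtMs >= HEARTBEAT_INTERVAL_MS;
}
```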

ping / pong

Application-level keepalive. Independent of the WebSocket protocol’s own ping/pong — both run in parallel to catch different failure modes.

error

Out-of-band error. Carries a stable machine-readable code the client can branch on. An error frame does not close the socket by itself; the server closes it separately when it decides it must.
{
  "type": "error",
  "in_reply_to": null,
  "payload": {
    "code": "RATE_LIMITED",
    "message": "Frame rate exceeded.",
    "detail": { "retry_after_ms": 1000 }
  }
}

close_request

Graceful shutdown request. The sender promises no new dispatch frames; in-flight dispatches continue up to payload.grace_seconds (default 30 s). After the grace window, the sender closes the socket with code 1000 Normal Closure.

Error frame catalog

payload.code values are stable. New codes may be added; existing codes never change meaning.
| Code | Direction | Meaning |
| --- | --- | --- |
| BAD_FRAME | server → client | Frame failed schema validation. The server closes the connection with 1002 immediately after. |
| AUTH_EXPIRED | server → client | JWT expired mid-connection. Client must reconnect with a fresh token. |
| AGENT_DISCONNECTED | server → client | The dispatch targeted an instance whose socket is gone. The caller receives the same error. |
| RATE_LIMITED | server → client | Frame rate or concurrent dispatches exceeded. detail.retry_after_ms is an advisory backoff hint. |
| INTERNAL | either | Unrecoverable internal error. Should be rare; always safe to reconnect. |
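Because the codes are stable, a client can branch on them with a plain switch. The mapping below is one possible reading of the catalog, not prescribed behavior: `actionForError`, the action names, and the 1000 ms fallback delay are all assumptions made for illustration.

```typescript
type ErrorAction =
  | { kind: "await_close" }
  | { kind: "reconnect"; freshToken: boolean }
  | { kind: "fail_dispatch" }
  | { kind: "backoff"; delayMs: number }
  | { kind: "log" };

function actionForError(code: string, detail: { retry_after_ms?: number } = {}): ErrorAction {
  switch (code) {
    case "BAD_FRAME":
      // The server closes with 1002 right after; nothing more to send.
      return { kind: "await_close" };
    case "AUTH_EXPIRED":
      return { kind: "reconnect", freshToken: true };
    case "AGENT_DISCONNECTED":
      return { kind: "fail_dispatch" };
    case "RATE_LIMITED":
      // retry_after_ms is advisory; the 1000 ms fallback is an assumption.
      return { kind: "backoff", delayMs: detail.retry_after_ms ?? 1000 };
    case "INTERNAL":
      return { kind: "reconnect", freshToken: false };
    default:
      // New codes may be added over time; unknown codes must not crash the client.
      return { kind: "log" };
  }
}
```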

Heartbeats

| Mechanism | Cadence | Initiator | Purpose |
| --- | --- | --- | --- |
| WebSocket protocol ping/pong | 30 s | server | TCP-level keepalive. |
| App-level ping / pong | 30 s, 15 s offset | server | Detects wedged agent event loops. |
| heartbeat | 30 s or on change | agent | Self-reported presence, load, and health. |
Dead-peer timeout. Three missed app-level pings (~90 seconds) cause the server to close the socket with code 1001 Going Away. Clients should reconnect; the SDK handles this automatically.
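The dead-peer rule is enforced by the server, but a custom client may want to model it (or apply the same rule in the other direction). The following sketch counts unanswered app-level pings. `DeadPeerDetector` is a hypothetical class, and treating a ping as missed when the next tick arrives without a pong is an assumption about how misses are counted.

```typescript
const MAX_MISSED_PINGS = 3;

class DeadPeerDetector {
  private missed = 0;
  private awaitingPong = false;

  // Call once per ping interval, just before sending the next ping.
  // Returns true when the peer should be treated as dead (close with 1001).
  onPingTick(): boolean {
    if (this.awaitingPong) this.missed += 1; // previous ping was never answered
    this.awaitingPong = true;
    return this.missed >= MAX_MISSED_PINGS;
  }

  // Call when a pong arrives; any pong clears the miss counter.
  onPong(): void {
    this.missed = 0;
    this.awaitingPong = false;
  }
}
```

With a 30 s interval, the fourth tick after three unanswered pings lands about 90 seconds after the first unanswered ping was sent, which matches the timeout described above.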

Reconnect semantics

  • Bounded exponential backoff: 1 s, 2 s, 4 s, 8 s, 16 s, 30 s (cap), with ±25% random jitter applied to each step.
  • On every successful reconnect the backoff resets.
  • If the client carries a resume_token in hello, the server attempts to re-bind pending dispatches:
    • Match: welcome.resumed = true, and welcome.replayed_dispatches lists the re-delivered IDs.
    • No match: welcome.resumed = false. Any dispatches that were in flight at the time of the disconnect have already failed on the original owner pod with AGENT_DISCONNECTED, and their callers have received the error.
  • Resume is pod-local. A reconnect that lands on a different pod (normal under scaling) cannot resume and starts clean.
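The backoff schedule above can be computed from the attempt number. This is a minimal sketch, assuming the jitter is uniform and is applied after the 30 s cap (so a capped step can land between 22.5 s and 37.5 s); `backoffDelayMs` and the constant names are hypothetical.

```typescript
const BACKOFF_BASE_MS = 1000;
const BACKOFF_CAP_MS = 30000;
const BACKOFF_JITTER = 0.25;

// attempt is 0-based: 0 -> 1 s, 1 -> 2 s, ... 4 -> 16 s, 5 and above -> 30 s (cap).
// random returns a value in [0, 1); it is injectable so the schedule can be tested.
function backoffDelayMs(attempt: number, random: () => number = Math.random): number {
  const base = Math.min(BACKOFF_CAP_MS, BACKOFF_BASE_MS * Math.pow(2, attempt));
  // Map [0, 1) onto a multiplier in [0.75, 1.25).
  const multiplier = 1 + BACKOFF_JITTER * (2 * random() - 1);
  return Math.round(base * multiplier);
}
```

A reconnect loop would reset its attempt counter to 0 after every successful reconnect, as the list above requires.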

Compatibility

  • v: 1 is fixed for the life of svantic.v1. Breaking changes ship as a new subprotocol (svantic.v2); both will be supported during the transition.
  • New optional fields may be added to existing payloads without a version bump. Receivers must ignore unknown fields.
  • New frame types may be added without a version bump. A receiver that does not recognize a frame type replies with an error frame (code: BAD_FRAME) but must not close the connection; the sender downgrades.

SDK support

If you’re writing your agent with @svantic/sdk, none of the above is your concern day-to-day: set deployment_mode: 'connected' at registration and the SDK dials, authenticates, reconnects, and resumes for you. Your capability handlers receive the same CapabilitySessionContext they would on a hosted deployment, with parent_trace_id, parent_span_id, and baggage already parsed from the incoming traceparent and baggage headers. This reference exists for:
  • Teams writing their own client (e.g. a non-TypeScript agent).
  • Debugging: reading ws logs from the SDK and matching them to protocol states.
  • Compliance reviews that need the wire format documented externally.