Telemetry
What it is
The SDK emits OpenTelemetry spans for every capability invocation, LLM call, and tool call it runs — no setup required on your side. Spans follow the OpenTelemetry GenAI semantic conventions (gen_ai.operation.name, gen_ai.request.model, gen_ai.usage.input_tokens, …), so they land cleanly in any OTEL-aware backend.
Three helpers are exported from the SDK for instrumenting your own code:
| Helper | Use for | Emits span |
|---|---|---|
| `trace_llm(meta, fn)` | Direct LLM calls (OpenAI, Anthropic, Bedrock, Vertex, Ollama, any) | `llm.<op> <model>` |
| `trace_tool(meta, fn)` | External calls (DB, HTTP, shell, MCP) | `tool.execute <name>` |
| `trace_step(name, fn)` | Arbitrary work blocks (planning, parsing, validation) | `step.<name>` |
- On the Svantic mesh (hosted or self-hosted): the mesh runtime installs a global `TracerProvider` at startup and ships all completed spans to the gateway. They show up in the dashboard's Traces and Usage views.
- Anywhere else: if the process has no global `TracerProvider`, the helpers become no-ops, with zero runtime cost and nothing to configure.
When to use it
In the common case, you don't. The SDK already traces:
- Every capability invocation (`execute_tool <capability_name>` spans with `gen_ai.tool.*` attributes)
- Every LLM call made by smart-agent mode (`call_llm <model>` spans with `gen_ai.request.model`, `gen_ai.usage.*`, `gen_ai.response.finish_reasons`)
- Every agent invocation in smart-agent mode (`invoke_agent <name>` spans with `gen_ai.conversation.id`, aggregated token totals)
API
trace_llm(meta, fn)
Wrap any LLM provider call so it shows up as a dedicated child span with the standard gen_ai.* attributes.
`fn` returns (or resolves to) `{ value, telemetry? }`. The helper attaches `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`, and `gen_ai.response.finish_reasons` from the `telemetry` object, then resolves the outer promise with `value` alone, so the caller never sees the telemetry wrapper.
Example (OpenAI):
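A minimal sketch of the callback contract described above, runnable without the SDK or an OpenAI key. The `trace_llm` here is a local stand-in, not the SDK's implementation, and the `meta` and `telemetry` field names (`operation`, `system`, `model`, `input_tokens`, `output_tokens`, `finish_reasons`) are assumptions chosen to mirror the attribute names. `fake_chat_completion` is a hypothetical stub that returns an OpenAI-style chat completion.

```typescript
// Hypothetical shapes: field names are assumptions, not the SDK's documented types.
type LlmMeta = { operation: string; system: string; model: string };
type LlmTelemetry = { input_tokens?: number; output_tokens?: number; finish_reasons?: string[] };
type Traced<T> = { value: T; telemetry?: LlmTelemetry };

// Attributes captured by the stand-in so the example can be inspected.
const recorded: Record<string, unknown>[] = [];

// Local stand-in for trace_llm: records attributes, then resolves with `value` alone.
async function trace_llm<T>(meta: LlmMeta, fn: () => Promise<Traced<T>>): Promise<T> {
  const { value, telemetry } = await fn();
  recorded.push({
    span_name: `llm.${meta.operation} ${meta.model}`,
    "gen_ai.system": meta.system,
    "gen_ai.request.model": meta.model,
    "gen_ai.usage.input_tokens": telemetry?.input_tokens,
    "gen_ai.usage.output_tokens": telemetry?.output_tokens,
    "gen_ai.response.finish_reasons": telemetry?.finish_reasons,
  });
  return value;
}

// Hypothetical stub with the shape of an OpenAI chat completion response.
async function fake_chat_completion(prompt: string) {
  return {
    choices: [{ message: { content: `echo: ${prompt}` }, finish_reason: "stop" }],
    usage: { prompt_tokens: 12, completion_tokens: 5 },
  };
}

// The caller gets back a plain string; token usage travels via `telemetry`.
async function ask(prompt: string): Promise<string> {
  return trace_llm({ operation: "chat", system: "openai", model: "gpt-4o-mini" }, async () => {
    const res = await fake_chat_completion(prompt);
    return {
      value: res.choices[0]?.message.content ?? "",
      telemetry: {
        input_tokens: res.usage.prompt_tokens,
        output_tokens: res.usage.completion_tokens,
        finish_reasons: res.choices.map((c) => c.finish_reason),
      },
    };
  });
}
```

With the real SDK, the stub would be replaced by an actual client call and the stand-in by the exported helper; the mapping from the provider's usage fields to `telemetry` is the part that carries over.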
trace_tool(meta, fn)
Wrap any tool/side-effect call.
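As a rough illustration, the sketch below wraps an in-memory lookup the way a database call might be wrapped. The `trace_tool` defined here is a local stand-in rather than the SDK's implementation. The `meta` fields (`name`, `kind`, `args`), the assumption that `fn` resolves to the value directly, and the `lookup_user` helper are all hypothetical.

```typescript
// Hypothetical meta shape, mirroring gen_ai.tool.name, svantic.tool.kind, svantic.tool.args.
type ToolMeta = { name: string; kind: string; args?: Record<string, unknown> };
type ToolSpan = { name: string; attributes: Record<string, unknown>; ok: boolean };

// Spans captured by the stand-in so the example can be inspected.
const tool_spans: ToolSpan[] = [];

// Local stand-in for trace_tool: records one span per call and rethrows failures.
async function trace_tool<T>(meta: ToolMeta, fn: () => Promise<T>): Promise<T> {
  const span: ToolSpan = {
    name: `tool.execute ${meta.name}`,
    attributes: {
      "gen_ai.operation.name": "execute_tool",
      "gen_ai.tool.name": meta.name,
      "svantic.tool.kind": meta.kind,
      "svantic.tool.args": JSON.stringify(meta.args ?? {}),
    },
    ok: true,
  };
  try {
    return await fn();
  } catch (err) {
    span.ok = false; // the span is marked failed, and the caller still sees the error
    throw err;
  } finally {
    tool_spans.push(span);
  }
}

// In-memory table standing in for a real database.
const users = new Map<string, string>([["u1", "Ada"]]);

async function lookup_user(id: string): Promise<string> {
  return trace_tool({ name: "users.lookup", kind: "db", args: { id } }, async () => {
    const name = users.get(id);
    if (name === undefined) throw new Error(`no user ${id}`);
    return name;
  });
}
```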
trace_step(name, fn)
Wrap arbitrary work so it shows up as step.<name> in the waterfall. Use to eliminate “unaccounted time” gaps.
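The sketch below shows one way to split a handler into named steps so that parsing and validation each get their own entry instead of appearing as a gap. The `trace_step` here is a local stand-in that only records a name and a duration; the real helper's behavior beyond the `step.<name>` span name is not assumed. `handle` is a hypothetical function.

```typescript
// Steps captured by the stand-in so the example can be inspected.
const steps: { name: string; duration_ms: number }[] = [];

// Local stand-in for trace_step: times the callback and records it as step.<name>.
async function trace_step<T>(name: string, fn: () => Promise<T>): Promise<T> {
  const started = Date.now();
  try {
    return await fn();
  } finally {
    // Recorded on success and on failure, so failed steps still show up.
    steps.push({ name: `step.${name}`, duration_ms: Date.now() - started });
  }
}

// Hypothetical handler: each phase is wrapped separately.
async function handle(raw: string): Promise<number[]> {
  const parsed = await trace_step("parse", async () => JSON.parse(raw) as unknown);
  return trace_step("validate", async () => {
    if (!Array.isArray(parsed) || !parsed.every((x) => typeof x === "number")) {
      throw new Error("expected an array of numbers");
    }
    return parsed as number[];
  });
}
```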
record_span_error(span, err)
For advanced callers who start their own spans via @opentelemetry/api: mark the span as failed in a way consistent with the helpers above (records the exception, sets status=ERROR, attaches error.message and error.type).
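To make that behavior concrete, here is a sketch that reimplements it against a minimal structural span type, so it runs without `@opentelemetry/api` installed. The `record_span_error` below is a local stand-in, not the SDK's code. `SpanLike` is a hypothetical subset of the OpenTelemetry `Span` interface, and using the error's `name` for `error.type` is an assumption.

```typescript
// Hypothetical subset of the OpenTelemetry Span interface used by this sketch.
interface SpanLike {
  recordException(err: Error): void;
  setStatus(status: { code: number; message?: string }): void;
  setAttribute(key: string, value: string): void;
  end(): void;
}

// Numeric value of SpanStatusCode.ERROR in @opentelemetry/api.
const STATUS_ERROR = 2;

// Stand-in: records the exception, sets ERROR status, attaches error.message and error.type.
function record_span_error(span: SpanLike, err: unknown): void {
  const error = err instanceof Error ? err : new Error(String(err));
  span.recordException(error);
  span.setStatus({ code: STATUS_ERROR, message: error.message });
  span.setAttribute("error.message", error.message);
  span.setAttribute("error.type", error.name); // assumption: the type is the error's name
}

// Typical call site: mark the span failed, rethrow, and always end the span.
function run_with_span<T>(span: SpanLike, work: () => T): T {
  try {
    return work();
  } catch (err) {
    record_span_error(span, err);
    throw err;
  } finally {
    span.end();
  }
}

// Fake span that logs what was called on it, for inspection.
function make_fake_span() {
  const log = {
    exceptions: [] as Error[],
    status_code: 0,
    attributes: {} as Record<string, string>,
    ended: false,
  };
  const span: SpanLike = {
    recordException: (e) => { log.exceptions.push(e); },
    setStatus: (s) => { log.status_code = s.code; },
    setAttribute: (k, v) => { log.attributes[k] = v; },
    end: () => { log.ended = true; },
  };
  return { span, log };
}
```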
Spans the SDK & mesh emit
| Span name | Emitted by | Operation | Key attributes |
|---|---|---|---|
| `execute_tool <name>` | SDK capability executor | capability invocation | `gen_ai.operation.name=execute_tool`, `gen_ai.tool.name`, `gen_ai.conversation.id`, `svantic.tenant.id` |
| `call_llm <model>` | SDK smart-agent loop | LLM call inside smart-agent mode | `gen_ai.operation.name=chat`, `gen_ai.system`, `gen_ai.request.model`, `gen_ai.request.temperature`, `gen_ai.request.max_tokens`, `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`, `gen_ai.response.finish_reasons`, `svantic.llm.iteration` |
| `invoke_agent <name>` | ADK (mesh side) | mesh agent turn | `gen_ai.operation.name=invoke_agent`, `gen_ai.agent.name`, `gen_ai.system`, `gen_ai.request.model`, `gen_ai.conversation.id` |
| `llm.chat <model>` | Mesh (ADK auto-instrumentation) | ADK LLM call | `gen_ai.operation.name=chat`, `gen_ai.system`, `gen_ai.request.model`, `gen_ai.usage.*`, `gen_ai.response.finish_reasons`, `svantic.source=adk.LlmAgent.callLlmAsync` |
| `llm.<op> <model>` | `trace_llm` | custom LLM call | `gen_ai.operation.name`, `gen_ai.system`, `gen_ai.request.model`, `gen_ai.usage.*`, `gen_ai.response.finish_reasons` |
| `tool.execute <name>` | `trace_tool` | custom tool call | `gen_ai.operation.name=execute_tool`, `gen_ai.tool.name`, `svantic.tool.kind`, `svantic.tool.args` |
| `step.<name>` | `trace_step` | custom work block | any attributes you pass |
Using your own OpenTelemetry backend
To send traces to Datadog, Honeycomb, Grafana Tempo, or any OTLP collector, configure a `TracerProvider` yourself at process startup, before creating any Agent:
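One possible setup, sketched with the standard OpenTelemetry JS packages (`@opentelemetry/sdk-trace-node`, `@opentelemetry/sdk-trace-base`, `@opentelemetry/exporter-trace-otlp-http`). It assumes a recent SDK version that accepts `spanProcessors` in the provider constructor, and a collector listening on the default OTLP/HTTP endpoint. The URL is a placeholder to replace with your backend's endpoint.

```typescript
import { NodeTracerProvider } from "@opentelemetry/sdk-trace-node";
import { BatchSpanProcessor } from "@opentelemetry/sdk-trace-base";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http";

// Assumed endpoint: the default OTLP/HTTP traces path on a local collector.
const exporter = new OTLPTraceExporter({ url: "http://localhost:4318/v1/traces" });

// Batch finished spans and hand them to the exporter.
const provider = new NodeTracerProvider({
  spanProcessors: [new BatchSpanProcessor(exporter)],
});

// Registers this provider as the process-wide global TracerProvider.
provider.register();

// Create your Agent only after this point, so its spans use the provider above.
```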
maybeSetOtelProviders, which is first-write-wins), but the agent-side provider is preserved if it’s the first one registered.
See also
- Telemetry guide: reading traces in the dashboard.
- Trace propagation: W3C `traceparent`/`baggage` headers across service boundaries.
