Knowledge Management
Svantic is a self-learning system. Every task it executes feeds back into a knowledge store, so the next task benefits from what was learned before. Over time, the system builds institutional intelligence — navigation strategies, error recovery patterns, site-specific quirks — that compounds with every execution.
The Self-Learning Pipeline
- Task arrives — a user or trigger sends a request to the mesh
- Agents execute — the orchestrator plans and agents run tools against real systems
- Critic reviews — a dedicated Critic agent evaluates the execution for quality, correctness, and completeness
- Learner extracts — a dedicated Learner agent analyzes the full execution trace (every tool call, every response, every retry) and distills reusable patterns
- Knowledge cards are created — the learner’s output becomes versioned, confidence-scored knowledge cards stored in the knowledge base
- RLAIF feedback — future executions that use a card report success or failure, adjusting the card’s confidence score (Laplace smoothing). Low-confidence or stale cards are automatically swept
- Retrieval injects context — when the next task arrives, the system performs semantic search against the knowledge base and injects relevant learnings into the agent’s prompt
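Put together, one pass through the pipeline might look like the following sketch. Everything here (the orchestrator, critic, and learner objects and their method names) is hypothetical and meant only to show how the stages hand off to each other, not Svantic's actual API:

```python
# Hypothetical sketch of one pass through the pipeline; none of these
# objects or method names are Svantic's actual API.
def handle_task(task, knowledge_base, orchestrator, critic, learner):
    # Retrieval injects context: semantic search over prior learnings.
    learnings = knowledge_base.search(task.description, top_k=5)

    # Agents execute: the orchestrator plans and runs tools, with
    # relevant learnings injected into the agent's prompt.
    trace = orchestrator.run(task, context=learnings)

    # Critic reviews the execution for quality and correctness.
    review = critic.evaluate(trace)

    # Learner extracts: the full trace is distilled into knowledge cards.
    for card in learner.extract(trace, review):
        knowledge_base.upsert(card)

    # RLAIF feedback: every card that was used reports an outcome.
    for card in learnings:
        knowledge_base.record_outcome(card, success=review.passed)

    return trace
```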
How Learning Happens
Learning occurs at two points during every task:
At Every Turn
During execution, the system observes the full context of each tool call — the inputs, outputs, errors, retries, and the decisions the orchestrator made. When a tool fails and the agent recovers (retries with different parameters, falls back to an alternative approach), that recovery path is captured as a pattern. When a tool succeeds on the first attempt, the working parameters and sequence are noted. This turn-level observation means the system doesn’t just learn from final outcomes — it learns from the intermediate steps, including the ones that didn’t work.
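As a concrete illustration, a turn-level observation could be as simple as one record per tool call. The structure below is an assumption for illustration, not Svantic's internal schema:

```python
# Illustrative turn-level observation record; this structure is an
# assumption, not Svantic's internal schema.
from dataclasses import dataclass

@dataclass
class TurnObservation:
    tool: str
    params: dict
    error: str | None = None            # set when the call failed
    recovered_with: dict | None = None  # the retry parameters that worked

observations = [
    # A failure followed by a successful retry becomes a recovery pattern.
    TurnObservation(tool="fetch_page", params={"timeout": 5},
                    error="ReadTimeout", recovered_with={"timeout": 30}),
    # A first-attempt success: the working parameters are simply noted.
    TurnObservation(tool="extract_table", params={"format": "csv"}),
]

# Intermediate steps that failed but recovered are kept, not discarded.
recovery_patterns = [o for o in observations if o.recovered_with]
```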
At Session End
Once a task completes, the Learner agent receives the full execution trace and distills it into reusable knowledge. It identifies what worked, what failed, and what could be done differently. The output is structured into two scopes:
- Scope-specific — tied to a particular target (a domain, service, API, or integration endpoint). These learnings are keyed by scope so they’re retrieved only when future tasks interact with the same target.
- General — workflow-level patterns that transfer across contexts. These capture tool selection strategies, sequencing heuristics, error handling approaches, and other insights that apply regardless of the specific target.
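A rough sketch of the two scopes, using the site:<domain> key format shown in the next section; the store layout and helper function are assumptions:

```python
# Illustrative two-scope store; the layout and helper are assumptions.
learnings = {
    # Scope-specific: retrieved only for tasks touching this target.
    "site:portal.example.com": [
        "Login form sits behind a cookie-consent modal; dismiss it first.",
    ],
    # General: workflow-level patterns that transfer across contexts.
    "general": [
        "Prefer one bulk export over per-row downloads when available.",
    ],
}

def context_for(scope_key: str) -> list[str]:
    # Scope-specific learnings plus the general ones that always apply.
    return learnings.get(scope_key, []) + learnings["general"]
```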
Knowledge Cards
Knowledge is stored as cards — the atomic unit of learning in Svantic. Each card is:
- Scope-keyed — tied to a domain, workflow, or general pattern (e.g. site:ecams.geico.com, workflow:pdf-extraction)
- Versioned — every update increments the version, so you can track how knowledge evolves
- Confidence-scored — starts at a neutral score and adjusts with outcomes using Laplace smoothing: confidence = (successes + 1) / (successes + failures + 2)
- Merge-aware — when the learner runs, it fetches existing cards for the same scope. If nothing changed, it emits UNCHANGED and skips the write. If the card needs updating, it merges the new observations with existing content
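A minimal sketch of such a card in Python, assuming hypothetical field names; the confidence property is exactly the Laplace smoothing formula above:

```python
# Minimal knowledge-card sketch; field names are assumptions, but the
# confidence property implements the Laplace smoothing formula above.
from dataclasses import dataclass

@dataclass
class KnowledgeCard:
    scope: str      # e.g. "site:ecams.geico.com" or "workflow:pdf-extraction"
    content: str
    version: int = 1
    successes: int = 0
    failures: int = 0

    @property
    def confidence(self) -> float:
        # Laplace smoothing: a fresh card starts at a neutral 0.5.
        return (self.successes + 1) / (self.successes + self.failures + 2)

card = KnowledgeCard(scope="workflow:pdf-extraction",
                     content="Rasterize scanned pages before OCR.")
assert card.confidence == 0.5   # neutral prior before any outcomes
```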
Card Lifecycle
Outcome Feedback (RLAIF)
Every time an agent retrieves and uses a knowledge card during execution, the outcome is recorded:
- Success — the task completed correctly, the card’s advice was useful → confidence increases
- Failure — the task failed or the card’s advice was wrong → confidence decreases
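Continuing the KnowledgeCard sketch above, outcome recording reduces to two counters. The record_outcome helper is hypothetical, but the arithmetic matches the formula:

```python
# Hypothetical helper on top of the KnowledgeCard sketch above; the
# arithmetic is the Laplace formula from the previous section.
def record_outcome(card: KnowledgeCard, success: bool) -> None:
    if success:
        card.successes += 1   # useful advice, confidence rises
    else:
        card.failures += 1    # wrong advice, confidence falls

record_outcome(card, success=True)
assert card.confidence == 2 / 3   # (1 + 1) / (1 + 0 + 2)
record_outcome(card, success=False)
assert card.confidence == 1 / 2   # (1 + 1) / (1 + 1 + 2)
```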
Stale Sweep
Cards that haven’t been updated or validated within a configurable TTL are automatically cleaned up. This prevents the knowledge base from accumulating outdated information — if a website redesigns its portal, the old navigation learnings naturally expire.
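A sweep could be as simple as a TTL filter. The field name and TTL value below are assumptions; only the expiry behavior comes from the description above:

```python
# Illustrative TTL sweep; the field name and TTL value are assumptions.
import time

STALE_TTL_SECONDS = 30 * 24 * 3600   # e.g. 30 days; configurable

def sweep_stale(cards: list[dict], now: float | None = None) -> list[dict]:
    """Keep only cards updated or validated within the TTL."""
    if now is None:
        now = time.time()
    return [c for c in cards
            if now - c["last_validated_at"] <= STALE_TTL_SECONDS]

fresh = sweep_stale([
    {"scope": "site:portal.example.com", "last_validated_at": time.time()},
    {"scope": "site:old.example.com", "last_validated_at": 0.0},  # long stale
])
assert len(fresh) == 1   # the expired card is swept
```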
How Retrieval Works
When a task arrives, Svantic queries the knowledge base before the agent starts executing. The retrieval system uses several strategies:
Semantic Search
The query is converted to a vector embedding and compared against all card embeddings using cosine similarity. The top-k most relevant results are returned with relevance scores.
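In stdlib Python, the core of this step is a few lines. The card layout is an assumption; the cosine similarity and top-k ranking are as described:

```python
# Semantic search core: cosine similarity plus top-k ranking.
# query_vec is the query's embedding; the card layout is assumed.
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec: list[float], cards: list[dict], k: int = 5):
    # Return (relevance score, card) pairs, most relevant first.
    scored = [(cosine(query_vec, c["embedding"]), c) for c in cards]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```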
Site Boost
For tasks targeting a known domain, retrieval allocates dedicated slots for site-specific cards. If a task targets portal.example.com, the system ensures site-specific learnings for that domain appear even if generic workflow learnings have higher raw similarity scores.
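One way to implement such a boost, sketched here with assumed slot counts and card layout:

```python
# Illustrative site boost: reserve dedicated slots for site-scoped cards
# so they surface even when generic cards score higher. Slot counts and
# card layout are assumptions.
def boost_site(ranked: list, domain: str, total: int = 5, site_slots: int = 2):
    # `ranked` is (score, card) pairs from top_k, best first.
    site_key = f"site:{domain}"
    site_hits = [p for p in ranked if p[1]["scope"] == site_key][:site_slots]
    others = [p for p in ranked if p not in site_hits]
    return (site_hits + others)[:total]
```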
Outcome Split
Retrieved results are separated into success and failure learnings. The agent’s prompt receives both:
- “What worked” — patterns from successful past executions
- “What failed” — patterns from failed attempts, so the agent avoids known pitfalls
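A sketch of how the split context might be assembled into the agent's prompt; the outcome field and section wording are assumptions:

```python
# Sketch of assembling the split context into the agent's prompt;
# the outcome field and section wording are assumptions.
def build_context(cards: list[dict]) -> str:
    worked = [c["content"] for c in cards if c["outcome"] == "success"]
    failed = [c["content"] for c in cards if c["outcome"] == "failure"]
    lines = ["What worked in past executions:"]
    lines += [f"- {item}" for item in worked]
    lines += ["What failed before (avoid these pitfalls):"]
    lines += [f"- {item}" for item in failed]
    return "\n".join(lines)
```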
The Compounding Effect
Deployment and Sharing
How knowledge flows depends on your deployment topology:
| Topology | Knowledge Scope | Sharing |
|---|---|---|
| Standalone | Local to the instance | Single brain, all knowledge in one place |
| Sidecar | Local to each pod | Each pod builds its own knowledge independently |
| Central + Sidecar | Shared across fleet | Sidecars contribute learnings upward; central distributes to all pods |
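To make the topologies concrete, here is a heavily simplified sketch of how card visibility could differ in each case; this is illustrative only, not Svantic's actual sync protocol:

```python
# Illustrative only: how card visibility could differ by topology.
# This is not Svantic's actual sync protocol.
def visible_cards(topology: str, local: list, central: list) -> list:
    if topology == "standalone":
        return local                 # single brain, everything in one place
    if topology == "sidecar":
        return local                 # each pod learns independently
    if topology == "central+sidecar":
        # Sidecars push learnings up; central distributes the merged set.
        return central + local
    raise ValueError(f"unknown topology: {topology}")
```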
