# Knowledge Management

Svantic is a **self-learning** system. Every task it executes feeds back into a knowledge store, so the next task benefits from what was learned before. Over time, the system builds institutional intelligence — navigation strategies, error recovery patterns, site-specific quirks — that compounds with every execution.

***

## The Self-Learning Pipeline

<img src="https://mintcdn.com/svantic/DQEVJ_RnVW5_l6gu/images/diagrams/knowledge-pipeline.svg?fit=max&auto=format&n=DQEVJ_RnVW5_l6gu&q=85&s=5e0549c79a3db248e26dc348ec497bb4" alt="Self-learning pipeline: Task → Execute → Critic → Learner → Knowledge Cards → Retrieval → back to execution" width="720" height="400" data-path="images/diagrams/knowledge-pipeline.svg" />

Every completed task flows through a structured learning pipeline:

1. **Task arrives** — a user or trigger sends a request to the mesh
2. **Agents execute** — the orchestrator plans and agents run tools against real systems
3. **Critic reviews** — a dedicated Critic agent evaluates the execution for quality, correctness, and completeness
4. **Learner extracts** — a dedicated Learner agent analyzes the full execution trace (every tool call, every response, every retry) and distills reusable patterns
5. **Knowledge cards are created** — the learner's output becomes versioned, confidence-scored knowledge cards stored in the knowledge base
6. **RLAIF feedback** — future executions that use a card report success or failure, adjusting the card's confidence score (Laplace smoothing). Low-confidence or stale cards are automatically swept
7. **Retrieval injects context** — when the next task arrives, the system performs semantic search against the knowledge base and injects relevant learnings into the agent's prompt

This loop runs automatically and needs no configuration: every completed task feeds it.

***

## How Learning Happens

Learning occurs at two points during every task:

### At Every Turn

During execution, the system observes the full context of each tool call — the inputs, outputs, errors, retries, and the decisions the orchestrator made. When a tool fails and the agent recovers (retries with different parameters, falls back to an alternative approach), that recovery path is captured as a pattern. When a tool succeeds on the first attempt, the working parameters and sequence are noted.

This turn-level observation means the system doesn't just learn from final outcomes — it learns from the intermediate steps, including the ones that didn't work.

### At Session End

Once a task completes, the Learner agent receives the full execution trace and distills it into reusable knowledge. It identifies what worked, what failed, and what could be done differently. The output is structured into two scopes:

* **Scope-specific** — tied to a particular target (a domain, service, API, or integration endpoint). These learnings are keyed by scope so they're retrieved only when future tasks interact with the same target.
* **General** — workflow-level patterns that transfer across contexts. These capture tool selection strategies, sequencing heuristics, error handling approaches, and other insights that apply regardless of the specific target.

The learner doesn't just append — it merges. If a card already exists for the same scope, the learner compares the new observations against existing knowledge, updates what changed, and leaves the rest intact.

***

## Knowledge Cards

Knowledge is stored as **cards** — the atomic unit of learning in Svantic. Each card is:

* **Scope-keyed** — tied to a domain, workflow, or general pattern (e.g. `site:ecams.geico.com`, `workflow:pdf-extraction`)
* **Versioned** — every update increments the version, so you can track how knowledge evolves
* **Confidence-scored** — starts at a neutral score and adjusts with outcomes using Laplace smoothing: `confidence = (successes + 1) / (successes + failures + 2)`
* **Merge-aware** — when the learner runs, it fetches existing cards for the same scope. If nothing changed, it emits `UNCHANGED` and skips the write. If the card needs updating, it merges the new observations with existing content
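
The confidence formula above can be sketched in a few lines of Python. This is a minimal illustration, not Svantic's implementation: the `CardStats` class and its `record` method are hypothetical names, and only the formula itself comes from this page.

```python
from dataclasses import dataclass


@dataclass
class CardStats:
    """Outcome counters for one knowledge card (hypothetical structure)."""

    successes: int = 0
    failures: int = 0

    def record(self, success: bool) -> None:
        """Count one reported outcome for this card."""
        if success:
            self.successes += 1
        else:
            self.failures += 1

    @property
    def confidence(self) -> float:
        # Laplace smoothing: (successes + 1) / (successes + failures + 2)
        return (self.successes + 1) / (self.successes + self.failures + 2)


stats = CardStats()
print(stats.confidence)  # 0.5 for a brand-new card with no outcomes

for outcome in (True, True, True, False):
    stats.record(outcome)
print(round(stats.confidence, 3))  # 0.667 after 3 successes and 1 failure
```

Because of the `+1` and `+2` terms, a new card starts at 0.5 and a single early failure cannot push the score to zero. The score moves toward the observed success rate as outcomes accumulate.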

### Card Lifecycle

```
New task completes → Learner runs → Existing card fetched
    ↓
Card exists?
    No  → Create new card (version 1, confidence 0.5)
    Yes → Compare with trace
            ↓
        Content changed?
            No  → UNCHANGED (skip write)
            Yes → UPDATE (merge + increment version)
    ↓
Future tasks use the card
    ↓
Success → confidence increases
Failure → confidence decreases
    ↓
Confidence too low → card flagged for review
No updates for too long → stale sweep removes it
```
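
The create, skip, or update decision in the diagram can be sketched as follows. This is an assumption-heavy illustration: the `Card` dataclass, the in-memory `store` dict, the `"CREATE"` label, and the line-based `merge_lines` helper are all hypothetical stand-ins. The real Learner merges observations with an agent, not a string comparison.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Card:
    scope: str
    content: str
    version: int = 1         # new cards start at version 1
    confidence: float = 0.5  # neutral starting confidence


def merge_lines(existing: str, observed: str) -> str:
    """Naive merge (placeholder): keep existing lines, append unseen ones."""
    lines = existing.splitlines()
    seen = set(lines)
    for line in observed.splitlines():
        if line not in seen:
            lines.append(line)
            seen.add(line)
    return "\n".join(lines)


def apply_learning(store: Dict[str, Card], scope: str, observed: str) -> str:
    """Return "CREATE", "UNCHANGED", or "UPDATE" for one learner run."""
    existing = store.get(scope)
    if existing is None:
        store[scope] = Card(scope=scope, content=observed)
        return "CREATE"
    merged = merge_lines(existing.content, observed)
    if merged == existing.content:
        return "UNCHANGED"  # nothing new was learned, so skip the write
    existing.content = merged
    existing.version += 1
    return "UPDATE"


store: Dict[str, Card] = {}
scope = "site:portal.example.com"
print(apply_learning(store, scope, "login requires SSO"))  # CREATE
print(apply_learning(store, scope, "login requires SSO"))  # UNCHANGED
print(apply_learning(store, scope, "export is under Reports"))  # UPDATE
```

The version only increments when the merged content actually differs, which matches the `UNCHANGED` branch in the diagram.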

### Outcome Feedback (RLAIF)

Every time an agent retrieves and uses a knowledge card during execution, the outcome is recorded:

* **Success** — the task completed correctly, the card's advice was useful → confidence increases
* **Failure** — the task failed or the card's advice was wrong → confidence decreases

This creates a reinforcement loop: cards that consistently help get higher confidence and appear more prominently in future retrievals. Cards that mislead get downranked or removed.

### Stale Sweep

Cards that haven't been updated or validated within a configurable TTL are automatically cleaned up. This prevents the knowledge base from accumulating outdated information — if a website redesigns its portal, the old navigation learnings naturally expire.
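
A TTL-based sweep of this kind might look like the sketch below. The `sweep_stale` function, the `last_validated` mapping, and the 30-day TTL are illustrative assumptions. This page does not specify how the TTL is configured or what its default is.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def sweep_stale(
    last_validated: Dict[str, datetime], ttl: timedelta, now: datetime
) -> List[str]:
    """Delete cards not updated or validated within `ttl`; return their scopes."""
    stale = [scope for scope, ts in last_validated.items() if now - ts > ttl]
    for scope in stale:
        del last_validated[scope]
    return stale


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
cards = {
    "site:portal.example.com": now - timedelta(days=45),  # older than the TTL
    "workflow:pdf-extraction": now - timedelta(days=3),   # recently validated
}
removed = sweep_stale(cards, ttl=timedelta(days=30), now=now)
print(removed)  # ['site:portal.example.com']
```

Passing `now` explicitly keeps the function deterministic and easy to test.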

***

## How Retrieval Works

When a task arrives, Svantic queries the knowledge base before the agent starts executing. The retrieval system uses several strategies:

### Semantic Search

The query is converted to a vector embedding and compared against all card embeddings using cosine similarity. The top-k most relevant results are returned with relevance scores.
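
As a rough illustration of cosine-similarity ranking, the sketch below scores a query vector against a handful of card embeddings and keeps the top-k. The tiny two-dimensional vectors and the `top_k` helper are assumptions made for readability. Real embeddings have many more dimensions, and a production system would typically use a vector index and not a linear scan.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as unrelated to everything
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float], cards: Dict[str, Sequence[float]], k: int
) -> List[Tuple[str, float]]:
    """Return the k cards most similar to the query, best first."""
    scored = [(card_id, cosine_similarity(query, emb)) for card_id, emb in cards.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy embeddings keyed by hypothetical card ids.
embeddings = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
results = top_k([1.0, 0.0], embeddings, k=2)
print([card_id for card_id, _ in results])  # ['a', 'c']
```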

### Site Boost

For tasks targeting a known domain, retrieval allocates dedicated slots for site-specific cards. If a task targets `portal.example.com`, the system ensures site-specific learnings for that domain appear even if generic workflow learnings have higher raw similarity scores.
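
One way to picture the reserved slots is shown below. The `Hit` tuple, the `site_slots` parameter, and the selection order are hypothetical. The page states only that site-specific cards are guaranteed a place in the results, not how many slots are reserved or how they are filled.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    card_id: str
    scope: str
    score: float  # raw similarity score from semantic search


def select_with_site_boost(
    hits: List[Hit], target_domain: str, k: int, site_slots: int
) -> List[Hit]:
    """Pick up to k hits, reserving up to `site_slots` for the target site."""
    ranked = sorted(hits, key=lambda h: h.score, reverse=True)
    site_scope = f"site:{target_domain}"
    reserved = [h for h in ranked if h.scope == site_scope][: min(site_slots, k)]
    reserved_ids = {h.card_id for h in reserved}
    rest = [h for h in ranked if h.card_id not in reserved_ids]
    return reserved + rest[: k - len(reserved)]


hits = [
    Hit("w1", "workflow:pdf-extraction", 0.9),
    Hit("w2", "workflow:login", 0.8),
    Hit("s1", "site:portal.example.com", 0.4),
]
selected = select_with_site_boost(hits, "portal.example.com", k=2, site_slots=1)
print([h.card_id for h in selected])  # ['s1', 'w1']
```

Without the reserved slot, a plain top-2 by score would return `w1` and `w2` and drop the site-specific card entirely.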

### Outcome Split

Retrieved results are separated into **success** and **failure** learnings. The agent's prompt receives both:

* **"What worked"** — patterns from successful past executions
* **"What failed"** — patterns from failed attempts, so the agent avoids known pitfalls

Injecting both tells the agent which patterns to repeat and which known pitfalls to steer clear of, something positive examples alone cannot convey.
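
The split could be rendered into prompt text along these lines. The `build_knowledge_context` function and the exact section layout are assumptions for illustration. Only the two labels, "What worked" and "What failed", come from this page.

```python
from typing import List, Tuple


def build_knowledge_context(learnings: List[Tuple[str, bool]]) -> str:
    """Format (text, succeeded) pairs into two labeled prompt sections."""
    worked = [text for text, succeeded in learnings if succeeded]
    failed = [text for text, succeeded in learnings if not succeeded]

    sections = []
    if worked:
        sections.append("What worked:\n" + "\n".join(f"- {t}" for t in worked))
    if failed:
        sections.append("What failed:\n" + "\n".join(f"- {t}" for t in failed))
    return "\n\n".join(sections)


context = build_knowledge_context(
    [
        ("Log in through the SSO button", True),
        ("Direct URL to the export page returns 403", False),
    ]
)
print(context)
```

An empty section is omitted, so the agent never sees a heading with nothing under it.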

***

## The Compounding Effect

<img src="https://mintcdn.com/svantic/DQEVJ_RnVW5_l6gu/images/diagrams/compounding-effect.svg?fit=max&auto=format&n=DQEVJ_RnVW5_l6gu&q=85&s=539cb94ce0db874fdb31ca6f4b463849" alt="Compounding effect: execution quality improves as knowledge accumulates over time" width="700" height="320" data-path="images/diagrams/compounding-effect.svg" />

Knowledge compounds across tasks and contexts. Scope-specific learnings from one integration help when the system encounters similar patterns elsewhere. Error recovery strategies generalize. Sequencing heuristics transfer. Each task adds to the store, and every future task benefits from everything that came before.

Early tasks explore. Later tasks execute with the accumulated intelligence of every prior execution — institutional knowledge that builds automatically, without manual curation.

***

## Deployment and Sharing

How knowledge flows depends on your [deployment topology](/concepts/deployment-models):

| Topology              | Knowledge Scope       | Sharing                                                               |
| --------------------- | --------------------- | --------------------------------------------------------------------- |
| **Standalone**        | Local to the instance | Single brain, all knowledge in one place                              |
| **Sidecar**           | Local to each pod     | Each pod builds its own knowledge independently                       |
| **Central + Sidecar** | Shared across fleet   | Sidecars contribute learnings upward; central distributes to all pods |

In the Central + Sidecar topology, what Pod A learned on Monday is available to Pod C on Tuesday. This creates fleet-wide institutional intelligence — the collective experience of every agent in every pod, accessible to all.
