Documentation

Everything you need to integrate Silo into your project, configure CI workflows, run evaluations, and work with the API.

Documentation hub

Set up, operate, and troubleshoot Silo with fewer surprises.

This guide is organized for real production use: how the repo is structured, how sync and evaluation work, and how the app behaves when the backend or database is degraded.

Fast start

Use the quickstart sections if you need the shortest path from install to first run.

Ops aware

Maintenance mode and degraded database behavior are documented with the exact health endpoints.

CLI aligned

Every major workflow now calls out the matching CLI path, not just the HTTP API.

Recommended reading path

1. Overview and API keys
2. CLI plus repository layout
3. Workflow and evaluation config
4. Operations and health endpoints
5. API reference and troubleshooting

Start here

What Silo does


Silo compares an accepted baseline prompt against a candidate version on stored test cases, scores the result across quality, safety, format, task accuracy, and speed, and stores enough observability data to make regressions explainable instead of mysterious.

Keep prompt files in silo/prompts/ inside your repo.
Sync changes from the dashboard, silo sync, or scripts/silo_git_sync.py.
Run drift on stored test cases, not generated-on-the-fly cases.
Use the dashboard for diagnostics, historical runs, and API key management.
Typical flow
Prompt change -> sync/create version -> run evaluation -> review diagnostics -> gate CI or accept baseline

Authentication

API keys and bearer auth


Silo supports Cognito JWTs for user sessions and silo_... user API keys for automation. API keys are intended for CLI usage, CI, and local scripts.

Mode | Best for | Notes
Cognito JWT | Dashboard sessions | Required for creating, listing, and revoking API keys
User API key | CLI, CI, local tooling | Use as `Authorization: Bearer silo_...`
Environment
export SILO_API_BASE_URL="https://silo-siix.onrender.com"
export SILO_API_KEY="silo_abc123..."
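Outside the CLI, any script can authenticate the same way by sending the key as a bearer token. A minimal Python sketch using only the standard library (the request targets /api/suites from the API reference below, and is only constructed here, not sent):

```python
import os
import urllib.request

# Read the same environment variables the CLI uses.
base = os.environ.get("SILO_API_BASE_URL", "https://silo-siix.onrender.com")
key = os.environ.get("SILO_API_KEY", "silo_example_key")

# Build an authenticated request against the suites endpoint.
req = urllib.request.Request(
    f"{base}/api/suites",
    headers={"Authorization": f"Bearer {key}"},
)
print(req.get_header("Authorization"))  # Bearer silo_...
```

Pass the same header to `curl -H` or any HTTP client; the API treats all bearer-authenticated callers identically.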

Key visibility is one-time

Copy a newly created API key immediately. The dashboard will not show the raw secret again after creation.

Tooling

CLI package


The official CLI is published as silo-drift-cli, and the installed command name is silo. Use it to sync prompt suites, run drift checks, and connect your repo or CI to your already deployed Silo workspace.

Install and verify
npm install -g silo-drift-cli
silo login --url "$SILO_API_BASE_URL" --key "$SILO_API_KEY"
silo doctor
silo whoami

The CLI talks to the Silo API only, so you can install it and start using the hosted app right away with your workspace URL and API key.

Website: silo-frontend.onrender.com

npm package: npmjs.com/package/silo-drift-cli

Production backend API: silo-siix.onrender.com

Contributors can run the CLI from this repo through packages/cli with npm install, npm run build, and then either npm link or node dist/index.js.

Repo layout

Repository layout


The quickest way to scaffold a compatible repository is silo setup. That creates the canonical silo/ directory and a starter GitHub Actions workflow.

Expected structure
your-repo/
|-- silo/
|   |-- silo.yml
|   \-- prompts/
|       \-- <suite-name>/
|           |-- prompt-a.prompt
|           \-- prompt-b.prompt
\-- .github/
    \-- workflows/
        \-- silo-prompt-pipeline.yml
Only *.prompt files under silo/prompts/ are synced.
Each prompt folder maps to a suite and can be auto-created during sync.
The scaffolded workflow is a starting point; customize it for your repo conventions.
You still need to define suites and prompt metadata in silo/silo.yml.
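The first rule above (only *.prompt files under silo/prompts/ are synced) can be mirrored locally before pushing. A small Python sketch with illustrative file names, not part of the Silo tooling itself:

```python
import tempfile
from pathlib import Path

def synced_prompts(repo: Path) -> list[Path]:
    """Collect exactly the files sync would pick up:
    *.prompt files anywhere under silo/prompts/."""
    return sorted((repo / "silo" / "prompts").rglob("*.prompt"))

# Demo against a throwaway repo layout.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    suite = repo / "silo" / "prompts" / "support"
    suite.mkdir(parents=True)
    (suite / "refund.prompt").write_text("You are a support agent...")
    (suite / "notes.txt").write_text("ignored: not a .prompt file")
    found = synced_prompts(repo)
    print([p.name for p in found])  # ['refund.prompt']
```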

Configuration

silo.yml configuration


silo/silo.yml is the source of truth for suite definitions, defaults, and per-prompt metadata.

silo/silo.yml
suites: {}

suite_definitions:
  support:
    name: "Support"
    description: "Customer support prompts"

defaults:
  default_suite_model: "gpt-4o-mini"
  context_threshold: 0.85
  max_speed_regression_pct: 20.0

prompt_meta:
  support/refund:
    format_rules:
      - required_fields: ["intent", "answer", "confidence"]
    test_cases:
      - input_text: "Can I return damaged shoes?"
        expected_output: '{"intent":"refund","answer":"Start a return...","confidence":0.9}'
Field | Purpose
suites | Sync cache of known suite ids
suite_definitions | Names, descriptions, and suite-level model settings
defaults | Global thresholds and shared run defaults
prompt_meta | Per-prompt-key metadata including rules and test cases
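prompt_meta entries are keyed by `<suite>/<prompt>` strings such as support/refund above. A hypothetical helper for splitting and validating those keys in your own tooling (not part of the CLI):

```python
def parse_prompt_key(key: str) -> tuple[str, str]:
    """Split a prompt_meta key like 'support/refund' into (suite, prompt).
    The '<suite>/<prompt>' shape is inferred from the example config."""
    suite, sep, prompt = key.partition("/")
    if not sep or not suite or not prompt:
        raise ValueError(f"expected '<suite>/<prompt>', got {key!r}")
    return suite, prompt

print(parse_prompt_key("support/refund"))  # ('support', 'refund')
```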

Automation

GitHub Actions workflow


Silo works best when prompt sync and evaluation happen automatically on push and PR events. The dashboard and CLI are excellent for iteration, but CI is where regressions get caught early.

.github/workflows/silo-prompt-pipeline.yml
name: Prompt pipeline

on:
  push:
    branches: [main]
    paths:
      - "silo/prompts/**"
      - "silo/silo.yml"

jobs:
  sync-and-evaluate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          # Fetch two commits so HEAD~1 exists for the sync step's --base diff.
          fetch-depth: 2
      - uses: actions/setup-node@v4
        with:
          node-version: "20"
      - run: npm install -g silo-drift-cli
      - run: silo sync --repo . --base HEAD~1 --head HEAD
        env:
          SILO_API_BASE_URL: ${{ secrets.SILO_API_BASE_URL }}
          SILO_API_KEY: ${{ secrets.SILO_API_KEY }}

Secrets model

Use repository secrets for SILO_API_BASE_URL and SILO_API_KEY. Fork PRs do not receive those secrets, so design the workflow to skip safely for forked contributions.
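One common way to encode that skip, sketched against the workflow above (the `if:` expression is a general GitHub Actions pattern, not Silo-specific; adapt it to your triggers):

```yaml
jobs:
  sync-and-evaluate:
    # Run on direct pushes, and on PRs only when the head repo is not a fork.
    if: github.event_name == 'push' || github.event.pull_request.head.repo.full_name == github.repository
    runs-on: ubuntu-latest
```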

Local dev

Local sync and dry runs


You can run the same sync flow locally before opening a PR. That keeps prompt iteration fast and reduces noisy CI runs.

Local sync
export SILO_API_BASE_URL="https://silo-siix.onrender.com"
export SILO_API_KEY="silo_your_key_here"

silo sync --repo . --base HEAD~1 --head HEAD
silo sync --repo . --base HEAD~1 --head HEAD --dry-run --json

If you prefer the existing Python sync path, scripts/silo_git_sync.py remains available and still maps to the same API surface.

Run behavior

Evaluation behavior


Drift runs operate on stored test cases and accept an evaluation_config payload for presets, judge policy, early-stop behavior, and two-stage evaluation.

Preset | Use case | Characteristics
full | Deep inspection | Most complete scoring; highest cost
ci | Branch protection | Balanced observability and guardrail enforcement
iteration | Fast prompt iteration | Aggressive optimizations and quicker feedback
generative | Long-form responses | Heavier emphasis on semantic and judge checks
POST /api/agent/run
{
  "suite_id": "<uuid>",
  "prompt_key": "support",
  "candidate_version_id": 3,
  "baseline_version_id": 2,
  "test_case_ids": ["<uuid>"],
  "evaluation_config": {
    "preset": "ci",
    "judge": { "policy": "disagreement", "disagreement_threshold": 0.15 },
    "two_stage_enabled": true,
    "early_stop": { "enabled": true, "min_cases": 5, "severity_threshold": 0.2 }
  }
}
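The early_stop knobs can be read as "never stop before min_cases results exist, then stop once severity crosses the threshold". The real policy is enforced server-side and may differ; this Python sketch is a mental model only:

```python
def should_early_stop(case_severities, *, min_cases=5, severity_threshold=0.2):
    """Illustrative reading of the early_stop config: require at least
    min_cases scored results, then stop once the running mean severity
    reaches severity_threshold. The backend's actual policy may differ."""
    if len(case_severities) < min_cases:
        return False
    return sum(case_severities) / len(case_severities) >= severity_threshold

print(should_early_stop([0.5, 0.4, 0.3, 0.1, 0.0]))  # mean 0.26 >= 0.2 -> True
```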

Quality control

CI result gate


After a run completes, use the CI gate when your team needs stricter deployment thresholds than the default run pass/fail value.

CLI gate
silo gate --run-id "<run-id>" --config thresholds.json
Threshold | Meaning
SILO_GATE_REQUIRE_PASSED | Require the run's passed field to be true
SILO_GATE_MIN_CONTEXT_SIMILARITY | Minimum acceptable context similarity
SILO_GATE_MIN_CANDIDATE_SCORE | Composite score floor for candidate output
SILO_GATE_MAX_SPEED_REGRESSION_PCT | Maximum allowed slowdown vs baseline
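Locally, the same four thresholds map onto a simple pre-check. A hypothetical Python sketch (the run field names are illustrative, not the API's actual response schema):

```python
def passes_gate(run, *, require_passed=True, min_context_similarity=0.8,
                min_candidate_score=0.7, max_speed_regression_pct=20.0):
    """Check a run summary dict against the four gate thresholds.
    Speed regression is the candidate's slowdown vs baseline, in percent."""
    regression_pct = (
        (run["candidate_latency_ms"] - run["baseline_latency_ms"])
        / run["baseline_latency_ms"] * 100.0
    )
    return (
        (run["passed"] or not require_passed)
        and run["context_similarity"] >= min_context_similarity
        and run["candidate_score"] >= min_candidate_score
        and regression_pct <= max_speed_regression_pct
    )

run = {"passed": True, "context_similarity": 0.91, "candidate_score": 0.84,
       "baseline_latency_ms": 800, "candidate_latency_ms": 920}
print(passes_gate(run))  # 15% slower than baseline, within the 20% cap -> True
```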

Reliability

Operations and health


The app now distinguishes between a full backend outage and a degraded database layer. That distinction powers the maintenance experience in the authenticated UI.

Endpoint | Layer | Meaning
/health | FastAPI root | Backend process is alive
/api/system/health | FastAPI deep health | Backend plus lightweight Supabase probe
/api/workspace-status | Next.js workspace availability | Dedicated signal for the full-page unreachable state
/api/health | Next.js aggregate health | Frontend health plus backend/database reachability
If /api/workspace-status returns 503, the app shows a full-page unreachable state.
If backend is reachable but database is degraded or unconfigured, the UI stays up and shows a themed warning banner.
The service-health provider polls both /api/workspace-status and /api/health every 30 seconds and refreshes on focus or reconnect.
A single uptime monitor against either frontend health endpoint can help keep both the frontend and backend warm on Render-style platforms.
Example health responses
200 OK
GET /api/workspace-status
{"workspace":"available","backend":"ok","detail":null}

503 Service Unavailable
GET /api/workspace-status
{"workspace":"unavailable","backend":"unreachable","detail":"The frontend could not reach the backend liveness endpoint."}

200 OK
GET /api/health
{"frontend":"ok","backend":"ok","database":"ok","detail":null}

207 Multi-Status
GET /api/health
{"frontend":"ok","backend":"ok","database":"degraded","detail":"Supabase probe failed: APIError"}

207 Multi-Status
GET /api/health
{"frontend":"ok","backend":"unreachable","database":"unreachable","detail":null}
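Putting the two signals together, the documented UI behavior reduces to a small decision function. A Python sketch using the status strings from the example payloads above (the state names are illustrative labels, not API values):

```python
def ui_state(workspace_status_code: int, health: dict) -> str:
    """Map the two health signals onto the documented UI states:
    503 workspace status -> full-page unreachable; degraded or
    unreachable database -> warning banner; otherwise normal."""
    if workspace_status_code == 503:
        return "full-page-unreachable"
    if health.get("database") in ("degraded", "unreachable"):
        return "warning-banner"
    return "normal"

print(ui_state(200, {"frontend": "ok", "backend": "ok",
                     "database": "degraded"}))  # warning-banner
```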

Endpoints

Core API reference


These are the main paths you will touch most often. The full endpoint breakdown still lives in the API reference page and docs/API.md.

System

GET /health | Backend liveness check
GET /api/system/health | Backend plus database probe
GET /api/workspace-status | Workspace availability endpoint
GET /api/health | Frontend aggregate health endpoint

Suites and prompts

GET /api/suites | List suites for the current user
POST /api/suites | Create a suite
GET /api/suites/{id}/prompts | List prompt versions
POST /api/suites/{id}/prompts | Create a prompt version

Drift runs

POST /api/agent/run | Run a synchronous evaluation
POST /api/agent/run/stream | Run a streaming SSE evaluation
GET /api/agent/runs/{run_id} | Fetch a stored run and diagnostics
POST /api/ci/gate | Check a run against deployment thresholds

Recovery

Troubleshooting


Most production failures fall into a handful of buckets: proxy target mistakes, missing Supabase configuration, or platform timeouts after cold starts.

Symptom | Likely cause | What to check first
Dashboard requests fail or parse as invalid JSON | Frontend proxy cannot reach the backend | Confirm BACKEND_INTERNAL_URL or NEXT_PUBLIC_API_URL points to the deployed FastAPI host.
Maintenance page appears | Workspace availability endpoint cannot reach the backend | Check /api/workspace-status, then /api/health, then the backend's /health directly.
Warning banner but UI still renders | Database degraded or not configured | Check backend SUPABASE_URL and SUPABASE_SERVICE_ROLE_KEY.
Failed to fetch after idle time on Render | Cold start plus host timeout | Use a keep-alive monitor against the frontend /api/health endpoint.

Need more detail?

See docs/API.md for production troubleshooting and environment guidance, or the support page if you want the shortest operator-focused summary.