Documentation
Everything you need to integrate Silo into your project, configure CI workflows, run evaluations, and work with the API.
Set up, operate, and troubleshoot Silo with fewer surprises.
This guide is organized for real production use: how the repo is structured, how sync and evaluation work, and how the app behaves when the backend or database is degraded.
Fast start
Use the quickstart sections if you need the shortest path from install to first run.
Ops aware
Maintenance mode and degraded database behavior are documented with the exact health endpoints.
CLI aligned
Every major workflow now calls out the matching CLI path, not just the HTTP API.
Recommended reading path
Start here
What Silo does
Silo compares an accepted baseline prompt against a candidate version on stored test cases, scores the result across quality, safety, format, task accuracy, and speed dimensions, and stores enough observability to make regressions explainable instead of mysterious.
Prompts live in `silo/prompts/` inside your repo. Sync them with `silo sync` or `scripts/silo_git_sync.py`. The lifecycle is: prompt change -> sync/create version -> run evaluation -> review diagnostics -> gate CI or accept baseline.
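Silo scores candidates across quality, safety, format, task accuracy, and speed. As a mental model, you can think of the overall result as a weighted combination of those dimensions. The sketch below is purely illustrative: the dimension names come from this page, but the equal weighting and the formula are assumptions, not Silo's actual scoring implementation.

```python
# Illustrative composite over Silo's five scoring dimensions.
# The weights and the averaging formula are hypothetical.
DIMENSIONS = ["quality", "safety", "format", "task_accuracy", "speed"]

def composite_score(scores: dict, weights: dict = None) -> float:
    """Weighted average over the five dimensions; each score is in [0, 1]."""
    weights = weights or {d: 1.0 for d in DIMENSIONS}
    total = sum(weights[d] for d in DIMENSIONS)
    return sum(scores[d] * weights[d] for d in DIMENSIONS) / total

print(composite_score({d: 1.0 for d in DIMENSIONS}))  # equal perfect scores -> 1.0
```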
Authentication
API keys and bearer auth
Silo supports Cognito JWTs for user sessions and silo_... user API keys for automation. API keys are intended for CLI usage, CI, and local scripts.
| Mode | Best for | Notes |
|---|---|---|
| Cognito JWT | Dashboard sessions | Required for creating, listing, and revoking API keys |
| User API key | CLI, CI, local tooling | Use as `Authorization: Bearer silo_...` |
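Whichever mode you use for automation, the key is sent as a bearer token. A minimal sketch of building that header (the key value here is a placeholder, and the prefix check is just a sanity guard):

```python
import os

def auth_headers(api_key: str) -> dict:
    """Build the Authorization header for a Silo user API key (silo_... prefix)."""
    if not api_key.startswith("silo_"):
        raise ValueError("expected a user API key starting with 'silo_'")
    return {"Authorization": f"Bearer {api_key}"}

# Read the key from the environment, falling back to a placeholder for the demo.
headers = auth_headers(os.environ.get("SILO_API_KEY", "silo_example"))
```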
```shell
export SILO_API_BASE_URL="https://silo-siix.onrender.com"
export SILO_API_KEY="silo_abc123..."
```
Key visibility is one-time: the full key value is shown only when the key is created, so copy it and store it securely before leaving the page.
Tooling
CLI package
The official CLI is published as silo-drift-cli, and the installed command name is silo. Use it to sync prompt suites, run drift checks, and connect your repo or CI to your already deployed Silo workspace.
```shell
npm install -g silo-drift-cli
silo login --url "$SILO_API_BASE_URL" --key "$SILO_API_KEY"
silo doctor
silo whoami
```
The CLI talks to the Silo API only, so you can install it and start using the hosted app right away with your workspace URL and API key.
- Website: silo-frontend.onrender.com
- npm package: npmjs.com/package/silo-drift-cli
- Production backend API: silo-siix.onrender.com
Contributors can run the CLI from this repo through `packages/cli` with `npm install`, `npm run build`, and then either `npm link` or `node dist/index.js`.
Repo layout
Repository layout
The quickest way to scaffold a compatible repository is silo setup. That creates the canonical silo/ directory and a starter GitHub Actions workflow.
```
your-repo/
|-- silo/
|   |-- silo.yml
|   \-- prompts/
|       \-- <suite-name>/
|           |-- prompt-a.prompt
|           \-- prompt-b.prompt
\-- .github/
    \-- workflows/
        \-- silo-prompt-pipeline.yml
```

Only `*.prompt` files under `silo/prompts/` are synced; suite and prompt configuration lives in `silo/silo.yml`.

Configuration
silo.yml configuration
silo/silo.yml is the source of truth for suite definitions, defaults, and per-prompt metadata.
```yaml
suites: {}
suite_definitions:
  support:
    name: "Support"
    description: "Customer support prompts"
defaults:
  default_suite_model: "gpt-4o-mini"
  context_threshold: 0.85
  max_speed_regression_pct: 20.0
prompt_meta:
  support/refund:
    format_rules:
      - required_fields: ["intent", "answer", "confidence"]
    test_cases:
      - input_text: "Can I return damaged shoes?"
        expected_output: '{"intent":"refund","answer":"Start a return...","confidence":0.9}'
```

| Field | Purpose |
|---|---|
| `suites` | Sync cache of known suite ids |
| `suite_definitions` | Names, descriptions, and suite-level model settings |
| `defaults` | Global thresholds and shared run defaults |
| `prompt_meta` | Per prompt-key metadata including rules and test cases |
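Before committing a `silo.yml` change, it can help to sanity-check the `defaults` block. The validator below is a hypothetical sketch: the field names come from the example above, but the specific checks and their bounds are assumptions, not rules Silo enforces.

```python
def check_defaults(defaults: dict) -> list:
    """Return a list of problems found in a silo.yml 'defaults' mapping (illustrative checks)."""
    problems = []
    thr = defaults.get("context_threshold")
    if thr is not None and not (0.0 <= thr <= 1.0):
        problems.append("context_threshold must be between 0 and 1")
    reg = defaults.get("max_speed_regression_pct")
    if reg is not None and reg < 0:
        problems.append("max_speed_regression_pct must be non-negative")
    if not defaults.get("default_suite_model"):
        problems.append("default_suite_model is empty or missing")
    return problems

ok = {"default_suite_model": "gpt-4o-mini", "context_threshold": 0.85, "max_speed_regression_pct": 20.0}
print(check_defaults(ok))  # -> []
```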
Automation
GitHub Actions workflow
Silo works best when prompt sync and evaluation happen automatically on push and PR events. The dashboard and CLI are excellent for iteration, but CI is where regressions get caught early.
```yaml
name: Prompt pipeline
on:
  push:
    branches: [main]
    paths:
      - "silo/prompts/**"
      - "silo/silo.yml"
jobs:
  sync-and-evaluate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 2  # sync diffs HEAD~1..HEAD, so the parent commit must be fetched
      - uses: actions/setup-node@v4
        with:
          node-version: "20"
      - run: npm install -g silo-drift-cli
      - run: silo sync --repo . --base HEAD~1 --head HEAD
        env:
          SILO_API_BASE_URL: ${{ secrets.SILO_API_BASE_URL }}
          SILO_API_KEY: ${{ secrets.SILO_API_KEY }}
```

Secrets model
CI needs two repository secrets: `SILO_API_BASE_URL` and `SILO_API_KEY`. Fork PRs do not receive those secrets, so design the workflow to skip safely for forked contributions.

Local dev
Local sync and dry runs
You can run the same sync flow locally before opening a PR. That keeps prompt iteration fast and reduces noisy CI runs.
```shell
export SILO_API_BASE_URL="https://silo-siix.onrender.com"
export SILO_API_KEY="silo_your_key_here"
silo sync --repo . --base HEAD~1 --head HEAD
silo sync --repo . --base HEAD~1 --head HEAD --dry-run --json
```
If you prefer the existing Python sync path, scripts/silo_git_sync.py remains available and still maps to the same API surface.
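If you wrap the sync in scripting (for example, a pre-push hook), it can be convenient to assemble the command once and reuse it. This sketch only builds the argv list for the flags documented above; the helper name and the hook idea are illustrative, not part of the CLI.

```python
def sync_command(repo: str = ".", base: str = "HEAD~1", head: str = "HEAD",
                 dry_run: bool = False) -> list:
    """Assemble a `silo sync` invocation using the documented flags."""
    cmd = ["silo", "sync", "--repo", repo, "--base", base, "--head", head]
    if dry_run:
        cmd += ["--dry-run", "--json"]
    return cmd

# To actually run it (requires the CLI and env vars to be set up):
# import subprocess
# subprocess.run(sync_command(dry_run=True), check=True)
```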
Run behavior
Evaluation behavior
Drift runs operate on stored test cases and accept an evaluation_config payload for presets, judge policy, early-stop behavior, and two-stage evaluation.
| Preset | Use case | Characteristics |
|---|---|---|
| `full` | Deep inspection | Most complete scoring; highest cost |
| `ci` | Branch protection | Balanced observability and guardrail enforcement |
| `iteration` | Fast prompt iteration | Aggressive optimizations and quicker feedback |
| `generative` | Long-form responses | Heavier emphasis on semantic and judge checks |
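The `early_stop` settings in `evaluation_config` can be read as a simple rule: never stop before `min_cases` cases have run, then stop once observed severity crosses `severity_threshold`. The function below is one illustrative reading of that rule, not Silo's internal implementation.

```python
def should_early_stop(cases_run: int, worst_severity: float,
                      enabled: bool = True, min_cases: int = 5,
                      severity_threshold: float = 0.2) -> bool:
    """Decide whether a run should stop early (hypothetical interpretation).

    Defaults mirror the example payload: min_cases=5, severity_threshold=0.2.
    """
    if not enabled or cases_run < min_cases:
        return False  # not enough evidence yet, or early stop disabled
    return worst_severity >= severity_threshold
```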
```json
{
  "suite_id": "<uuid>",
  "prompt_key": "support",
  "candidate_version_id": 3,
  "baseline_version_id": 2,
  "test_case_ids": ["<uuid>"],
  "evaluation_config": {
    "preset": "ci",
    "judge": { "policy": "disagreement", "disagreement_threshold": 0.15 },
    "two_stage_enabled": true,
    "early_stop": { "enabled": true, "min_cases": 5, "severity_threshold": 0.2 }
  }
}
```

Quality control
CI result gate
After a run completes, use the CI gate when your team needs stricter deployment thresholds than the default run pass/fail value.
```shell
silo gate --run-id "<run-id>" --config thresholds.json
```
| Threshold | Meaning |
|---|---|
| `SILO_GATE_REQUIRE_PASSED` | Require the run's `passed` field to be true |
| `SILO_GATE_MIN_CONTEXT_SIMILARITY` | Minimum acceptable context similarity |
| `SILO_GATE_MIN_CANDIDATE_SCORE` | Composite score floor for candidate output |
| `SILO_GATE_MAX_SPEED_REGRESSION_PCT` | Maximum allowed slowdown vs baseline |
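Conceptually, the gate applies each configured threshold to the run result and fails if any check fails. The sketch below assumes a simplified run-result shape; the field names `context_similarity`, `candidate_score`, and `speed_regression_pct` are illustrative placeholders, not the API's documented response schema.

```python
def gate_passes(run: dict, thresholds: dict) -> bool:
    """Apply the four documented gate thresholds to a (hypothetical) run-result dict."""
    if thresholds.get("SILO_GATE_REQUIRE_PASSED") and not run.get("passed"):
        return False
    min_sim = thresholds.get("SILO_GATE_MIN_CONTEXT_SIMILARITY")
    if min_sim is not None and run.get("context_similarity", 0.0) < min_sim:
        return False
    min_score = thresholds.get("SILO_GATE_MIN_CANDIDATE_SCORE")
    if min_score is not None and run.get("candidate_score", 0.0) < min_score:
        return False
    max_reg = thresholds.get("SILO_GATE_MAX_SPEED_REGRESSION_PCT")
    if max_reg is not None and run.get("speed_regression_pct", 0.0) > max_reg:
        return False
    return True  # every configured check passed
```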
Reliability
Operations and health
The app now distinguishes between a full backend outage and a degraded database layer. That distinction powers the maintenance experience in the authenticated UI.
| Endpoint | Layer | Meaning |
|---|---|---|
| `/health` | FastAPI root | Backend process is alive |
| `/api/system/health` | FastAPI deep health | Backend plus lightweight Supabase probe |
| `/api/workspace-status` | Next.js workspace availability | Dedicated signal for the full-page unreachable state |
| `/api/health` | Next.js aggregate health | Frontend health plus backend/database reachability |
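Taken together, these endpoints let a client choose between three UI states: a full-page unreachable view, a degraded-database warning banner, or normal operation. The mapping below is a sketch of that decision using the status codes and fields documented in this section; the state names are illustrative.

```python
def ui_state(workspace_status_code: int, health_body: dict) -> str:
    """Map health signals to a client UI state (illustrative names)."""
    if workspace_status_code == 503:
        return "unreachable"      # full-page maintenance experience
    if health_body.get("database") in ("degraded", "unconfigured"):
        return "degraded-banner"  # UI stays up with a themed warning banner
    return "ok"
```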
- When `/api/workspace-status` returns 503, the app shows a full-page unreachable state.
- When the database layer is degraded or unconfigured, the UI stays up and shows a themed warning banner.
- The frontend polls `/api/workspace-status` and `/api/health` every 30 seconds and refreshes on focus or reconnect.

`GET /api/workspace-status` -> 200 OK

```json
{"workspace":"available","backend":"ok","detail":null}
```

`GET /api/workspace-status` -> 503 Service Unavailable

```json
{"workspace":"unavailable","backend":"unreachable","detail":"The frontend could not reach the backend liveness endpoint."}
```
`GET /api/health` -> 200 OK

```json
{"frontend":"ok","backend":"ok","database":"ok","detail":null}
```

`GET /api/health` -> 207 Multi-Status

```json
{"frontend":"ok","backend":"ok","database":"degraded","detail":"Supabase probe failed: APIError"}
```

`GET /api/health` -> 207 Multi-Status

```json
{"frontend":"ok","backend":"unreachable","database":"unreachable","detail":null}
```

Endpoints
Core API reference
These are the paths you will touch most often. The full endpoint breakdown still lives in the API reference page and docs/API.md.
System
| Method | Path | Description |
|---|---|---|
| GET | `/health` | Backend liveness check |
| GET | `/api/system/health` | Backend plus database probe |
| GET | `/api/workspace-status` | Workspace availability endpoint |
| GET | `/api/health` | Frontend aggregate health endpoint |
Suites and prompts
| Method | Path | Description |
|---|---|---|
| GET | `/api/suites` | List suites for the current user |
| POST | `/api/suites` | Create a suite |
| GET | `/api/suites/{id}/prompts` | List prompt versions |
| POST | `/api/suites/{id}/prompts` | Create a prompt version |
Drift runs
| Method | Path | Description |
|---|---|---|
| POST | `/api/agent/run` | Run a synchronous evaluation |
| POST | `/api/agent/run/stream` | Run a streaming SSE evaluation |
| GET | `/api/agent/runs/{run_id}` | Fetch a stored run and diagnostics |
| POST | `/api/ci/gate` | Check a run against deployment thresholds |
Recovery
Troubleshooting
Most production failures fall into a handful of buckets: proxy target mistakes, missing Supabase configuration, or platform timeouts after cold starts.
| Symptom | Likely cause | What to check first |
|---|---|---|
| Dashboard requests fail or parse as invalid JSON | Frontend proxy cannot reach the backend | Confirm `BACKEND_INTERNAL_URL` or `NEXT_PUBLIC_API_URL` points to the deployed FastAPI host. |
| Maintenance page appears | Workspace availability endpoint cannot reach the backend | Check `/api/workspace-status`, then `/api/health`, then the backend's `/health` directly. |
| Warning banner but UI still renders | Database degraded or not configured | Check backend `SUPABASE_URL` and `SUPABASE_SERVICE_ROLE_KEY`. |
| "Failed to fetch" after idle time on Render | Cold start plus host timeout | Use a keep-alive monitor against the frontend `/api/health` endpoint. |
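For the cold-start case, a keep-alive monitor just needs to hit `/api/health` on a schedule and retry with backoff when the first request after idle fails. The retry schedule below is an illustrative sketch; the base delay and cap are assumptions, not values Silo or Render prescribe.

```python
def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> list:
    """Exponential backoff delays (seconds) for retrying a cold-started host."""
    return [min(cap, base * (2 ** i)) for i in range(attempts)]

print(backoff_delays(5))  # [1.0, 2.0, 4.0, 8.0, 16.0]

# A monitor would sleep for each delay between failed probes, e.g.:
# for delay in backoff_delays(5):
#     if probe("https://<your-frontend>/api/health"):  # probe() is hypothetical
#         break
#     time.sleep(delay)
```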
Need more detail?
See docs/API.md for production troubleshooting and environment guidance, or the support page if you want the shortest operator-focused summary.