Documentation
Everything you need to integrate Silo into your project, configure CI workflows, run evaluations, and work with the API.
Set up, operate, and troubleshoot Silo with fewer surprises.
This guide is organized for real production use: how the repo is structured, how sync and evaluation work, and how the app behaves when the backend or database is degraded.
Fast start
Use the quickstart sections if you need the shortest path from install to first run.
Ops aware
Maintenance mode and degraded database behavior are documented with the exact health endpoints.
CLI aligned
Every major workflow now calls out the matching CLI path, not just the HTTP API.
Recommended reading path
Start here
What Silo does
Silo compares an accepted baseline prompt against a candidate version on stored test cases, scores the result across quality, safety, format, task accuracy, and speed dimensions, and stores enough observability to make regressions explainable instead of mysterious.
Prompts live in `silo/prompts/` inside your repo. Sync them with `silo sync` or `scripts/silo_git_sync.py`. The lifecycle is: prompt change -> sync/create version -> run evaluation -> review diagnostics -> gate CI or accept baseline.
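Silo scores candidates across quality, safety, format, task accuracy, and speed. As a mental model, you can think of the overall result as a weighted combination of those dimensions. The sketch below is purely illustrative: the dimension names come from this page, but the equal weighting and the formula are assumptions, not Silo's actual scoring implementation.

```python
# Illustrative composite over Silo's five scoring dimensions.
# The weights and the averaging formula are hypothetical.
DIMENSIONS = ["quality", "safety", "format", "task_accuracy", "speed"]

def composite_score(scores: dict, weights: dict = None) -> float:
    """Weighted average over the five dimensions; each score is in [0, 1]."""
    weights = weights or {d: 1.0 for d in DIMENSIONS}
    total = sum(weights[d] for d in DIMENSIONS)
    return sum(scores[d] * weights[d] for d in DIMENSIONS) / total

print(composite_score({d: 1.0 for d in DIMENSIONS}))  # equal perfect scores -> 1.0
```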
Authentication
API keys and bearer auth
Silo supports Cognito JWTs for user sessions and silo_... user API keys for automation. API keys are intended for CLI usage, CI, and local scripts.
| Mode | Best for | Notes |
|---|---|---|
| Cognito JWT | Dashboard sessions | Required for creating, listing, and revoking API keys |
| User API key | CLI, CI, local tooling | Use as `Authorization: Bearer silo_...` |
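Whichever mode you use for automation, the key is sent as a bearer token. A minimal sketch of building that header (the key value here is a placeholder, and the prefix check is just a sanity guard):

```python
import os

def auth_headers(api_key: str) -> dict:
    """Build the Authorization header for a Silo user API key (silo_... prefix)."""
    if not api_key.startswith("silo_"):
        raise ValueError("expected a user API key starting with 'silo_'")
    return {"Authorization": f"Bearer {api_key}"}

# Read the key from the environment, falling back to a placeholder for the demo.
headers = auth_headers(os.environ.get("SILO_API_KEY", "silo_example"))
```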
```shell
export SILO_API_BASE_URL="https://silo-siix.onrender.com"
export SILO_API_KEY="silo_abc123..."
```
Key visibility is one-time: the full key value is shown only when the key is created, so copy it and store it securely before leaving the page.
Tooling
CLI package
The official CLI is published as silo-drift-cli, and the installed command name is silo. Use it to sync prompt suites, run drift checks, and connect your repo or CI to your already deployed Silo workspace.
```shell
npm install -g silo-drift-cli
silo login --url "$SILO_API_BASE_URL" --key "$SILO_API_KEY"
silo doctor
silo whoami
```
The CLI talks to the Silo API only, so you can install it and start using the hosted app right away with your workspace URL and API key.
- Website: silo-frontend.onrender.com
- npm package: npmjs.com/package/silo-drift-cli
- Production backend API: silo-siix.onrender.com
Contributors can run the CLI from this repo through `packages/cli` with `npm install`, `npm run build`, and then either `npm link` or `node dist/index.js`.
Repo layout
Repository layout
The quickest way to scaffold a compatible repository is silo setup. That creates the canonical silo/ directory and a starter GitHub Actions workflow.
```
your-repo/
|-- silo/
|   |-- silo.yml
|   \-- prompts/
|       \-- <suite-name>/
|           |-- prompt-a.prompt
|           \-- prompt-b.prompt
\-- .github/
    \-- workflows/
        \-- silo-prompt-pipeline.yml
```

Only `*.prompt` files under `silo/prompts/` are synced; suite and prompt configuration lives in `silo/silo.yml`.

Configuration
silo.yml configuration
silo/silo.yml is the source of truth for suite definitions, defaults, and per-prompt metadata.
```yaml
suites: {}
suite_definitions:
  support:
    name: "Support"
    description: "Customer support prompts"
defaults:
  default_suite_model: "gpt-4o-mini"
  context_threshold: 0.85
  max_speed_regression_pct: 20.0
prompt_meta:
  support/refund:
    format_rules:
      - required_fields: ["intent", "answer", "confidence"]
    test_cases:
      - input_text: "Can I return damaged shoes?"
        expected_output: '{"intent":"refund","answer":"Start a return...","confidence":0.9}'
```

| Field | Purpose |
|---|---|
| `suites` | Sync cache of known suite ids |
| `suite_definitions` | Names, descriptions, and suite-level model settings |
| `defaults` | Global thresholds and shared run defaults |
| `prompt_meta` | Per prompt-key metadata including rules and test cases |
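Before committing a `silo.yml` change, it can help to sanity-check the `defaults` block. The validator below is a hypothetical sketch: the field names come from the example above, but the specific checks and their bounds are assumptions, not rules Silo enforces.

```python
def check_defaults(defaults: dict) -> list:
    """Return a list of problems found in a silo.yml 'defaults' mapping (illustrative checks)."""
    problems = []
    thr = defaults.get("context_threshold")
    if thr is not None and not (0.0 <= thr <= 1.0):
        problems.append("context_threshold must be between 0 and 1")
    reg = defaults.get("max_speed_regression_pct")
    if reg is not None and reg < 0:
        problems.append("max_speed_regression_pct must be non-negative")
    if not defaults.get("default_suite_model"):
        problems.append("default_suite_model is empty or missing")
    return problems

ok = {"default_suite_model": "gpt-4o-mini", "context_threshold": 0.85, "max_speed_regression_pct": 20.0}
print(check_defaults(ok))  # -> []
```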
Automation
GitHub Actions workflow
Silo works best when prompt sync and evaluation happen automatically on push and PR events. The dashboard and CLI are excellent for iteration, but CI is where regressions get caught early.
```yaml
name: Prompt pipeline
on:
  push:
    branches: [main]
    paths:
      - "silo/prompts/**"
      - "silo/silo.yml"
jobs:
  sync-and-evaluate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 2  # sync diffs HEAD~1..HEAD, so the parent commit must be fetched
      - uses: actions/setup-node@v4
        with:
          node-version: "20"
      - run: npm install -g silo-drift-cli
      - run: silo sync --repo . --base HEAD~1 --head HEAD
        env:
          SILO_API_BASE_URL: ${{ secrets.SILO_API_BASE_URL }}
          SILO_API_KEY: ${{ secrets.SILO_API_KEY }}
```

Secrets model
CI needs two repository secrets: `SILO_API_BASE_URL` and `SILO_API_KEY`. Fork PRs do not receive those secrets, so design the workflow to skip safely for forked contributions.

Local dev
Local sync and dry runs
You can run the same sync flow locally before opening a PR. That keeps prompt iteration fast and reduces noisy CI runs.
```shell
export SILO_API_BASE_URL="https://silo-siix.onrender.com"
export SILO_API_KEY="silo_your_key_here"
silo sync --repo . --base HEAD~1 --head HEAD
silo sync --repo . --base HEAD~1 --head HEAD --dry-run --json
```
If you prefer the existing Python sync path, scripts/silo_git_sync.py remains available and still maps to the same API surface.
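If you wrap the sync in scripting (for example, a pre-push hook), it can be convenient to assemble the command once and reuse it. This sketch only builds the argv list for the flags documented above; the helper name and the hook idea are illustrative, not part of the CLI.

```python
def sync_command(repo: str = ".", base: str = "HEAD~1", head: str = "HEAD",
                 dry_run: bool = False) -> list:
    """Assemble a `silo sync` invocation using the documented flags."""
    cmd = ["silo", "sync", "--repo", repo, "--base", base, "--head", head]
    if dry_run:
        cmd += ["--dry-run", "--json"]
    return cmd

# To actually run it (requires the CLI and env vars to be set up):
# import subprocess
# subprocess.run(sync_command(dry_run=True), check=True)
```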
Run behavior
Evaluation behavior
Drift runs operate on stored test cases and accept an evaluation_config payload for presets, judge policy, early-stop behavior, and two-stage evaluation.
| Preset | Use case | Characteristics |
|---|---|---|
| `full` | Deep inspection | Most complete scoring; highest cost |
| `ci` | Branch protection | Balanced observability and guardrail enforcement |
| `iteration` | Fast prompt iteration | Aggressive optimizations and quicker feedback |
| `generative` | Long-form responses | Heavier emphasis on semantic and judge checks |
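The `early_stop` settings in `evaluation_config` can be read as a simple rule: never stop before `min_cases` cases have run, then stop once observed severity crosses `severity_threshold`. The function below is one illustrative reading of that rule, not Silo's internal implementation.

```python
def should_early_stop(cases_run: int, worst_severity: float,
                      enabled: bool = True, min_cases: int = 5,
                      severity_threshold: float = 0.2) -> bool:
    """Decide whether a run should stop early (hypothetical interpretation).

    Defaults mirror the example payload: min_cases=5, severity_threshold=0.2.
    """
    if not enabled or cases_run < min_cases:
        return False  # not enough evidence yet, or early stop disabled
    return worst_severity >= severity_threshold
```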
```json
{
  "suite_id": "<uuid>",
  "prompt_key": "support",
  "candidate_version_id": 3,
  "baseline_version_id": 2,
  "test_case_ids": ["<uuid>"],
  "evaluation_config": {
    "preset": "ci",
    "judge": { "policy": "disagreement", "disagreement_threshold": 0.15 },
    "two_stage_enabled": true,
    "early_stop": { "enabled": true, "min_cases": 5, "severity_threshold": 0.2 }
  }
}
```

Quality control
CI result gate
After a run completes, use the CI gate when your team needs stricter deployment thresholds than the default run pass/fail value.
```shell
silo gate --run-id "<run-id>" --config thresholds.json
```
| Threshold | Meaning |
|---|---|
| `SILO_GATE_REQUIRE_PASSED` | Require the run's `passed` field to be true |
| `SILO_GATE_MIN_CONTEXT_SIMILARITY` | Minimum acceptable context similarity |
| `SILO_GATE_MIN_CANDIDATE_SCORE` | Composite score floor for candidate output |
| `SILO_GATE_MAX_SPEED_REGRESSION_PCT` | Maximum allowed slowdown vs baseline |
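Conceptually, the gate applies each configured threshold to the run result and fails if any check fails. The sketch below assumes a simplified run-result shape; the field names `context_similarity`, `candidate_score`, and `speed_regression_pct` are illustrative placeholders, not the API's documented response schema.

```python
def gate_passes(run: dict, thresholds: dict) -> bool:
    """Apply the four documented gate thresholds to a (hypothetical) run-result dict."""
    if thresholds.get("SILO_GATE_REQUIRE_PASSED") and not run.get("passed"):
        return False
    min_sim = thresholds.get("SILO_GATE_MIN_CONTEXT_SIMILARITY")
    if min_sim is not None and run.get("context_similarity", 0.0) < min_sim:
        return False
    min_score = thresholds.get("SILO_GATE_MIN_CANDIDATE_SCORE")
    if min_score is not None and run.get("candidate_score", 0.0) < min_score:
        return False
    max_reg = thresholds.get("SILO_GATE_MAX_SPEED_REGRESSION_PCT")
    if max_reg is not None and run.get("speed_regression_pct", 0.0) > max_reg:
        return False
    return True  # every configured check passed
```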
Reliability
Operations and health
The app now distinguishes between a full backend outage and a degraded database layer. That distinction powers the maintenance experience in the authenticated UI.
| Endpoint | Layer | Meaning |
|---|---|---|
| `/health` | FastAPI root | Backend process is alive |
| `/api/system/health` | FastAPI deep health | Backend plus lightweight Supabase probe |
| `/api/workspace-status` | Next.js workspace availability | Dedicated signal for the full-page unreachable state |
| `/api/health` | Next.js aggregate health | Frontend health plus backend/database reachability |
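Taken together, these endpoints let a client choose between three UI states: a full-page unreachable view, a degraded-database warning banner, or normal operation. The mapping below is a sketch of that decision using the status codes and fields documented in this section; the state names are illustrative.

```python
def ui_state(workspace_status_code: int, health_body: dict) -> str:
    """Map health signals to a client UI state (illustrative names)."""
    if workspace_status_code == 503:
        return "unreachable"      # full-page maintenance experience
    if health_body.get("database") in ("degraded", "unconfigured"):
        return "degraded-banner"  # UI stays up with a themed warning banner
    return "ok"
```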
- When `/api/workspace-status` returns 503, the app shows a full-page unreachable state.
- When the database layer is degraded or unconfigured, the UI stays up and shows a themed warning banner.
- The frontend polls `/api/workspace-status` and `/api/health` every 30 seconds and refreshes on focus or reconnect.

`GET /api/workspace-status` -> 200 OK

```json
{"workspace":"available","backend":"ok","detail":null}
```

`GET /api/workspace-status` -> 503 Service Unavailable

```json
{"workspace":"unavailable","backend":"unreachable","detail":"The frontend could not reach the backend liveness endpoint."}
```
`GET /api/health` -> 200 OK

```json
{"frontend":"ok","backend":"ok","database":"ok","detail":null}
```

`GET /api/health` -> 207 Multi-Status

```json
{"frontend":"ok","backend":"ok","database":"degraded","detail":"Supabase probe failed: APIError"}
```

`GET /api/health` -> 207 Multi-Status

```json
{"frontend":"ok","backend":"unreachable","database":"unreachable","detail":null}
```

Endpoints
Core API reference
These are the paths you will touch most often. The full endpoint breakdown still lives in the API reference page and docs/API.md.
System
| Method | Path | Description |
|---|---|---|
| GET | `/health` | Backend liveness check |
| GET | `/api/system/health` | Backend plus database probe |
| GET | `/api/workspace-status` | Workspace availability endpoint |
| GET | `/api/health` | Frontend aggregate health endpoint |
Suites and prompts
| Method | Path | Description |
|---|---|---|
| GET | `/api/suites` | List suites for the current user |
| POST | `/api/suites` | Create a suite |
| GET | `/api/suites/{id}/prompts` | List prompt versions |
| POST | `/api/suites/{id}/prompts` | Create a prompt version |
Drift runs
| Method | Path | Description |
|---|---|---|
| POST | `/api/agent/run` | Run a synchronous evaluation |
| POST | `/api/agent/run/stream` | Run a streaming SSE evaluation |
| GET | `/api/agent/runs/{run_id}` | Fetch a stored run and diagnostics |
| POST | `/api/ci/gate` | Check a run against deployment thresholds |
Recovery
Troubleshooting
Most production failures fall into a handful of buckets: proxy target mistakes, missing Supabase configuration, or platform timeouts after cold starts.
| Symptom | Likely cause | What to check first |
|---|---|---|
| Dashboard requests fail or parse as invalid JSON | Frontend proxy cannot reach the backend | Confirm `BACKEND_INTERNAL_URL` or `NEXT_PUBLIC_API_URL` points to the deployed FastAPI host. |
| Maintenance page appears | Workspace availability endpoint cannot reach the backend | Check `/api/workspace-status`, then `/api/health`, then the backend's `/health` directly. |
| Warning banner but UI still renders | Database degraded or not configured | Check backend `SUPABASE_URL` and `SUPABASE_SERVICE_ROLE_KEY`. |
| "Failed to fetch" after idle time on Render | Cold start plus host timeout | Use a keep-alive monitor against the frontend `/api/health` endpoint. |
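For the cold-start case, a keep-alive monitor just needs to hit `/api/health` on a schedule and retry with backoff when the first request after idle fails. The retry schedule below is an illustrative sketch; the base delay and cap are assumptions, not values Silo or Render prescribe.

```python
def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> list:
    """Exponential backoff delays (seconds) for retrying a cold-started host."""
    return [min(cap, base * (2 ** i)) for i in range(attempts)]

print(backoff_delays(5))  # [1.0, 2.0, 4.0, 8.0, 16.0]

# A monitor would sleep for each delay between failed probes, e.g.:
# for delay in backoff_delays(5):
#     if probe("https://<your-frontend>/api/health"):  # probe() is hypothetical
#         break
#     time.sleep(delay)
```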
Need more detail?
See docs/API.md for production troubleshooting and environment guidance, or the support page if you want the shortest operator-focused summary.