Sortie embeds an HTTP server that exposes a JSON API, an HTML dashboard, health probes, and Prometheus metrics — all on a single port.
Enabling the server
The HTTP server is opt-in. Two ways to turn it on:
CLI flag — pass --port <N> when launching Sortie:
sortie --port 8080 WORKFLOW.md
Workflow config — set server.port in the WORKFLOW.md front matter extensions:
---
server:
  port: 8080
# ... rest of config
---
When both are present, --port wins. The server binds to 127.0.0.1 only; it does not listen on all interfaces. Port 0 requests an OS-assigned ephemeral port, which Sortie logs at startup. This is useful for tests and local development, where a stable port number does not matter.
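The precedence rule can be sketched as a small function. This is only an illustration of the documented behavior, not Sortie's actual implementation; the name effective_port and its parameters are hypothetical.

```python
from typing import Optional


def effective_port(cli_port: Optional[int], config_port: Optional[int]) -> Optional[int]:
    """Return the port the server would bind to, or None if it stays disabled.

    Hypothetical helper that mirrors the documented precedence:
    --port (cli_port) wins over server.port (config_port).
    """
    # Compare against None instead of relying on truthiness: port 0 is a
    # valid request for an OS-assigned ephemeral port, not "unset".
    if cli_port is not None:
        return cli_port
    return config_port
```

The explicit None check matters because a truthiness test would treat --port 0 as if the flag were missing and fall back to the config value.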
All surfaces share the same port: the dashboard at /, health probes at /livez and /readyz, the JSON API at /api/v1/*, and Prometheus metrics at /metrics. Changing the port requires a restart — there is no hot-rebind.
For the full server extension schema, see WORKFLOW.md configuration reference. For Prometheus metric definitions, see Prometheus metrics reference.
GET /: HTML dashboard
Server-rendered HTML page showing real-time system state. Auto-refreshes in the browser.
curl http://localhost:8080/
The dashboard displays running sessions (identifier, state, turn count, duration, last event, tokens), the retry queue (identifier, attempt, due-in, error), summary cards (running count, retrying count, available slots, total tokens), uptime, version, and aggregate runtime and token totals.
Returns text/html. This is not a JSON endpoint.
GET /api/v1/state: System state
Returns a full runtime snapshot: running sessions, retry queue, aggregate totals, and rate limits.
curl http://localhost:8080/api/v1/state
Response
{
  "generated_at": "2026-03-26T14:30:00Z",
  "counts": {
    "running": 2,
    "retrying": 1
  },
  "running": [
    {
      "issue_id": "abc123",
      "issue_identifier": "MT-649",
      "state": "In Progress",
      "session_id": "session-abc-001",
      "turn_count": 7,
      "last_event": "turn_completed",
      "last_message": "",
      "started_at": "2026-03-26T14:10:12Z",
      "last_event_at": "2026-03-26T14:29:59Z",
      "workspace_path": "/tmp/sortie_workspaces/MT-649",
      "tokens": {
        "input_tokens": 12500,
        "output_tokens": 3200,
        "total_tokens": 15700,
        "cache_read_tokens": 8400
      },
      "model_name": "claude-sonnet-4-20250514",
      "api_request_count": 12,
      "requests_by_model": {
        "claude-sonnet-4-20250514": 12
      },
      "tool_time_percent": 34.7,
      "api_time_percent": 51.2
    }
  ],
  "retrying": [
    {
      "issue_id": "def456",
      "issue_identifier": "MT-650",
      "attempt": 3,
      "due_at": "2026-03-26T14:35:00Z",
      "error": "agent exited with code 1"
    }
  ],
  "agent_totals": {
    "input_tokens": 45000,
    "output_tokens": 18200,
    "total_tokens": 63200,
    "cache_read_tokens": 31500,
    "seconds_running": 2847.3
  },
  "rate_limits": {}
}
Field notes
running[] entries:
| Field | Description |
|---|---|
| tokens | Nested object with input_tokens, output_tokens, total_tokens, and cache_read_tokens for this session. |
| workspace_path | Absolute filesystem path to the issue's workspace directory. |
| model_name | LLM model in use. Omitted when unknown. |
| api_request_count | Total API requests made by the agent in this session. |
| requests_by_model | Breakdown of API requests per model. Omitted when empty. |
| tool_time_percent | Percentage of elapsed wall-clock time spent in tool execution. null when not yet computed. |
| api_time_percent | Percentage of elapsed wall-clock time spent waiting on API calls. null when not yet computed. |
agent_totals: Cumulative across all sessions since Sortie started. seconds_running includes elapsed time from currently active sessions, not only completed ones.
rate_limits: Reserved for future use. Currently an empty object.
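A client that reads this snapshot has to tolerate fields that are omitted (model_name) or null (tool_time_percent). The following is a minimal sketch of one way to do that. The function name summarize_running and the second sample entry (MT-651) are made up for illustration; the sample is a trimmed payload, not a full response.

```python
import json


def summarize_running(snapshot: dict) -> list:
    """Build one human-readable line per running session."""
    lines = []
    for entry in snapshot.get("running", []):
        # model_name is omitted when unknown, so fall back to a placeholder.
        model = entry.get("model_name", "unknown model")
        # tool_time_percent is null (None in Python) until it is computed.
        tool_pct = entry.get("tool_time_percent")
        tool = "n/a" if tool_pct is None else f"{tool_pct:.1f}%"
        total = entry["tokens"]["total_tokens"]
        lines.append(
            f"{entry['issue_identifier']}: {total} tokens, {model}, tool time {tool}"
        )
    return lines


# Trimmed sample; MT-651 is a hypothetical entry with no model_name
# and a null tool_time_percent.
sample = json.loads("""
{
  "running": [
    {"issue_identifier": "MT-649",
     "tokens": {"total_tokens": 15700},
     "model_name": "claude-sonnet-4-20250514",
     "tool_time_percent": 34.7},
    {"issue_identifier": "MT-651",
     "tokens": {"total_tokens": 120},
     "tool_time_percent": null}
  ]
}
""")

for line in summarize_running(sample):
    print(line)
```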
Status codes
| Code | Meaning |
|---|---|
| 200 OK | Snapshot returned. |
| 503 Service Unavailable | Orchestrator state snapshot could not be produced. |
GET /api/v1/{identifier}: Issue detail
Returns issue-specific runtime and debug details. The {identifier} path parameter is the issue identifier (e.g., MT-649), not the internal issue ID.
curl http://localhost:8080/api/v1/MT-649
Response (running issue)
{
  "issue_identifier": "MT-649",
  "issue_id": "abc123",
  "status": "running",
  "workspace": {
    "path": "/tmp/sortie_workspaces/MT-649"
  },
  "attempts": {
    "restart_count": 0,
    "current_retry_attempt": 0
  },
  "running": {
    "issue_id": "abc123",
    "issue_identifier": "MT-649",
    "state": "In Progress",
    "session_id": "session-abc-001",
    "turn_count": 7,
    "last_event": "turn_completed",
    "last_message": "Working on tests",
    "started_at": "2026-03-26T14:10:12Z",
    "last_event_at": "2026-03-26T14:29:59Z",
    "workspace_path": "/tmp/sortie_workspaces/MT-649",
    "tokens": {
      "input_tokens": 12500,
      "output_tokens": 3200,
      "total_tokens": 15700,
      "cache_read_tokens": 8400
    },
    "model_name": "claude-sonnet-4-20250514",
    "api_request_count": 12,
    "requests_by_model": {
      "claude-sonnet-4-20250514": 12
    },
    "tool_time_percent": 34.7,
    "api_time_percent": 51.2
  },
  "retry": null,
  "recent_events": [],
  "last_error": null,
  "tracked": {}
}
Response (retrying issue)
When an issue is in the retry queue rather than actively running, status is "retrying", running is null, and retry is populated:
{
  "issue_identifier": "MT-650",
  "issue_id": "def456",
  "status": "retrying",
  "workspace": null,
  "attempts": {
    "restart_count": 2,
    "current_retry_attempt": 3
  },
  "running": null,
  "retry": {
    "issue_id": "def456",
    "issue_identifier": "MT-650",
    "attempt": 3,
    "due_at": "2026-03-26T14:35:00Z",
    "error": "agent exited with code 1"
  },
  "recent_events": [],
  "last_error": "agent exited with code 1",
  "tracked": {}
}
Field notes
| Field | Description |
|---|---|
| status | One of "running" or "retrying". Derived from which queue the issue appears in. |
| workspace | Contains path when the issue has an active workspace. null for retrying issues or when the workspace path is unknown. |
| attempts.restart_count | How many times this issue has been restarted (attempt minus one, floored at zero). |
| attempts.current_retry_attempt | The current attempt number. 0 for running issues that haven't retried. |
| running | Full running entry (same shape as entries in /api/v1/state), or null. |
| retry | Full retry entry, or null. |
| recent_events | Reserved for future use. Currently an empty array. |
| last_error | Most recent error message from the retry queue, or null. |
| tracked | Reserved for future use. Currently an empty object. |
Status codes
| Code | Meaning |
|---|---|
| 200 OK | Issue found and returned. |
| 404 Not Found | Identifier not present in any active queue. The issue may have completed, or it may not exist. |
| 503 Service Unavailable | Orchestrator state snapshot could not be produced. |
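As a rough sketch of how a client might consume this endpoint, the code below branches on status and treats 404 as "not in any active queue". It assumes Python's standard urllib; the function names describe_issue and fetch_issue and the base_url parameter are illustrative choices, not part of Sortie.

```python
import json
import urllib.error
import urllib.request
from typing import Optional
from urllib.parse import quote


def describe_issue(detail: dict) -> str:
    """Summarize an issue-detail payload based on its status field."""
    ident = detail["issue_identifier"]
    if detail["status"] == "running":
        run = detail["running"]
        return f"{ident} is running (turn {run['turn_count']}, last event {run['last_event']})"
    # Otherwise the issue is in the retry queue and "retry" is populated.
    retry = detail["retry"]
    return f"{ident} retries at {retry['due_at']} (attempt {retry['attempt']}): {retry['error']}"


def fetch_issue(base_url: str, identifier: str) -> Optional[dict]:
    """Fetch issue detail; return None on 404 (not in any active queue)."""
    url = f"{base_url}/api/v1/{quote(identifier, safe='')}"
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise  # 503 and anything unexpected propagate to the caller
```

Because a 404 cannot distinguish a completed issue from one that never existed, the sketch returns None for both and leaves the interpretation to the caller.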
POST /api/v1/refresh: Trigger poll cycle
Queues an immediate poll and reconciliation cycle. Useful for CI integrations that push issues and want Sortie to pick them up without waiting for the next poll interval.
curl -X POST http://localhost:8080/api/v1/refresh
Response (202 Accepted)
{
  "queued": true,
  "coalesced": false,
  "requested_at": "2026-03-26T14:30:05Z",
  "operations": ["poll", "reconcile"]
}
coalesced: true means a refresh was already pending when your request arrived. The request was not lost — it merged with the existing pending signal. You don't need to retry.
Response (409 Conflict, draining)
If Sortie is shutting down, the refresh is rejected:
{
  "queued": false,
  "coalesced": false,
  "requested_at": "2026-03-26T14:30:05Z",
  "operations": []
}
Status codes
| Code | Meaning |
|---|---|
| 202 Accepted | Refresh queued (or coalesced with a pending refresh). |
| 405 Method Not Allowed | Used a method other than POST. |
| 409 Conflict | Server is draining; refresh rejected. |
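A CI step that triggers a refresh might look roughly like the sketch below, which maps the documented responses to a short outcome label. It assumes Python's standard urllib; interpret_refresh, trigger_refresh, and base_url are hypothetical names chosen for this example.

```python
import json
import urllib.error
import urllib.request


def interpret_refresh(status: int, body: dict) -> str:
    """Map a refresh response to a short outcome label."""
    if status == 202:
        # coalesced means the request merged with an already-pending refresh.
        return "coalesced" if body.get("coalesced") else "queued"
    if status == 409:
        return "draining"
    return f"unexpected status {status}"


def trigger_refresh(base_url: str) -> str:
    """POST to the refresh endpoint and return the outcome label."""
    req = urllib.request.Request(f"{base_url}/api/v1/refresh", data=b"", method="POST")
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return interpret_refresh(resp.status, json.load(resp))
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses. Only the 409 body is parsed
        # here; other statuses are reported by code alone.
        body = json.load(err) if err.code == 409 else {}
        return interpret_refresh(err.code, body)
```

Both "queued" and "coalesced" count as success, so a caller has no reason to retry on either.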
GET /livez: Liveness probe
Lightweight liveness check for container orchestrators. Returns 200 when the process is alive, 503 when draining.
curl http://localhost:8080/livez
Response (200 OK)
{
  "status": "pass"
}
Response (503, draining)
{
  "status": "fail"
}
GET /readyz: Readiness probe
Deep readiness check that validates database connectivity, preflight configuration, and workflow loading. Use this for Kubernetes readiness probes or load balancer health checks.
curl http://localhost:8080/readyz
Response (200 OK)
{
  "status": "pass",
  "version": "0.5.0",
  "uptime_seconds": 3742.8,
  "checks": {
    "database": "pass",
    "preflight": "pass",
    "workflow": "pass"
  }
}
Response (503, one or more checks failed)
{
  "status": "fail",
  "version": "0.5.0",
  "uptime_seconds": 3742.8,
  "checks": {
    "database": "pass",
    "preflight": "fail",
    "workflow": "pass"
  }
}
Each check is independent. status is "pass" only when every individual check passes.
| Check | What it validates |
|---|---|
| database | SQLite database is accessible and responds to a ping. |
| preflight | Dispatch preflight validation is passing (agent binary exists, workspace root is writable, etc.). |
| workflow | Workflow file has been successfully loaded at least once. |
Status codes
| Code | Meaning |
|---|---|
| 200 OK | All checks pass. |
| 503 Service Unavailable | One or more checks failed, or server is draining. |
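When a readiness probe fails, it helps to know which check is responsible. The sketch below pulls the failing check names out of a /readyz body. It assumes Python's standard urllib and that a 503 response still carries a JSON body; failed_checks, probe_readyz, and base_url are illustrative names.

```python
import json
import urllib.error
import urllib.request


def failed_checks(readyz_body: dict) -> list:
    """Return the names of checks that did not pass, sorted alphabetically."""
    checks = readyz_body.get("checks", {})
    return sorted(name for name, result in checks.items() if result != "pass")


def probe_readyz(base_url: str) -> tuple:
    """Return (ready, failed_check_names) for the server at base_url."""
    try:
        with urllib.request.urlopen(f"{base_url}/readyz", timeout=5) as resp:
            return True, failed_checks(json.load(resp))
    except urllib.error.HTTPError as err:
        if err.code != 503:
            raise
        # A 503 may mean failed checks or draining, so readiness is reported
        # separately from the list of failing checks.
        return False, failed_checks(json.load(err))
```

Returning readiness as its own value covers the draining case, where the server answers 503 even though no individual check is failing.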
GET /metrics: Prometheus metrics
Standard Prometheus text exposition format. Available on the same port as all other endpoints when the HTTP server is enabled.
curl http://localhost:8080/metrics
Returns text/plain with Prometheus metric families. For the full metric catalog — names, labels, types, PromQL examples, and cardinality model — see Prometheus metrics reference.
Error envelope
All JSON API errors use a consistent structure:
{
  "error": {
    "code": "issue_not_found",
    "message": "issue identifier \"XYZ-999\" not found in current state"
  }
}
Error codes
| Code | HTTP Status | Meaning |
|---|---|---|
| issue_not_found | 404 | The requested issue identifier is not in any active queue. |
| snapshot_unavailable | 503 | The orchestrator could not produce a state snapshot. |
| method_not_allowed | 405 | The HTTP method is not supported on this endpoint. |
| internal_error | 500 | Unexpected server error (e.g., JSON serialization failure). |
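Because every JSON API error shares this envelope, a client can decode errors in one place. The following is a minimal sketch; the ApiError class, the parse_error function, and the "unknown" fallback code are assumptions made for this example rather than anything Sortie defines.

```python
import json


class ApiError(Exception):
    """Client-side representation of a Sortie JSON error envelope."""

    def __init__(self, status: int, code: str, message: str) -> None:
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message


def parse_error(status: int, raw_body: bytes) -> ApiError:
    """Decode an error response body into an ApiError."""
    try:
        err = json.loads(raw_body)["error"]
        return ApiError(status, err["code"], err["message"])
    except (ValueError, KeyError, TypeError):
        # Not a JSON envelope (for example, an HTML page from a proxy in
        # front of the server): keep the raw text under a placeholder code.
        return ApiError(status, "unknown", raw_body.decode("utf-8", "replace"))
```

Callers can then branch on the code attribute (for instance issue_not_found versus snapshot_unavailable) instead of matching on message text.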
Method enforcement
Every endpoint enforces its allowed HTTP method. Sending the wrong method returns 405 Method Not Allowed with an Allow header indicating the correct method, and a JSON error envelope — not plain text.
curl -X DELETE http://localhost:8080/api/v1/state
{
  "error": {
    "code": "method_not_allowed",
    "message": "method DELETE is not allowed on this endpoint"
  }
}
The response includes the header Allow: GET (or Allow: POST for the refresh endpoint).
Endpoint summary
| Method | Path | Description | Content-Type |
|---|---|---|---|
| GET | / | HTML dashboard | text/html |
| GET | /livez | Liveness probe | application/json |
| GET | /readyz | Readiness probe | application/json |
| GET | /api/v1/state | Full system state snapshot | application/json |
| GET | /api/v1/{identifier} | Per-issue detail | application/json |
| POST | /api/v1/refresh | Trigger immediate poll cycle | application/json |
| GET | /metrics | Prometheus metrics | text/plain |