Each section below covers one failure — the log line you see, why it happens, and what to do. For the full error catalog with every error kind and retry formula, see the error reference.
Agent won't start
level=ERROR msg="worker run failed, non-retryable, releasing claim" error="agent: agent_not_found: claude not found in PATH"
The agent binary isn't installed or isn't on PATH.
- Check whether the binary exists:

      which claude

- If it's installed under a different name or path, set agent.command:

      agent:
        kind: claude-code
        command: /usr/local/bin/claude-code

- For SSH workers, the binary must exist on every remote host. Exit code 127 in logs means the remote host is missing it:

      ssh build01.internal "which claude && echo ok"

- Confirm the fix:

      sortie validate ./WORKFLOW.md
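The local PATH check can be wrapped in a small helper that prints where the binary resolves. This is a minimal sketch, not part of Sortie: the function name `check_agent_binary` is hypothetical, and `claude` is only the default taken from the log line in this section.

```shell
# Hypothetical helper (not part of Sortie): report whether an agent binary
# can be resolved on the current PATH.
check_agent_binary() {
    bin="${1:-claude}"
    # command -v is the POSIX way to resolve a command name.
    if resolved="$(command -v "$bin" 2>/dev/null)"; then
        echo "ok: $bin -> $resolved"
        return 0
    fi
    echo "missing: $bin is not on PATH"
    return 1
}
```

Call it as `check_agent_binary claude`. For an SSH worker, the same idea applies with `ssh build01.internal 'command -v claude'`; keep in mind that a non-interactive SSH session may see a different PATH than a login shell on that host.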
Agent crashes on authentication
level=ERROR msg="worker run failed, scheduling retry" error="agent: port_exit: exit status 1"
Workers start and immediately crash. The actual cause — a missing ANTHROPIC_API_KEY — lives inside the agent subprocess, not in Sortie's error output. This is the most common deployment failure.
- Verify the variable is set:

      echo "${ANTHROPIC_API_KEY:-(unset)}"

- For AWS Bedrock or Google Vertex AI, verify all required variables are set. See environment variables reference for the full list.

- Run with --log-level debug to see the agent's stderr, which contains the actual auth error.
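When several variables are involved, or when the terminal output may end up in a ticket or chat, it can help to check that each variable is set without printing its value. The following is a minimal sketch under that assumption; `check_vars` is a hypothetical helper, not a Sortie command.

```shell
# Hypothetical helper: report which of the named variables are set and
# non-empty, without echoing their values.
# Arguments must be valid shell variable names (they are passed to eval).
check_vars() {
    missing=0
    for name in "$@"; do
        # Indirect lookup that works in POSIX sh.
        eval "value=\${$name:-}"
        if [ -n "$value" ]; then
            echo "$name: set (${#value} chars)"
        else
            echo "$name: unset or empty"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_vars ANTHROPIC_API_KEY` prints one status line and returns non-zero if the variable is missing. The helper inspects the current shell, so a variable that is set but not exported still shows as set here even though a child process would not see it.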
Tracker returns 401 or 403
level=ERROR msg="poll failed" error="tracker: tracker_auth_error: HTTP 401: Unauthorized"
The API token is wrong, expired, or lacks required permissions. This error is non-retryable — Sortie stops polling until you fix it.
- Verify the environment variable resolves to a non-empty value:

      echo "${SORTIE_JIRA_API_KEY:-(unset)}"

- Test the token directly:

      curl -s -H "Authorization: Bearer $SORTIE_JIRA_API_KEY" \
        "https://yourcompany.atlassian.net/rest/api/3/myself" | head -5

- If you use handoff_state, the token needs write permissions: write:jira-work (classic) or write:issue:jira (granular).
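To tell the 401 and 403 cases apart quickly, one option is to look only at the HTTP status code of the token test. The sketch below maps a status code to the likely cause using general HTTP semantics rather than anything Sortie-specific; `explain_tracker_status` is a hypothetical name, and the curl call is left commented out so the snippet has no network side effects.

```shell
# Hypothetical helper: translate an HTTP status code from the tracker
# into the likely cause described in this section.
explain_tracker_status() {
    case "$1" in
        200) echo "token accepted" ;;
        401) echo "token missing, wrong, or expired" ;;
        403) echo "token valid but lacks required permissions" ;;
        *)   echo "unexpected status $1" ;;
    esac
}

# Example call, reusing the URL and header from the token test above:
# status="$(curl -s -o /dev/null -w '%{http_code}' \
#     -H "Authorization: Bearer $SORTIE_JIRA_API_KEY" \
#     "https://yourcompany.atlassian.net/rest/api/3/myself")"
# explain_tracker_status "$status"
```

Here `-o /dev/null` discards the response body and `-w '%{http_code}'` prints only the status code, which keeps the output short enough to paste into an incident note.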
Template render fails
level=ERROR msg="template render error in WORKFLOW.md (line 24): can't evaluate field titel in type map[string]any"
Sortie runs templates in strict mode — unknown variables are hard errors. Three common causes:
- Typo in a field name. Check the name against the variable table. The error message names the exact field and line.

- Unguarded nil field. .issue.parent is nil when no parent exists. Wrap it:

      {{ if .issue.parent }}{{ .issue.parent.identifier }}{{ end }}

- Dot rebinding inside range. Inside {{ range .issue.labels }}, . is the current element. Use {{ $.issue.identifier }} to reach the root.
Run sortie validate ./WORKFLOW.md after every template edit to catch these before runtime.
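As a complement to validation, a quick way to spot a misspelled field is to list every `.issue.*` reference in the file with a usage count: a name that appears once next to a near-identical name that appears often is a likely typo. This is a rough sketch; `list_issue_fields` is a hypothetical helper, it only covers references under `.issue`, and it relies on `grep -o`, which GNU and BSD grep both provide.

```shell
# Hypothetical helper: list every .issue.* reference in a workflow file
# with a usage count, rarest first, so one-off misspellings stand out.
list_issue_fields() {
    grep -oE '\.issue\.[A-Za-z_][A-Za-z0-9_.]*' "$1" | sort | uniq -c | sort -n
}
```

Run it as `list_issue_fields ./WORKFLOW.md` and compare the names against the variable table.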
Workspace won't create
level=ERROR msg="workspace create: permission denied: /opt/sortie_workspaces/PROJ-42"
Three variants:
- Permission denied. The process user can't write to workspace.root. Fix permissions or change the root to a writable path like ~/sortie-workspaces.

- Containment violation (path escapes root). An issue identifier produced a path outside the workspace root, which is a security boundary. Investigate the identifiers in your tracker.

- Disk full. Check with df -h /opt/sortie_workspaces.
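The permission and disk-space checks can be combined into one command run as the same user that runs Sortie. The following is a minimal sketch, assuming the workspace root is a local directory; `check_workspace_root` is a hypothetical helper and does not cover the containment case.

```shell
# Hypothetical helper: check writability and free space for a workspace
# root directory, as the current user.
check_workspace_root() {
    root="$1"
    if [ ! -d "$root" ]; then
        echo "not a directory: $root"
        return 1
    fi
    if [ ! -w "$root" ]; then
        echo "not writable by $(id -un): $root"
        return 1
    fi
    # df -Pk prints one POSIX-format line per filesystem in 1024-byte blocks;
    # column 4 of the second line is the available space.
    avail_kb="$(df -Pk "$root" | awk 'NR==2 {print $4}')"
    echo "writable, ${avail_kb} KB free: $root"
}
```

For the path in the log line above, that would be `check_workspace_root /opt/sortie_workspaces`. Running it under a different account than the Sortie process gives misleading results, since the writability test depends on the calling user.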
Hook script fails
level=WARN msg="worker run failed, scheduling retry" error="hook after_create: run: exit status 128"
A hook exited non-zero. after_create and before_run failures are fatal for the attempt; after_run and before_remove are logged but ignored.
- Run with --log-level debug to see the hook's stdout and stderr, which Sortie captures.

- Test the hook manually:

      mkdir /tmp/test-ws && cd /tmp/test-ws
      git clone --depth 1 git@github.com:acme/backend.git .

  Common causes: SSH key not forwarded, wrong repo URL, missing dependencies.

- For timeout errors, increase hooks.timeout_ms in WORKFLOW.md.
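For repeated manual tests, it can be convenient to run the hook command in a fresh temporary directory each time and print its exit status. The sketch below assumes the hook can be expressed as a single shell command string; `try_hook` is a hypothetical helper, and it does not reproduce whatever environment Sortie sets up for real hook runs.

```shell
# Hypothetical helper: run a hook command in a throwaway directory and
# report its exit status, roughly mimicking an empty workspace.
try_hook() {
    ws="$(mktemp -d)" || return 1
    hook_rc=0
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$ws" && sh -c "$1" ) || hook_rc=$?
    echo "hook exited with status $hook_rc in $ws"
    return "$hook_rc"
}
```

For example, `try_hook 'git clone --depth 1 git@github.com:acme/backend.git .'` reproduces the clone from the manual test. The temporary directory is left in place so its contents can be inspected afterwards. An exit status of 128, as in the log line above, is what git typically returns on a fatal error such as a failed SSH authentication.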
Issues not being dispatched
level=INFO msg="tick completed" candidates=0 dispatched=0 running=0 retrying=0
Sortie is polling but finds nothing to dispatch.
- State names must match exactly. Verify tracker.active_states matches your tracker (case-sensitive). "To Do" and "to do" are different states.

- Use dry-run to see what Sortie would dispatch:

      sortie --dry-run ./WORKFLOW.md

  Each candidate gets a would_dispatch or skip_reason field in the log.

- Concurrency cap reached. If running equals agent.max_concurrent_agents, new issues wait. Increase the cap or wait for running agents to finish.

- Query filter too narrow. A typo in tracker.query_filter returns zero results. Use --dry-run --log-level debug to see the full query.
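With many candidates, the dry-run output is easier to read as a tally of skip reasons. The sketch below assumes the dry-run log uses the same key=value layout as the log lines on this page, with skip_reason appearing either bare or double-quoted; that layout is an assumption, and `tally_skip_reasons` is a hypothetical helper.

```shell
# Hypothetical helper: count skip_reason values in key=value log lines
# read from stdin, most frequent first.
# Assumes fields look like skip_reason=value or skip_reason="quoted value".
tally_skip_reasons() {
    grep -oE 'skip_reason=("[^"]*"|[^ ]+)' | sort | uniq -c | sort -rn
}

# Example (2>&1 covers logs written to either stream):
# sortie --dry-run ./WORKFLOW.md 2>&1 | tally_skip_reasons
```

If one reason dominates the tally, that usually points at a single cause from the list above, such as a state-name mismatch, rather than a per-issue problem.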
Sortie won't start at all
dispatch preflight failed: tracker.kind is required
Sortie validates the config at startup and reports all failures at once. Run sortie validate ./WORKFLOW.md to see every problem. The most common missing fields:
| Field | Required by |
|---|---|
| tracker.kind | Always |
| tracker.project | Jira adapter |
| tracker.api_key | Jira adapter (after $VAR expansion) |
| active_states or terminal_states | At least one non-empty |
If $VAR references aren't resolving, verify the variables are exported in the shell that runs Sortie:
env | grep SORTIE
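A variable that is set in the shell but not exported shows up in `echo` yet is invisible to child processes, which is a common reason a $VAR reference resolves to nothing. The short sketch below illustrates the difference; `is_exported` is a hypothetical helper and `DEMO_TOKEN` is a placeholder name.

```shell
# Hypothetical helper: succeed only if the named variable is exported,
# i.e. visible to child processes such as Sortie.
is_exported() {
    env | grep -q "^$1="
}

unset DEMO_TOKEN
DEMO_TOKEN=abc            # set in this shell only
is_exported DEMO_TOKEN || echo "DEMO_TOKEN is not exported"
export DEMO_TOKEN         # now inherited by child processes
is_exported DEMO_TOKEN && echo "DEMO_TOKEN is exported"
```

The same check applies to the variables Sortie reads: if `is_exported SORTIE_JIRA_API_KEY` fails while `echo` shows a value, add `export` where the variable is defined.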
See the workflow configuration reference for every field, default, and constraint.