How to Monitor with Logs
Sortie emits structured logs to stderr. The default format is key=value text; an optional json mode produces newline-delimited JSON for log aggregation systems. Logs are always on — no configuration needed. They are the first place to look when something goes wrong.
Note
Sortie has no built-in log file or rotation option. Logs go to stderr only — file retention and rotation are the responsibility of your runtime environment. Use journald on systemd hosts, a Docker logging driver in containers, or a process supervisor such as supervisord elsewhere.
Prerequisites
- Sortie installed and running
That’s it. Logs work with zero configuration.
Understand the log format
Sortie supports two log formats: text (default) and JSON.
Text format (default)
Sortie uses slog.TextHandler. Every line is a flat key=value record:
time=2026-03-26T14:30:01.305+00:00 level=INFO msg="tick completed" candidates=2 dispatched=2 running=2 retrying=0JSON format
When --log-format json is active (or logging.format: json in the workflow file), each line is a self-contained JSON object:
{"time":"2026-03-26T14:30:01.305+00:00","level":"INFO","msg":"tick completed","candidates":2,"dispatched":2,"running":2,"retrying":0}JSON format is designed for log aggregation systems (Loki, Datadog, CloudWatch, ELK) that ingest newline-delimited JSON. See switch to JSON format below.
Common fields
Three structural fields appear on every line in both formats:
time— UTC timestamplevel—INFO,WARN,ERROR, orDEBUGmsg— human-readable message
Context fields appear on all issue-related lines, added automatically by the logging subsystem:
issue_id— tracker-internal ID (e.g.,abc123)issue_identifier— human-readable ticket key (e.g.,MT-649)session_id— agent session identifier (present once a session starts)
The one rule you need to remember: WARN means Sortie is handling it. ERROR means you need to.
WARN lines indicate automatic recovery — a retry is scheduled, a transient failure is being worked around. ERROR lines mean Sortie gave up and needs operator attention. If you grep for nothing else, grep for level=ERROR.
Control log verbosity
By default Sortie logs at INFO level. Use the --log-level flag to change it:
# See debug-level detail: poll decisions, state transitions, adapter calls
sortie --log-level debug ./WORKFLOW.md
# Reduce noise in production — only warnings and errors
sortie --log-level warn ./WORKFLOW.mdAccepted values: debug, info, warn, error. The flag applies before the workflow file is loaded, so startup messages reflect the requested level immediately.
Alternatively, set the level in the workflow file:
logging:
level: debugThe CLI flag takes precedence when both are set. Changing logging.level in the workflow file requires a restart — it is not picked up by dynamic reload.
Key log messages to watch
Here are the log messages that matter most, grouped by lifecycle phase.
Poll cycle
time=2026-03-26T14:30:01.305+00:00 level=INFO msg="tick completed" candidates=2 dispatched=2 running=2 retrying=0This is the heartbeat. It fires every poll interval and tells you how many issues were found (candidates), how many were dispatched this tick (dispatched), how many agents are active (running), and how many issues are awaiting retry (retrying). When candidates=0 dispatched=0, Sortie is idle.
Workspace preparation
time=2026-03-26T14:30:02.150+00:00 level=INFO msg="workspace prepared" issue_id=abc123 issue_identifier=MT-649 workspace=/tmp/sortie_workspaces/MT-649Sortie created (or reused) a workspace directory and ran any configured hooks. The workspace field shows the absolute path.
Agent session
time=2026-03-26T14:30:03.420+00:00 level=INFO msg="agent session started" issue_id=abc123 issue_identifier=MT-649 session_id=session-abc-001
time=2026-03-26T14:30:03.500+00:00 level=INFO msg="turn started" issue_id=abc123 issue_identifier=MT-649 turn_number=1 max_turns=5
time=2026-03-26T14:31:45.800+00:00 level=INFO msg="turn completed" issue_id=abc123 issue_identifier=MT-649 turn_number=1 max_turns=5Each issue gets a session with one or more turns. turn_number and max_turns show where the agent is in its work budget.
Tool calls
time=2026-03-26T14:31:12.300+00:00 level=INFO msg="tool call completed" issue_id=abc123 issue_identifier=MT-649 session_id=session-abc-001 tool=tracker_api duration_ms=145 result=success
time=2026-03-26T14:31:13.100+00:00 level=INFO msg="tool call completed" issue_id=abc123 issue_identifier=MT-649 session_id=session-abc-001 tool=tracker_api duration_ms=89 result=error error="tracker_auth_error: invalid API key"Every tool invocation gets a log line with the tool name, wall-clock duration, and outcome. The error field only appears when result=error.
Worker exit
time=2026-03-26T14:35:20.100+00:00 level=INFO msg="worker exiting" issue_id=abc123 issue_identifier=MT-649 exit_kind=normal turns_completed=5The worker finished its loop. exit_kind=normal means the agent completed its turns without error.
Handoff transition
time=2026-03-26T14:35:21.500+00:00 level=INFO msg="handoff transition succeeded, releasing claim" issue_id=abc123 issue_identifier=MT-649 session_id=session-abc-001 handoff_state=DoneSortie transitioned the issue to the configured handoff_state in the tracker and released its claim. The issue is done.
Tracker comments
time=2026-03-26T14:32:00.200+00:00 level=INFO msg="dispatch comment posted" issue_id=abc123 issue_identifier=MT-649
time=2026-03-26T14:35:21.600+00:00 level=INFO msg="tracker comment posted" issue_id=abc123 issue_identifier=MT-649 lifecycle=completionWhen tracker.comments flags are enabled, Sortie posts audit comments on the tracker issue at dispatch, completion, or failure. INFO means the comment was delivered. If the comment API call fails:
time=2026-03-26T14:35:21.600+00:00 level=WARN msg="tracker comment failed" issue_id=abc123 issue_identifier=MT-649 lifecycle=completion error="tracker: tracker_auth_error: POST /rest/api/3/issue/abc123/comment: 403"WARN — the comment failed but the session lifecycle is unaffected. Check API token permissions if persistent.
Errors and retries
time=2026-03-26T14:35:22.000+00:00 level=WARN msg="worker run failed, scheduling retry" issue_id=abc123 issue_identifier=MT-649 session_id=session-abc-001 error="agent: turn_timeout: context deadline exceeded" next_attempt=2 delay_ms=20000WARN with scheduling retry — Sortie is recovering automatically. The next_attempt and delay_ms fields tell you when the retry fires.
time=2026-03-26T14:35:22.500+00:00 level=ERROR msg="worker run failed, non-retryable, releasing claim" issue_id=abc123 issue_identifier=MT-649 session_id=session-abc-001 error="agent: agent_not_found: claude not found in PATH"ERROR — Sortie gave up. This issue won’t be retried. Fix the underlying problem (in this case, install the agent binary) and Sortie will pick the issue up on the next poll.
Dispatch preflight failures
time=2026-03-26T14:30:01.300+00:00 level=ERROR msg="dispatch preflight failed" error="dispatch preflight failed: unsupported tracker kind: \"gitlab\""This fires before any work is dispatched. It means your workflow configuration is invalid. Sortie can’t dispatch anything until you fix the config and restart.
Common grep patterns
These commands work against the text log format. For JSON logs, see JSON log filtering with jq below.
Follow a specific issue across its entire lifecycle:
grep 'issue_identifier=MT-649' sortie.logFind all errors that need your attention:
grep 'level=ERROR' sortie.logFind retries (to see which issues are struggling):
grep 'scheduling retry' sortie.logWatch dispatches in real time:
tail -f sortie.log | grep 'tick completed'Find tool call failures:
grep 'tool call completed.*result=error' sortie.logFollow a specific agent session across turns and tool calls:
grep 'session_id=session-abc-001' sortie.logSwitch to JSON format
For deployments that route logs to an aggregation system, switch to JSON output:
sortie --log-format json ./WORKFLOW.mdOr set it in the workflow file and leave the CLI unchanged:
logging:
format: jsonThe CLI flag takes precedence when both are set. Both formats carry the same structured fields — only the serialization differs.
JSON log filtering with jq
When running with --log-format json, use jq instead of grep for precise field-level filtering.
Follow a specific issue:
jq 'select(.issue_identifier == "MT-649")' sortie.logFind all errors:
jq 'select(.level == "ERROR")' sortie.logFind retries with their delay:
jq 'select(.msg | contains("scheduling retry")) | {issue: .issue_identifier, next_attempt, delay_ms}' sortie.logWatch dispatches in real time:
tail -f sortie.log | jq 'select(.msg == "tick completed")'Find tool call failures with duration:
jq 'select(.msg == "tool call completed" and .result == "error") | {tool, error, duration_ms}' sortie.logExtract a timeline for a specific session:
jq 'select(.session_id == "session-abc-001") | {time, msg, level}' sortie.logRedirect logs to a file
Sortie logs to stderr by default. Redirect to a file with shell redirection:
sortie ./WORKFLOW.md 2>sortie.logOr use tee to keep both console and file output:
sortie ./WORKFLOW.md 2>&1 | tee sortie.logFor systemd services, logs go to journald automatically. Watch them in real time with:
journalctl -u sortie -fOr filter for errors only:
journalctl -u sortie -p errWhat we covered
You now know how to read Sortie’s structured logs in both text and JSON formats, follow specific issues through the dispatch lifecycle, distinguish between warnings (automatic recovery) and errors (needs your attention), switch to JSON for log aggregation, filter JSON logs with jq, find tool call failures, and persist logs to a file. For the complete error catalog, see the error reference. For metric-based monitoring with Prometheus and Grafana, see Monitor with Prometheus. For real-time visual monitoring, see the dashboard reference.
Was this page helpful?