Marathon: long-running agent tasks (CLI)
Marathon is the Runtype CLI’s harness for long-running, multi-session agent tasks. It runs a saved agent across many sessions in your terminal with real-time streaming, local file and shell tools, automatic context compaction, checkpoints between sessions, and resumable state on disk.
Starting a marathon
Start a run with runtype marathon <agent>. Useful flags:
- --max-sessions <n> and --max-cost <usd>: budget caps (defaults: 50 sessions)
- --model <modelId>: override the agent’s configured model
- --name <task>: task name used for the state files (defaults to the agent name)
- --resume [message] / --fresh: continue from saved state, or start over
- --sandbox <provider>: enable sandboxed code execution
- --session-search / --no-session-search: session context indexing and the search_session_history local tool are enabled by default; pass --no-session-search to opt out
State persists under ~/.runtype/projects/<hash>/marathons/: a <task>.json snapshot, a <task>.tree.jsonl session tree (see below), a <task>.events/ directory of raw stream events, and a <task>/outputs/ artifact store for large offloaded tool outputs referenced by the context ledger. Keep the outputs directory with the .json and .tree.jsonl files when moving or backing up a marathon task.
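The layout above can be sketched as a small TypeScript helper. This is an illustration only: marathonStatePaths and backupSet are hypothetical names, not part of the Runtype CLI or SDK, and the project hash is taken as an input because how Marathon derives it is not described here. Paths are joined POSIX-style.

```typescript
// Hypothetical helper, not a Runtype API: spells out the documented state layout.
interface MarathonStatePaths {
  snapshot: string; // <task>.json
  tree: string;     // <task>.tree.jsonl
  events: string;   // <task>.events/
  outputs: string;  // <task>/outputs/
}

function marathonStatePaths(
  homeDir: string,
  projectHash: string, // assumption: supplied by the caller, not computed here
  task: string,
): MarathonStatePaths {
  const base = `${homeDir}/.runtype/projects/${projectHash}/marathons`;
  return {
    snapshot: `${base}/${task}.json`,
    tree: `${base}/${task}.tree.jsonl`,
    events: `${base}/${task}.events/`,
    outputs: `${base}/${task}/outputs/`,
  };
}

// The three locations the text says to keep together when moving or backing up a task.
function backupSet(paths: MarathonStatePaths): string[] {
  return [paths.snapshot, paths.tree, paths.outputs];
}
```

Copying only the snapshot would leave the context ledger pointing at offloaded tool outputs that no longer exist, which is why the outputs directory travels with the other two files.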
Steering while the agent works
Press Enter at any time while the agent is streaming to open the steering composer:
- Enter queues your message. The in-flight session wraps up at the next tool call and your message is delivered at the start of the next session, usually within seconds.
- Tab toggles delivery timing between “next turn” and “after all work” (a follow-up that fires when the task would otherwise finish).
- Esc closes the composer and keeps your draft.
A counter above the status bar shows how many messages are queued. Note that early in a run the agent may be in a planning phase where file writes are restricted to the plan; a steer that asks for immediate writes takes effect once the plan is updated.
To interrupt the agent entirely, press Esc twice (outside the composer) while it works. The in-flight session aborts immediately, with the cost and progress observed so far preserved, and the marathon lands at a “stopped” checkpoint. Any queued steering messages and the composer draft are restored into the checkpoint input, so you can edit them and press Enter to continue with new instructions, or press Enter on an empty input to exit. A stopped task resumes later with --resume.
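One way to picture the queueing behavior described in this section is the sketch below. It is a mental model under stated assumptions, not Marathon's implementation: the SteerQueue class, its method names, and the "next-turn" / "after-all-work" string values are all hypothetical.

```typescript
// Hypothetical model of the steering queue; names are illustrative only.
type Delivery = "next-turn" | "after-all-work";

interface QueuedSteer {
  text: string;
  delivery: Delivery;
}

class SteerQueue {
  private items: QueuedSteer[] = [];

  // Enter in the composer queues a message; Tab would toggle the delivery timing.
  enqueue(text: string, delivery: Delivery = "next-turn"): void {
    this.items.push({ text, delivery });
  }

  // The number shown by the counter above the status bar.
  get count(): number {
    return this.items.length;
  }

  // Remove and return every queued message with the given timing.
  private take(delivery: Delivery): string[] {
    const taken = this.items.filter((m) => m.delivery === delivery).map((m) => m.text);
    this.items = this.items.filter((m) => m.delivery !== delivery);
    return taken;
  }

  // Delivered at the start of the next session.
  takeForNextSession(): string[] {
    return this.take("next-turn");
  }

  // Delivered when the task would otherwise finish.
  takeFollowUps(): string[] {
    return this.take("after-all-work");
  }

  // On a double-Esc interrupt, everything goes back to the checkpoint input for editing.
  restoreAll(): string[] {
    const all = this.items.map((m) => m.text);
    this.items = [];
    return all;
  }
}
```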
Checkpoints
After each session (and at the end of the task), marathon pauses at a checkpoint. Press Enter to continue, type a message to steer the next session, or use a slash command such as /tree or /fork (described below).
Session branches: /tree and /fork
Marathon records the conversation as a tree in <task>.tree.jsonl, so steering in a new direction never loses the original timeline.
- /fork lists the user messages on the current branch. Pick one, then type your new instruction: the conversation branches from that point and the next session runs with the truncated history plus your new steer. The original branch stays in the tree.
- /tree shows the session tree with branches indented, the current head marked, and ledger metadata rows such as artifacts or compaction summaries labeled as metadata. Select a checkpoint row to switch the conversation head: the next session continues from that checkpoint with that branch’s history.
A typical flow: the agent went down the wrong path two sessions ago. At the next checkpoint, type /fork, pick the message where things were still on track, describe the better approach, and continue. If the new branch turns out worse, /tree switches you back.
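The branching model behind /fork and /tree can be sketched as a parent-pointer tree with a movable head. The code below is a minimal illustration, not the on-disk format of <task>.tree.jsonl, which is not documented here. The SessionTree class and its fields are hypothetical, and the sketch assumes the picked message is kept and the new steer is appended after it.

```typescript
// Hypothetical in-memory model of a conversation tree; not the real file format.
type Role = "user" | "assistant";

interface TreeNode {
  id: number;
  parentId: number | null;
  role: Role;
  text: string;
}

class SessionTree {
  private nodes: TreeNode[] = [];
  head: number | null = null;

  // Append a message under the current head and move the head to it.
  append(role: Role, text: string): number {
    const id = this.nodes.length;
    this.nodes.push({ id, parentId: this.head, role, text });
    this.head = id;
    return id;
  }

  // Messages from the root down to the given node (the current head by default).
  history(from: number | null = this.head): TreeNode[] {
    const out: TreeNode[] = [];
    let cur = from;
    while (cur !== null) {
      const node = this.nodes[cur];
      out.unshift(node);
      cur = node.parentId;
    }
    return out;
  }

  // What /fork lists: the user messages on the current branch.
  forkCandidates(): TreeNode[] {
    return this.history().filter((n) => n.role === "user");
  }

  // What /tree does when a row is selected: move the head, delete nothing.
  switchTo(id: number): void {
    if (id < 0 || id >= this.nodes.length) throw new Error(`unknown node ${id}`);
    this.head = id;
  }

  // Branch from an earlier message; the old branch stays reachable by its node ids.
  fork(fromId: number, steer: string): number {
    this.switchTo(fromId);
    return this.append("user", steer);
  }
}
```

Because a fork only adds nodes and moves the head, switching back to the earlier branch is a matter of pointing the head at one of its nodes again.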
Resuming
runtype marathon <agent> --name <task> --resume continues a saved task with its full history. Marathon shows a resume checkpoint first so you can steer, switch branches, or change settings before the next session starts.
Playbooks
A playbook replaces marathon’s default workflow (research, planning, execution) with milestones you define: their instructions, models, completion rules, and runtime guardrails. Pass one by name or file path with --playbook.
Marathon resolves the name as an exact file path first, then .runtype/marathons/playbooks/<name>.yaml|yml|json|ts|mts in the current project, then the same paths under ~/.runtype/.
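The lookup order can be written out as a candidate list. This sketch makes two assumptions that the text does not confirm: that extensions are tried in the order listed, and that "the same paths under ~/.runtype/" means ~/.runtype/marathons/playbooks/. The function names and the exists callback are hypothetical.

```typescript
// Hypothetical sketch of the documented lookup order; not the CLI's actual resolver.
const PLAYBOOK_EXTENSIONS = ["yaml", "yml", "json", "ts", "mts"];

function playbookCandidates(name: string, projectDir: string, homeDir: string): string[] {
  // 1. The name as an exact file path.
  const candidates: string[] = [name];
  // 2. The current project, then 3. the user's home directory (assumed layout).
  for (const root of [`${projectDir}/.runtype`, `${homeDir}/.runtype`]) {
    for (const ext of PLAYBOOK_EXTENSIONS) {
      candidates.push(`${root}/marathons/playbooks/${name}.${ext}`);
    }
  }
  return candidates;
}

// Return the first candidate that exists, using a caller-supplied existence check.
function resolvePlaybook(
  name: string,
  projectDir: string,
  homeDir: string,
  exists: (path: string) => boolean,
): string | undefined {
  return playbookCandidates(name, projectDir, homeDir).find(exists);
}
```

Under this ordering, a project-level playbook shadows a home-directory playbook of the same name.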
Milestones
Each milestone is a phase the agent works through in order. A minimal playbook:
Milestone fields:
Top-level fields beyond milestones: policy (below), stallPolicy (below), verification (require a passing check command before completion), rules (free-form standards applied to all milestones), and plugins (below).
Completion criteria
A completion criterion’s type can also be a hook reference (see below) for fully custom logic.
Policies
The policy block narrows what the agent can do at runtime. Policies only restrict: they never override global safety denies (for example, .env files and private keys stay blocked). The matching guidance is added to the agent’s prompt automatically, so the model is told the rules instead of discovering them through blocked calls.
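The "policies only restrict" rule amounts to checking global denies first and consulting the policy only afterwards. The sketch below illustrates that ordering. The isPathAllowed function, the PathRule type, and the two regular expressions are hypothetical stand-ins, since the actual deny list and policy fields are not given here.

```typescript
// Hypothetical sketch: a policy can narrow access but never widen it.
type PathRule = (path: string) => boolean;

// Illustrative global denies, loosely modeled on the examples in the text.
const globalDenies: PathRule[] = [
  (p) => /(^|\/)\.env(\.|$)/.test(p), // .env and .env.* files
  (p) => /\.(pem|key)$/.test(p),      // assumption: a stand-in pattern for private keys
];

function isPathAllowed(path: string, policyAllows: PathRule): boolean {
  // Global safety denies are checked first and cannot be overridden.
  if (globalDenies.some((deny) => deny(path))) return false;
  // The policy then restricts whatever remains.
  return policyAllows(path);
}
```

Even a policy that allows every path cannot unblock a .env file here, because the policy function is never consulted for a globally denied path.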
Stall policy
A session counts as empty when it makes no tool calls, even if the model produced text. The stallPolicy block controls what happens as empty sessions accumulate:
Milestone recovery messages trigger on the same counter, so a model that narrates intent without acting still gets corrected.
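The empty-session rule can be expressed in a few lines. This is a sketch under one assumption the text does not state: that the counter tracks consecutive empty sessions and resets when a session makes a tool call. SessionResult, isEmptySession, and emptyStreak are hypothetical names.

```typescript
// Hypothetical sketch of the empty-session counter described above.
interface SessionResult {
  toolCalls: number; // number of tool calls made during the session
  text: string;      // any text the model produced
}

// A session is empty when it makes no tool calls, regardless of text output.
function isEmptySession(session: SessionResult): boolean {
  return session.toolCalls === 0;
}

// Assumption: the counter is a streak that resets on any session with tool calls.
function emptyStreak(sessions: SessionResult[]): number {
  let streak = 0;
  for (const session of sessions) {
    streak = isEmptySession(session) ? streak + 1 : 0;
  }
  return streak;
}
```

Keying the check on tool calls rather than on output length is what lets a model that only narrates its plans still register as stalled.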
Hook references
Any behavior slot accepts a registered hook name instead of inline content. Names under builtin: expose the default workflow’s behaviors, so a playbook can reuse slices of the default instead of rebuilding them:
Referencing an unknown hook, or wiring a hook into the wrong slot, fails when the playbook loads rather than mid-run.
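Load-time validation of this kind can be sketched as a single pass over the playbook's hook references. The sketch is illustrative only: validateHookRefs, the HookRef shape, and the idea of a registry mapping each hook name to one slot are assumptions about how such a check might look, not the playbook loader's actual design.

```typescript
// Hypothetical load-time check: every reference must name a known hook
// registered for the slot it is wired into.
interface HookRef {
  slot: string; // the behavior slot in the playbook, e.g. "recovery"
  hook: string; // the referenced hook name, e.g. "acme:strict-recovery"
}

function validateHookRefs(registry: Map<string, string>, refs: HookRef[]): void {
  const problems: string[] = [];
  for (const ref of refs) {
    const registeredSlot = registry.get(ref.hook);
    if (registeredSlot === undefined) {
      problems.push(`unknown hook "${ref.hook}"`);
    } else if (registeredSlot !== ref.slot) {
      problems.push(`hook "${ref.hook}" is registered for "${registeredSlot}", not "${ref.slot}"`);
    }
  }
  // Fail once, up front, with every problem listed.
  if (problems.length > 0) {
    throw new Error(`playbook failed to load: ${problems.join("; ")}`);
  }
}
```

Collecting all problems before throwing means one load attempt reports every bad reference, instead of surfacing them one at a time.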
Plugins
YAML playbooks load custom hooks from JavaScript modules listed under plugins. Paths are relative to the playbook file and must stay inside its directory:
Hooks register under your own namespace (acme: here) and are referenced from any slot, for example recovery: acme:strict-recovery. Plugins run with your user privileges when the playbook loads, the same trust level as the verification commands a marathon already runs.
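The "must stay inside its directory" rule for plugin paths can be illustrated with a small resolver that rejects absolute paths and any ".." that climbs above the playbook directory. This is a sketch, not the loader's actual check: resolvePluginPath is a hypothetical name, it handles POSIX-style paths only, and it does not account for symlinks.

```typescript
// Hypothetical containment check for plugin paths (POSIX-style, symlinks ignored).
function resolvePluginPath(playbookDir: string, pluginPath: string): string {
  if (pluginPath.startsWith("/")) {
    throw new Error(`plugin path must be relative: ${pluginPath}`);
  }
  const parts: string[] = [];
  for (const segment of pluginPath.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      // Climbing past the playbook directory is rejected.
      if (parts.length === 0) {
        throw new Error(`plugin path escapes the playbook directory: ${pluginPath}`);
      }
      parts.pop();
    } else {
      parts.push(segment);
    }
  }
  // Drop trailing slashes on the directory before joining.
  return [playbookDir.replace(/\/+$/, ""), ...parts].join("/");
}
```

Since plugins run with your user privileges, a check like this keeps a playbook from quietly loading code that lives outside the directory you reviewed.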
TypeScript playbooks
Playbooks can be TypeScript modules (.ts or .mts). They load at runtime with no build step, and every behavior slot accepts a plain function, so custom logic needs no plugin or hook registration:
definePlaybook comes from @runtypelabs/sdk (install it as a devDependency for editor types). It is optional: a plain object export with the same shape works without the package installed. To register named hooks instead, export a factory: export default ({ registerWorkflowHook }) => ({ ... }).