Tools Spec¶

Tool interaction is modeled as two strict types:

ToolCall — what the assistant asks to run
ToolResult — what the runtime reports back

Plus one orthogonal classifier — ToolMode — that the harness uses to decide whether a call needs explicit user consent.

ToolCall¶

Field	Type	Required	Notes
`id`	`string`	yes	Unique within a chat (use cuid2 or similar)
`name`	`string`	yes	Tool name registered with the runtime
`arguments`	`Record<string, unknown>`	yes	LLM-provided JSON args (validated by tool's schema)

additionalProperties is disabled so tool envelopes stay deterministic across versions. New per-call metadata should go through the harness's TraceSpan.attrs, not into ToolCall.

ToolResult¶

Field	Type	Required	Notes
`success`	`boolean`	yes	Hard distinction — `false` flips status to error
`terminal`	`boolean`	no	Explicitly mark the result as terminal
`needsFollowup`	`boolean`	no	Even on `success: false`, re-prompt the LLM
`nextAction`	`string`	no	Machine-readable hint for the next operation
`message`	`string`	no	User-facing text (rendered in the bubble)
`error`	`string`	no	Debug-friendly error string (logged + shown)
`data`	`Record<string, unknown>`	no	Arbitrary structured payload

additionalProperties is enabled for forward compatibility.

ToolMode (harness classifier)¶

The harness's decide_tool_mode(name) returns one of:

Mode	Meaning	Default UI treatment
`read`	Pure inspection (no side effects)	Auto-run, no consent
`safe_write`	Bounded mutation (e.g. update_event)	Auto-run with diff preview
`destructive`	Irreversible (delete_, drop_, …)	Auto-run, undo affordance
`local`	Touches the user's machine	Requires consent
`external`	Calls outside services	Auto-run, log

Pattern rules (TypeScript regex equivalents in @steerable/agent-ui/useToolCallStatus):

^get_  | ^list_  | ^read_  | ^search_   →  read
^create_ | ^update_ | ^add_ | ^set_     →  safe_write
^delete_ | ^remove_ | ^archive_ | ^drop_ →  destructive
^local_ | ^shell_ | ^exec_              →  local

You can override the inferred mode at registration time via the @tool decorator's mode= kwarg.

Completion semantics¶

isTerminalResult(result) (TS) / is_terminal_result(result.model_dump()) (Py) treats a result as terminal when:

terminal == true, or
success == false and needsFollowup != true

Use needsFollowup=True on a failure to ask the LLM to self-heal (write a different argument, try a different tool, etc.). Without it, a failed call ends the run.

Example pair¶

// ToolCall
{"id":"c_42","name":"create_event","arguments":{"title":"Lunch","start":"2026-05-15T12:00:00Z"}}

// ToolResult (success)
{"success":true,"message":"Event created.","data":{"eventId":"e_777"}}

// ToolResult (recoverable failure)
{"success":false,"needsFollowup":true,"error":"Invalid date format","message":"Please retry with ISO-8601."}

// ToolResult (terminal failure)
{"success":false,"terminal":true,"error":"Calendar service unavailable"}