phi-core — System Architecture
1. Component Map
Agent trait + BasicAgent (src/agents/)
Responsibility: Agent (trait, agents/agent.rs) defines the runtime interface — prompting, state access, control, and steering queues. BasicAgent (struct, agents/basic_agent.rs) is the default in-memory implementation: owns the conversation, tools, and ModelConfig (provider identity), and is the application-facing entry point. Construction: BasicAgent::new(ModelConfig::anthropic(...)). The optional provider_override field bypasses ProviderRegistry for custom or test providers. SubAgentTool (agents/sub_agent.rs) implements AgentTool to delegate tasks to a child agent_loop().
Public interface:
prompt(text)— Send a text prompt; returns an event stream receiver.prompt_messages(messages)— Send one or more messages as a prompt; returns an event stream receiver.prompt_with_sender(text, tx)— Send a text prompt, streaming events to a caller-provided sender.continue_loop()— Resume from existing context withContinuationKind::Default; returns an event stream receiver.continue_loop_with_sender(tx, kind)— Resume with an explicitContinuationKind(Default,Rerun { tag }, orBranch { tag }), streaming events to a caller-provided sender.steer(msg)— Queue a message that will be injected mid-run between tool executions.follow_up(msg)— Queue a message to be processed after the agent would otherwise stop.abort()— Cancel the in-progress run by signalling the cancellation token.reset()— Clear messages, queues, streaming state, and cancel token to return the agent to its initial state.save_messages()— Serialize the current conversation to a JSON string.restore_messages(json)— Replace the current conversation with messages deserialized from a JSON string.with_skills(skill_set)— Load skills and append their XML index to the system prompt per the AgentSkills standard.with_mcp_server_stdio(cmd, args, env)— Connect to an MCP server by spawning a child process and add its tools to the agent.with_mcp_server_http(url)— Connect to an MCP server via HTTP and add its tools to the agent.with_openapi_file(path, config, filter)— Load tools from an OpenAPI spec file and add them to the agent.with_openapi_url(url, config, filter)— Fetch an OpenAPI spec from a URL and add its tools to the agent.with_openapi_spec(spec_str, config, filter)— Parse an OpenAPI spec string (JSON or YAML) and add its tools to the agent.new_session()— Immediately rotate to a newsession_id; resets loop counters andlast_loop_id; returns the new session id.check_and_rotate(threshold)— Rotate to a new session if the agent has been idle longer thanthresholdsince the lastprompt_*call; returnsSome(new_session_id)on rotation,Noneotherwise.
BasicAgent state relevant to session management:
| Field | Type | Description |
|---|---|---|
agent_id | String | Stable identifier across all sessions for this instance |
session_id | String | Current session identifier; updated by new_session() |
loop_counters | HashMap<String, usize> | Per-config loop counter; cleared on new_session() |
last_loop_id | Option<String> | Most recent loop; cleared on new_session() |
last_active_at | Option<DateTime<Utc>> | Timestamp of last prompt_* call; used by check_and_rotate() |
AgentLoop (src/agent_loop/)
Responsibility: The core execution engine. Manages the turn loop, tool dispatch, steering injection, follow-up processing, and lifecycle event emission. Public interface:
agent_loop(prompts, context, config, tx, cancel)— Start an agent run from new prompt messages, applying input filters, emitting lifecycle events, and returning all new messages produced.agent_loop_continue(context, config, tx, cancel)— Resume from existing context (no new prompts); used for retries after errors or mid-conversation continuation.agent_loop_parallel(prompts, base_context, configs, strategy, tx, cancel) -> ParallelLoopResult— Run NAgentLoopConfigs concurrently and evaluate results viaEvaluationStrategy. Whenpromptsis non-empty, each branch usesagent_loop; whenpromptsis empty, each branch usesagent_loop_continue(the user query is already the last message inbase_context).base_contextis cloned per branch (toolsArc-shared; message history deep-copied). All branches share the samesession_id; each gets a distinctloop_id.ParallelLoopOutcome.original_context_lenmarks the base/branch message boundary. EmitsParallelLoopStart/ParallelLoopEndevents.selected_contextfeeds intoagent_loop_continue()for normal session resumption.derive_config_segment(config) -> String(pub crate) — Derives the stable{config_segment}portion of aloop_idfromconfig.config_idor provider/model/thinking fields.
EvaluationLoop (src/agent_loop/evaluation.rs)
Responsibility: Pluggable strategy for selecting among parallel loop outcomes. Decoupled from src/agent_loop/ to allow custom implementations without a circular dependency (trait is defined in src/types/; implementations live here).
Public interface:
EvaluationStrategy(trait, defined insrc/types/) —evaluate(prompts, outcomes, tx, cancel) -> (EvaluationDecision, Usage)EvaluationDecision(enum, defined insrc/types/) —Select(usize)— 0-based index of the winning outcome.ParallelLoopOutcome.original_context_len: usize— Number of messages in the cloned context at dispatch time. Allows strategies to split "original context" from "new branch output" messages without separate bookkeeping. Identical across all outcomes (same base context);outcomes[0]is the idiomatic source.TransparentEvaluation— Single-branch pass-through; panics if> 1outcome.PickFirstEvaluation— Always selects index 0. Useful for testing.TokenEfficientEvaluation— Selects the outcome with the lowest total token usage.ElaborateEvaluation— Selects the outcome with the highest total token usage.LlmJudgeEvaluation { judge_config, system_prompt }— Runs a separate LLM call to select the best branch. Supports bothagent_loopmode (query fromprompts) andagent_loop_continuemode (query extracted from lastMessage::Userincontext.messages[..original_context_len]). Includes prior conversation context in the judge prompt. Applies 2-iteration compaction: Iteration 1 compacts only prior context (3 tiers: tail-truncate → paragraph-summary → hard char limit), keeping outputs intact; Iteration 2 (if needed) compacts both context and outputs independently through the same tier pipeline. Budget derived fromjudge_config.context_config.max_context_tokens. Emits aProgressMessagewarning if comprehension criteria cannot be satisfied after iteration 2.
ContextManager (src/context/)
Responsibility: Token estimation, tiered context compaction, and execution limit tracking. Public interface:
estimate_tokens(text)— Rough token count heuristic: ~4 characters per token.compact_messages(messages, config)— Reduce message list to fit token budget using a tiered strategy: truncate tool outputs → summarize old turns → drop middle messages.CompactionStrategy(trait) — Interface for custom compaction logic; default implementation uses the tiered cascade (legacycompact_messages(); modern: CompactionBlock overlays).ContextTracker— Tracks context window usage by combining provider-reported token counts with local estimates for recent messages.ExecutionTracker— Tracks turns, cumulative tokens, and elapsed time against configured limits; signals when any limit is exceeded.ContextConfig— Tuning knobs for compaction: token budget, system-prompt overhead, head/tail message preservation counts, per-tool-output line limit.ExecutionLimits— Hard caps on agent execution: max turns, max total tokens, max wall-clock duration.
ProviderRegistry (src/provider/registry.rs, src/provider/mod.rs)
Responsibility: Dispatches StreamConfig to the correct provider implementation based on model_config.api: ApiProtocol. Built inline per agent_loop() call; zero allocation for a registry with all built-in providers pre-registered.
Public interface:
ProviderRegistry::default()— Pre-registers all 7 built-in providers; used automatically byagent_loop()whenAgentLoopConfig.provider_overrideisNone.ProviderRegistry::new()— Create an empty registry for custom provider sets.- Provider resolution:
model_config.apiselects the wire-protocol handler;model_configfields (id,api_key,base_url,compat, etc.) differentiate services within the same protocol.
StreamProvider implementations (src/provider/)
Responsibility: Translate the unified StreamConfig into provider-specific HTTP requests and parse streaming responses back into StreamEvents.
Providers: AnthropicProvider, OpenAiCompatProvider (15+ backends), OpenAiResponsesProvider, AzureOpenAiProvider, GoogleProvider, GoogleVertexProvider, BedrockProvider, MockProvider.
Public interface:
StreamProvider::stream(config, tx, cancel) -> Result<Message, ProviderError>— stream a single LLM response.StreamProvider::provider_id() -> &str— stable lowercase identifier for this provider (e.g."anthropic","openai","google","bedrock"). Used as the first segment of the auto-derivedconfig_idinloop_idconstruction.
ToolSystem (src/tools/)
Responsibility: Built-in tool implementations. Each implements AgentTool.
Tools: BashTool (shell execution), ReadFileTool (text + image files), WriteFileTool (create/overwrite), EditFileTool (surgical search/replace), ListFilesTool (directory listing), SearchTool (grep/ripgrep).
Public interface:
default_tools()— Returns the standard built-in toolset: bash, read-file, write-file, edit-file, list-files, search.AgentTool::name()— Unique tool identifier used in LLM tool-use calls and event correlation.AgentTool::label()— Human-readable display name for UI.AgentTool::description()— Free-text description sent to the LLM to explain when to use the tool.AgentTool::parameters_schema()— JSON Schema object describing the tool's accepted parameters.AgentTool::execute(params, ctx)— Run the tool with resolved parameters and a context carrying the cancellation token and progress callbacks.
SubAgentTool (src/agents/sub_agent.rs)
Responsibility: Implements AgentTool to delegate tasks to a child agent_loop() with isolated context, its own toolset, and a turn limit. The child gets its own agent_id, session_id, and loop_id; its parent_loop_id is linked back to the calling loop via with_parent_loop_id.
Public interface:
SubAgentTool::new(name, model_config).with_*(...)— Construct a sub-agent tool with its ownModelConfig(provider identity), system prompt, toolset, and turn limit, then register it as anAgentTool.SubAgentTool::with_provider_override(provider)— BypassProviderRegistrydispatch; used in tests to injectMockProvider.SubAgentTool::with_parent_loop_id(loop_id)— Supply the parent loop'sloop_idso the childAgentStartevent carriesparent_loop_id, enabling ancestry tracing across the event stream.
SkillSystem (src/context/skills.rs)
Responsibility: Loads SKILL.md files from one or more directories, parses YAML frontmatter, and formats them as an XML index injected into the system prompt.
Public interface:
SkillSet::load(dirs)— Load skills from multiple directories; later entries override earlier ones on name conflict.SkillSet::load_dir(dir, source)— Load skills from a single directory, tagging each with a source label.SkillSet::merge(other)— Merge anotherSkillSetin; the other's skills override on name conflict.SkillSet::format_for_prompt()— Render the skill list as an<available_skills>XML block ready for system-prompt injection.
McpClient (src/mcp/)
Responsibility: MCP client that connects to external tool servers over stdio or HTTP. Adapts discovered tools into AgentTool instances.
Public interface:
McpClient::connect_stdio(cmd, args, env)— Spawn a child process, complete the JSON-RPC initialize handshake, and return a connected client.McpClient::connect_http(url)— Connect to an HTTP-based MCP server and complete the initialize handshake.McpToolAdapter::from_client(client)— Query the server for available tools and return oneAgentTooladapter per tool.
OpenApiAdapter (src/openapi/, feature-gated)
Responsibility: Parses OpenAPI 3.x specs and generates one AgentTool per operation. Each tool makes an HTTP request to the spec's base URL.
Public interface:
from_file(path, config, filter)— Parse an OpenAPI spec from a local file and return one tool adapter per matching operation.from_url(url, config, filter)— Fetch an OpenAPI spec over HTTP and return one tool adapter per matching operation.from_str(spec, config, filter)— Parse an OpenAPI spec from an in-memory string (auto-detects JSON vs YAML) and return one tool adapter per matching operation. Availability: Only compiled when theopenapifeature flag is enabled.
SessionStore (src/session/)
Responsibility: Persistent session layer. Records every AgentEvent into a structured tree of Session + LoopRecord objects, and provides both free-function and trait-based APIs for flat JSON-file persistence.
Public interface:
SessionRecorder::new(config)— Create a recorder; callon_event(event)for every event on the agent'stxchannel.SessionRecorder::flush()— Finalize all open loops (status →Aborted) and move them into their sessions.SessionRecorder::drain_completed()— Consume and return all completed sessions.SessionRecorder::sessions()— Iterate all known sessions (completed + in-progress).SessionRecorder::get_session(id)— Look up a session bysession_id.SessionRecorder::current_loop(id)— Look up an in-progressLoopRecordbyloop_id.save_session(session, dir)— Write{dir}/{session_id}.json(createsdirif needed). Atomic via tmp-file + rename.load_session(session_id, dir)— Read{dir}/{session_id}.json.list_session_ids(dir)— List all session ids indir, newest first.load_sessions_for_agent(agent_id, dir)— Load all sessions matchingagent_id.delete_session(session_id, dir)— Remove{dir}/{session_id}.json.SessionStoretrait — asyncsave/load/list_ids/delete/list_for_agentfor callers that want a pluggable store (custom backends, mocks). (Added 0.7.0)FileSystemSessionStore::new(dir)— In-tree async impl ofSessionStore. Adds advisoryfs2exclusive lock on save (returnsSessionError::Lockedif a concurrent writer holds it). (Added 0.7.0) File format: Pretty-printed JSON. Flat directory — one file per session, no index. Writes are atomic (tmp + rename) regardless of API surface used.
RetryEngine (src/provider/retry.rs)
Responsibility: Computes exponential-backoff delay with ±20% jitter. Classifies which errors are retryable. Public interface:
RetryConfig— Parameters for automatic retry: initial delay, backoff multiplier, max delay, max attempt count.RetryConfig::delay_for_attempt(attempt)— Compute the sleep duration before attempt N using exponential backoff with ±20% jitter.is_retryable()(onProviderError) — Returns true only forRateLimitedandNetworkvariants; all other errors fail immediately.retry_after()(onProviderError) — Extracts the server-specified retry delay from aRateLimited { retry_after_ms: Some(...) }error, if present.
2. Dependency Graph
graph TD
App["Application Code"] --> Agent
Agent --> AgentLoop["AgentLoop\nagent_loop/"]
AgentLoop --> ContextManager["ContextManager\ncontext/"]
AgentLoop --> ProviderRegistry["Provider\ntraits.rs / registry.rs"]
AgentLoop --> ToolSystem["ToolSystem\ntools/"]
AgentLoop --> RetryEngine["RetryEngine\nprovider/retry.rs"]
ProviderRegistry --> Anthropic["AnthropicProvider"]
ProviderRegistry --> OpenAI["OpenAiCompatProvider\n(15+ backends)"]
ProviderRegistry --> OpenAIResp["OpenAiResponsesProvider"]
ProviderRegistry --> Azure["AzureOpenAiProvider"]
ProviderRegistry --> Google["GoogleProvider"]
ProviderRegistry --> Vertex["GoogleVertexProvider"]
ProviderRegistry --> Bedrock["BedrockProvider"]
ProviderRegistry --> Mock["MockProvider\n(tests)"]
Agent --> SkillSystem["SkillSystem\ncontext/skills.rs"]
Agent --> McpClient["McpClient\nmcp/"]
Agent --> OpenApiAdapter["OpenApiAdapter\nopenapi/ (feature)"]
McpClient --> ToolSystem
OpenApiAdapter --> ToolSystem
SubAgent["SubAgentTool\nsub_agent.rs"] --> AgentLoop
ToolSystem --> SubAgent
Types["types/\n(shared types)"] --> Agent
Types --> AgentLoop
Types --> ToolSystem
Types --> ProviderRegistry
SessionStore["SessionStore\nsession/"] --> Types
App --> SessionStore
3. Data Flow
3.1 Simple Text Prompt (no tool calls)
sequenceDiagram
participant App
participant Agent
participant AgentLoop
participant Provider
participant EventCh as EventChannel
App->>Agent: prompt("What is 2+2?")
Agent->>AgentLoop: agent_loop(prompts, context, config, tx, cancel)
AgentLoop->>EventCh: AgentStart
AgentLoop->>EventCh: TurnStart
AgentLoop->>EventCh: MessageStart (user)
AgentLoop->>EventCh: MessageEnd (user)
AgentLoop->>Provider: stream(StreamConfig)
Provider-->>EventCh: StreamEvent::Start
Provider-->>EventCh: StreamEvent::TextDelta x N
Provider-->>EventCh: StreamEvent::Done(Message)
AgentLoop->>EventCh: MessageStart (assistant placeholder)
AgentLoop->>EventCh: MessageUpdate x N (deltas)
AgentLoop->>EventCh: MessageEnd (assistant final)
AgentLoop->>EventCh: TurnEnd
AgentLoop->>EventCh: AgentEnd(messages)
App->>EventCh: receives events via rx.recv()
3.2 Tool Call Cycle
sequenceDiagram
participant AgentLoop
participant Provider
participant BashTool
participant EventCh as EventChannel
AgentLoop->>Provider: stream(config with tool defs)
Provider-->>AgentLoop: Done(Message{stop_reason: ToolUse, content: [ToolCall{...}]})
AgentLoop->>EventCh: TurnEnd(assistant message)
AgentLoop->>AgentLoop: extract tool_calls from assistant content
AgentLoop->>EventCh: ToolExecutionStart(id, name, args)
AgentLoop->>BashTool: execute(params, ToolContext)
BashTool-->>EventCh: ProgressMessage (via on_progress callback)
BashTool-->>AgentLoop: Ok(ToolResult)
AgentLoop->>EventCh: ToolExecutionEnd(id, name, result, is_error=false)
AgentLoop->>EventCh: MessageStart(ToolResult message)
AgentLoop->>EventCh: MessageEnd(ToolResult message)
AgentLoop->>AgentLoop: append tool results to context.messages
AgentLoop->>Provider: stream(config, now includes tool results)
Provider-->>AgentLoop: Done(Message{stop_reason: Stop})
AgentLoop->>EventCh: TurnEnd
AgentLoop->>EventCh: AgentEnd
3.3 Context Compaction Trigger
sequenceDiagram
participant AgentLoop
participant ContextManager
participant Provider
AgentLoop->>ContextManager: compact(messages, config)
ContextManager->>ContextManager: total_tokens(messages) > budget?
alt Level 1 fits
ContextManager-->>AgentLoop: truncated tool outputs
else Level 2 fits
ContextManager-->>AgentLoop: old turns summarized
else Level 3
ContextManager-->>AgentLoop: first + recent kept, middle dropped
end
AgentLoop->>Provider: stream(config with compacted messages)
3.4 Sub-Agent Delegation
sequenceDiagram
participant ParentLoop as Parent AgentLoop
participant SubAgentTool
participant ChildLoop as Child AgentLoop
participant ChildProvider as Provider
ParentLoop->>SubAgentTool: execute({task: "..."}, ToolContext)
SubAgentTool->>SubAgentTool: build AgentContext with child identity<br/>(new agent_id, session_id, loop_id="{child_session}.sub.1",<br/>parent_loop_id = parent's loop_id)
SubAgentTool->>ChildLoop: agent_loop([task_prompt], context, config, tx, cancel)
ChildLoop->>ChildLoop: emit AgentStart{loop_id, parent_loop_id}
ChildLoop->>ChildProvider: stream(...)
ChildProvider-->>ChildLoop: streaming events
ChildLoop-->>SubAgentTool: Vec<AgentMessage> (final messages)
SubAgentTool->>SubAgentTool: extract_final_text(messages)
SubAgentTool-->>ParentLoop: Ok(ToolResult{text, child_loop_id: Some(loop_id)})
Note over ParentLoop: ToolExecutionEnd{child_loop_id} emitted<br/>→ parent stream records child ancestry
4. Data Models
Content
Entity: Content (enum)
Variant Text:
text: String [the text content]
Variant Image:
data: String [base64-encoded binary]
mime_type: String [e.g. "image/png", "image/jpeg"]
Variant Thinking:
thinking: String [internal reasoning text]
signature: Option<String> [provider-specific thinking signature, optional]
Variant ToolCall:
id: String [unique call ID, e.g. UUID]
name: String [tool name matching AgentTool::name()]
arguments: JSON [parameter values matching tool's JSON Schema]
Serialization: tagged by "type" field ("text", "image", "thinking", "toolCall")
Message
Entity: Message (enum)
Variant User:
content: Vec<Content> [usually a single Text block]
timestamp: u64 [unix milliseconds]
Variant Assistant:
content: Vec<Content> [text, thinking, tool call blocks]
stop_reason: StopReason [why the model stopped]
model: String [model ID returned by provider]
provider: String [provider name, e.g. "anthropic"]
usage: Usage [token counts for this turn]
timestamp: u64 [unix milliseconds]
error_message: Option<String> [set when stop_reason == Error]
Variant ToolResult:
tool_call_id: String [matches Content::ToolCall.id]
tool_name: String [matches Content::ToolCall.name]
content: Vec<Content> [tool output, usually a Text block]
is_error: bool [true if tool execution failed]
timestamp: u64 [unix milliseconds]
Lifecycle: User messages are created by the caller. Assistant messages are
created by the provider after streaming completes. ToolResult messages
are created by the agent loop after tool execution.
AgentMessage
Entity: AgentMessage (enum, untagged)
Variant Llm(LlmMessage) [sent to the LLM; user/assistant/toolResult roles; LlmMessage wraps Message + Option<TurnId>]
Variant Extension(ExtensionMessage) [not sent to LLM; app-only metadata]
Note: stored in Agent.messages and AgentContext.messages
Extension messages are filtered out before LLM calls
ExtensionMessage
Entity: ExtensionMessage
role: String [always "extension"]
kind: String [app-defined event type, e.g. "ui_update"]
data: JSON [arbitrary app-defined payload]
StopReason
Entity: StopReason (enum)
Stop -> model completed naturally
Length -> max_tokens limit hit
ToolUse -> model returned tool calls (loop must continue)
Error -> provider or streaming error occurred
Aborted -> cancellation token was triggered
Serialization: camelCase ("stop", "length", "toolUse", "error", "aborted")
Usage
Entity: Usage
input: u64 [prompt tokens processed]
output: u64 [completion tokens generated]
cache_read: u64 [tokens served from prompt cache]
cache_write: u64 [tokens written to prompt cache]
total_tokens: u64 [sum, may be 0 if not reported]
Derived: cache_hit_rate() = cache_read / (input + cache_read + cache_write)
AgentEvent
Entity: AgentEvent (enum, #[serde(tag = "type")])
Every variant except AgentStart, ParallelLoopStart, and ParallelLoopEnd now carries
loop_id: String so that events from concurrent parallel branches can be reliably
attributed to the correct LoopRecord even when they are interleaved on one tx channel.
AgentStart {
agent_id: String [stable agent instance identifier]
session_id: String [groups all loops in one session]
loop_id: String ["{session_id}.{config_id}.{N}" — unique per call]
parent_loop_id: Option<String> [None for origin calls; Some for continuations/sub-agents]
continuation_kind: Option<ContinuationKind> [None=origin; Some(Default/Rerun/Branch)=continuation]
timestamp: DateTime<Utc>
metadata: Option<JSON>
}
AgentEnd {
loop_id: String [← identifies the loop]
messages: Vec<AgentMessage> [all new messages produced by this loop]
usage: Usage
timestamp: DateTime<Utc>
rejection: Option<String> [Some if input filter blocked the run]
}
TurnStart {
loop_id: String
turn_index: u32
timestamp: DateTime<Utc>
triggered_by: TurnTrigger [what caused this turn to begin]
}
TurnEnd {
loop_id: String
message: AgentMessage
usage: Usage
timestamp: DateTime<Utc>
tool_results: Vec<Message>
}
MessageStart { loop_id: String, message } [message streaming began]
MessageUpdate { loop_id: String, message, delta } [content delta arrived]
MessageEnd { loop_id: String, message } [message complete]
ToolExecutionStart { loop_id: String, tool_call_id, tool_name, args }
ToolExecutionUpdate { loop_id: String, tool_call_id, tool_name, partial_result }
ToolExecutionEnd {
loop_id: String
tool_call_id: String
tool_name: String
result: ToolResult
is_error: bool
child_loop_id: Option<String> [Some only when tool spawned a sub-agent loop]
}
ProgressMessage { loop_id: String, tool_call_id, tool_name, text }
InputRejected { loop_id: String, reason } [input filter blocked the prompt]
ParallelLoopStart { [loop_id NOT on this variant]
session_id: String
loop_ids: Vec<String> [one loop_id per branch, in config order]
timestamp: DateTime<Utc>
}
ParallelLoopEnd { [loop_id NOT on this variant]
session_id: String
selected_loop_id: String
selected_config_index: usize
evaluation_usage: Usage
timestamp: DateTime<Utc>
}
StreamDelta
Entity: StreamDelta (enum)
Text { delta: String } [text content chunk]
Thinking { delta: String } [thinking content chunk]
ToolCallDelta { delta: String } [tool call argument chunk]
ToolContext
Entity: ToolContext
tool_call_id: String [for correlation with AgentEvent]
tool_name: String [for correlation with AgentEvent]
cancel: CancellationToken [check is_cancelled() in long-running tools]
on_update: Option<ToolUpdateFn> [callback for streaming partial ToolResults]
on_progress: Option<ProgressFn> [callback for user-facing status text]
ContinuationKind
Entity: ContinuationKind (enum)
Default [unspecified continuation — preserves legacy semantics]
Rerun { tag: String } [retry from an equivalent context; tag is RFC 3339 UTC timestamp]
Branch { tag: String } [explore a different path from a branching point; tag is RFC 3339 UTC timestamp]
Set on AgentContext.continuation_kind before calling agent_loop_continue().
Surfaced in AgentStart.continuation_kind (None = origin call).
TurnTrigger semantics:
Default / Rerun → first turn uses TurnTrigger::Continuation
Branch → first turn uses TurnTrigger::Branch
TurnTrigger
Entity: TurnTrigger (enum)
User [first turn of an agent_loop() origin call with new user prompts]
SubAgent [first turn when running as a sub-agent via SubAgentTool]
Continuation [subsequent turns; tool round-trip, steering, or Default/Rerun continuation]
Branch [first turn of an agent_loop_continue(Branch) call]
Emitted in TurnStart.triggered_by.
Priority on first turn (run_loop):
1. Branch continuation → TurnTrigger::Branch
2. Any other continuation → TurnTrigger::Continuation
3. Origin call → config.first_turn_trigger (User or SubAgent)
Subsequent turns always use TurnTrigger::Continuation.
ToolResult / ToolError
Entity: ToolResult
content: Vec<Content> [tool output content blocks]
details: JSON [structured metadata, not sent to LLM, e.g. exit_code]
child_loop_id: Option<String> [set by sub-agent tools; None for all other tools]
Entity: ToolError (enum)
Failed(String) [general execution failure]
NotFound(String) [tool name not in registry]
InvalidArgs(String) [parameter validation failed]
Cancelled [CancellationToken was triggered]
ContextConfig
Entity: ContextConfig
max_context_tokens: usize [default: 100,000; total budget including system prompt]
system_prompt_tokens: usize [default: 4,000; reserved for system prompt]
keep_recent: usize [default: 10; messages always kept in full at tail]
keep_first: usize [default: 2; messages always kept at head]
tool_output_max_lines: usize [default: 50; L1 compaction per-tool-output limit]
Effective budget = max_context_tokens - system_prompt_tokens
ExecutionLimits / ExecutionTracker
Entity: ExecutionLimits
max_turns: usize [default: 50; LLM calls before forced stop]
max_total_tokens: usize [default: 1,000,000; cumulative token budget]
max_duration: Duration [default: 600s; wall-clock time limit]
Entity: ExecutionTracker (runtime state)
limits: ExecutionLimits [immutable config]
turns: usize [incremented after each LLM call]
tokens_used: usize [cumulative; updated from provider Usage]
started_at: Instant [set on construction]
RetryConfig
Entity: RetryConfig
max_retries: usize [default: 3; 0 = no retries]
initial_delay_ms: u64 [default: 1,000ms]
backoff_multiplier: f64 [default: 2.0; exponential growth factor]
max_delay_ms: u64 [default: 30,000ms; ceiling before jitter]
CacheConfig / CacheStrategy
Entity: CacheConfig
enabled: bool [master switch; default: true]
strategy: CacheStrategy
Entity: CacheStrategy (enum)
Auto [provider places breakpoints automatically]
Disabled [no caching hints sent]
Manual {
cache_system: bool [cache system prompt]
cache_tools: bool [cache tool definitions]
cache_messages: bool [cache second-to-last message]
}
StreamConfig (sent to provider)
Entity: StreamConfig
model_config: ModelConfig [REQUIRED — full provider identity: id, api_key, base_url, compat, cost]
system_prompt: String
messages: Vec<Message> [LLM-only messages, Extension filtered out]
tools: Vec<ToolDefinition> [schema-only; no execute functions]
thinking_level: ThinkingLevel
max_tokens: Option<u32> [overrides model_config.max_tokens when Some]
temperature: Option<f32>
cache_config: CacheConfig
Note: model identity (id, api_key, base_url, headers, compat) is accessed via
model_config.id, model_config.api_key, etc. No top-level model or api_key fields.
ToolDefinition (sent to LLM)
Entity: ToolDefinition
name: String [matches AgentTool::name()]
description: String [matches AgentTool::description()]
parameters: JSON [JSON Schema object matching AgentTool::parameters_schema()]
Skill / SkillSet
Entity: Skill
name: String [from YAML frontmatter; skill identifier]
description: String [from YAML frontmatter; one-line capability summary]
file_path: PathBuf [absolute path to the SKILL.md file]
base_dir: PathBuf [absolute path to the skill's directory]
source: String [origin label: "dir:0", "dir:1", etc.]
Entity: SkillSet
skills: Vec<Skill>
Lifecycle: Loaded from disk at startup via SkillSet::load(dirs).
Formatted as XML via format_for_prompt() and appended to system prompt.
Agent reads full SKILL.md on-demand when activating a skill via read_file tool.
QueueMode
Entity: QueueMode (enum) — controls steering/follow-up queue delivery
OneAtATime pop and return exactly one message per call (default)
All drain and return all queued messages at once
Used in: Agent.steering_mode, Agent.follow_up_mode
McpToolInfo / McpContent
Entity: McpToolInfo — tool metadata returned by MCP server
name: String [tool identifier used in tools/call]
description: Option<String> [human-readable description; default empty string]
inputSchema: JSON [JSON Schema for the tool's parameters]
Entity: McpContent (enum) — content item in a tool call result
Variant Text:
type: "text"
text: String
Variant Image:
type: "image"
data: String [base64-encoded]
mimeType: String
Entity: McpToolCallResult
content: Vec<McpContent> [output from the tool]
isError: bool [true if the tool reported an error]
OpenApiConfig / OpenApiAuth / OperationFilter
Entity: OpenApiConfig — configuration for OpenAPI tool generation
base_url: Option<String> [overrides spec servers[0].url; trailing slash stripped]
auth: OpenApiAuth [authentication method]
custom_headers: Map<String,String> [extra headers added to every request]
max_response_bytes: usize [default: 65536 (64KB); response body truncation limit]
timeout_secs: u64 [default: 30; per-request timeout]
name_prefix: Option<String> [if set, tool names formatted as "{prefix}__{operationId}"]
Entity: OpenApiAuth (enum)
None [no authentication]
Bearer(token: String) [Authorization: Bearer {token}]
ApiKey { header: String, value: String } [custom header: {header}: {value}]
Note: Bearer token and ApiKey value are redacted as "****" in debug output.
Entity: OperationFilter (enum) — controls which API operations become tools
All [include all operations that have an operationId]
ByOperationId(Vec<String>) [include only operations whose id is in the list]
ByTag(Vec<String>) [include operations tagged with any listed tag]
ByPathPrefix(String) [include operations whose path starts with the prefix]
Session / LoopRecord / SessionRecorder
Entity: Session
session_id: String
agent_id: String
created_at: DateTime<Utc>
last_active_at: DateTime<Utc>
formation: SessionFormation [Explicit | FirstLoop | InactivityTimeout{..}]
parent_spawn_ref: Option<SpawnRef> [set when this session was a sub-agent spawn]
loops: Vec<LoopRecord> [ordered by started_at]
Methods: root_loops(), children_of(loop_id), parallel_siblings(loop_id),
get_loop(loop_id), total_usage()
Entity: LoopRecord
loop_id: String
session_id: String
agent_id: String
parent_loop_id: Option<String>
continuation_kind: Option<ContinuationKind>
started_at: DateTime<Utc>
ended_at: Option<DateTime<Utc>>
status: LoopStatus [Pending | Running | Completed | Rejected | Aborted]
rejection: Option<String>
config: Option<LoopConfigSnapshot> [model, provider, config_id + name, api, base_url, reasoning, context_window, max_tokens, thinking_level, temperature]
messages: Vec<AgentMessage> [from AgentEnd.messages — authoritative]
usage: Usage
metadata: Option<JSON>
events: Vec<LoopEvent> [full event stream; MessageUpdate opt-in]
children_loop_ids: Vec<String> [same-session direct children]
child_loop_refs: Vec<ChildLoopRef> [cross-session sub-agent spawn links]
parallel_group: Option<ParallelGroupRecord>
Entity: ChildLoopRef — outbound cross-session link on the parent LoopRecord
tool_call_id: String
tool_name: String
child_loop_id: String
child_session_id: String
Entity: SpawnRef — inbound cross-session link on the child Session
parent_session_id: String
parent_loop_id: String
tool_call_id: String
tool_name: String
Entity: ParallelGroupRecord
all_loop_ids: Vec<String> [all branch loop_ids in config order]
selected_loop_id: String
selected_config_index: usize
evaluation_usage: Usage
is_selected: bool [true only on the winner's LoopRecord]
Entity: SessionRecorderConfig
formation_policy: SessionFormationPolicy [PerSessionId | InactivityTimeout{secs}]
include_streaming_events: bool [default: false — excludes MessageUpdate]
5. Integration Contracts
Anthropic Messages API
- Endpoint:
https://api.anthropic.com/v1/messages - Auth (standard):
x-api-key: {ANTHROPIC_API_KEY}+anthropic-version: 2023-06-01 - Auth (OAuth):
authorization: Bearer {TOKEN}+ beta headersclaude-code-20250219,oauth-2025-04-20,fine-grained-tool-streaming-2025-05-14;x-app: cli;anthropic-dangerous-direct-browser-access: true;user-agent: claude-cli/2.1.2 - Request: POST JSON with
model,system(array of text blocks),messages,tools,max_tokens(default 8192),stream: true - Response: Server-Sent Events stream; events:
message_start,content_block_start,content_block_delta,message_delta,message_stop - Tool args: Streamed as
InputJsonDeltatext fragments; buffered inarguments["__partial_json"]; parsed as complete JSON oncontent_block_stop - Thinking:
ThinkingLevelmapped to{type:"enabled", budget_tokens: N}— Minimal→128, Low→512, Medium→2048, High→8192 - Prompt caching:
cache_control: {type: "ephemeral"}placed at system/last-tool-def/second-to-last-message perCacheStrategy - Content format:
{type: "text"|"image"|"thinking"|"tool_use"|"tool_result", ...} - Tool results: Role "user", type "tool_result", fields:
tool_use_id,content,is_error
OpenAI-Compatible APIs (Chat Completions)
- Endpoints:
https://api.openai.com/v1/chat/completionsand 14+ compatible bases (xAI/Grok, Groq, Cerebras, Mistral, DeepSeek, etc.) - Auth:
Authorization: Bearer {API_KEY} - Request: POST JSON with
model,messages,tools,stream: true,stream_options: {include_usage: true} - max_tokens field name:
"max_tokens"(most) or"max_completion_tokens"(OpenAI) — controlled byOpenAiCompat.max_tokens_field - System prompt: First message with role
"system"or"developer"(OpenAI) — controlled bysupports_developer_role - Thinking:
reasoning_effort: "low"|"medium"|"high"ifsupports_reasoning_effort; response indelta.reasoning_content(OpenAI) ordelta.reasoning(xAI) - Response: SSE stream; each chunk has
choices[0].delta; tool args indelta.tool_calls[].function.arguments(incremental JSON string)
OpenAI Responses API
- Endpoint:
{base_url}/responses - Auth:
Authorization: Bearer {OPENAI_API_KEY} - System prompt:
"instructions"field (not"messages") - Message format: Different from Chat Completions — see Bedrock/Responses comparison below
- Thinking:
"reasoning": {effort: "low"|"medium"|"high"}field - SSE events:
response.output_text.delta,response.reasoning.delta,response.function_call_arguments.start/delta/done,response.completed
Azure OpenAI
- Endpoint:
{base_url}/responses?api-version=2025-01-01-preview(base_url pattern:https://{resource}.openai.azure.com/openai/deployments/{deployment}) - Auth:
api-key: {AZURE_OPENAI_API_KEY}header (notAuthorization: Bearer) - Request/Response: Same format as OpenAI Responses API
Google Generative AI (Gemini)
- Endpoint:
{base_url}/v1beta/models/{model}:streamGenerateContent?alt=sse&key={API_KEY} - Auth: API key as URL query parameter
?key=; no Authorization header - System prompt:
"systemInstruction": {parts: [{text: "..."}]} - Tools: Single object
{functionDeclarations: [...]}wrapping all tool definitions - Contents: Role "user" or "model"; ToolResults sent as
{role:"user", parts:[{functionResponse:{name, response:{result: text}}}]} - Tool args: Delivered complete in one event (no streaming deltas); tool IDs auto-generated as
"google-fc-{index}" - Response parsing: Custom SSE parser (not standard library); splits on
\n\n, extractsdata:line
Google Vertex AI
- Endpoint:
https://{region}-aiplatform.googleapis.com/v1/projects/{project}/locations/{region}/publishers/google/models/{model}:streamGenerateContent?alt=sse - Auth:
Authorization: Bearer {OAUTH_TOKEN}(OAuth2, not API key in URL) - Request/Response: Identical to Google Generative AI; tool IDs generated as
"vertex-fc-{index}"
Amazon Bedrock (ConverseStream)
- Endpoint:
{base_url}/model/{model}/converse-stream(base_url:https://bedrock-runtime.{region}.amazonaws.com) - Auth:
Authorization: Bearer {token}or custom headers frommodel_config.headers; minimal SigV4 support - System prompt:
"system"array:[{text: "..."}] - Tools:
toolConfig.tools:[{toolSpec: {name, description, inputSchema: {json: schema}}}] - Tool results:
{toolResult: {toolUseId, content: [...], status: "success"|"error"}} - Streaming format: Newline-delimited JSON (not standard SSE); events:
contentBlockDelta,contentBlockStart,contentBlockStop,messageStop,metadata
Model Context Protocol (MCP)
-
Protocol: JSON-RPC 2.0
-
Message types:
- Request:
{jsonrpc:"2.0", id:u64, method:String, params:Option<Value>} - Response:
{jsonrpc:"2.0", id:Option<u64>, result:Option<Value>, error:Option<{code:i64,message:String,data?}>} - Request IDs: auto-incremented
AtomicU64starting at 1
- Request:
-
Initialization handshake (3 steps):
- Client sends
initializewith{protocolVersion:"2024-11-05", capabilities:{}, clientInfo:{name:"phi-core",version:"<pkg>"}} - Server responds with
{protocolVersion, capabilities:{tools?,resources?,prompts?}, serverInfo:{name,version}} - Client sends
notifications/initializednotification (no params; server may ignore id)
- Client sends
-
Tool discovery: Client sends
tools/list→ server returns{tools: [{name, description?, inputSchema}]} -
Tool execution: Client sends
tools/call {name, arguments}→ server returns{content:[{type:"text",text}|{type:"image",data,mimeType}], isError:bool} -
Stdio transport: Spawns child process; newline-delimited JSON over stdin/stdout;
tokio::sync::Mutexfor concurrent access; shutdown: EOF on stdin then kill child -
HTTP transport: POST JSON-RPC body to configured URL; stateless (no persistent connection)
-
Tool adapter:
McpToolAdapterwrapsMcpToolInfo+Arc<Mutex<McpClient>>; optionalprefixfor namespace disambiguation ({prefix}__{name}) -
Error enum:
Transport(String),Protocol(String),JsonRpc{code,message},Serialization,Io,ConnectionClosed
OpenAPI
-
Spec formats: OpenAPI 3.x; auto-detected: first non-whitespace char
{or[→ JSON, else YAML -
Sources:
from_file(path)(async read),from_url(url)(HTTP GET via reqwest),from_str(text)(in-memory) -
Base URL resolution:
config.base_url→spec.servers[0].url→ error if neither set; trailing slashes stripped -
Parameter classification:
Pathparameters → URL{param}substitution (RFC 3986 percent-encoding); requiredQueryparameters →.query()chains; optionalHeaderparameters →.header()chains; optionalCookieparameters → skipped (unsupported)RequestBody(application/json only) → keyed as"body"(or"_request_body"on collision); required ifrequestBody.required
-
HTTP execution pipeline (per tool call):
- Validate params is object (or null treated as
{}) - Substitute path params with percent-encoded values; error if any missing
- Build URL:
{base_url}{path} - Chain
.query()for query params present in input - Chain
.header()for header params present in input - Apply auth:
Bearer→.bearer_auth(),ApiKey→.header(header, value),None→ nothing - Apply
custom_headers - If
has_body:.json(params["body"]) - Send request; read full body text; truncate to
max_response_bytesat UTF-8 boundary - Return:
"{METHOD} {URL} → {STATUS_CODE}\n\n{BODY}"
- Validate params is object (or null treated as
-
Operation filter:
OperationFilter::All|ByOperationId|ByTag|ByPathPrefix; operations withoutoperationIdalways skipped with warning -
Tool naming: Default =
operationId; with prefix ={prefix}__{operationId}
File System
- Read:
tokio::fs::read_to_stringfor text (max 1MB),tokio::fs::readfor images (max 20MB) - Write:
tokio::fs::writewith automatic parent dir creation - Edit: Read → string replace (exact match, once) → write
- List: Spawns
findcommand via BashTool - Search: Spawns
greporrgcommand via BashTool
Shell
- Execution:
tokio::process::Command::new("bash").arg("-c").arg(command) - Timeout:
tokio::time::sleepwith default 120s, configurable - Output capture:
stdout+stderrpiped, truncated at 256KB each - Safety: Deny patterns checked before execution (substring match)
- Exit code: Returned in
ToolResult.details.exit_code; tool always returns Ok (non-zero is not a ToolError)
6. State Management
Agent-Level State (in Agent struct)
All fields on Agent:
| Field | Type | Notes |
|---|---|---|
system_prompt | String | Immutable once set; injected into every LLM call |
model | String | Model identifier |
api_key | String | API authentication key |
thinking_level | ThinkingLevel | Off/Minimal/Low/Medium/High |
max_tokens | Option<u32> | Max completion tokens |
temperature | Option<f32> | Sampling temperature |
model_config | Option<ModelConfig> | Provider-specific extras (base_url, headers, compat flags) |
messages | Vec<AgentMessage> | Grows on each prompt() call; reset by reset(); replaced by restore_messages() |
tools | Vec<Box<dyn AgentTool>> | Tool instances (heap-allocated trait objects) |
provider | Box<dyn StreamProvider> | Boxed, not Arc; owned exclusively by Agent |
steering_queue | Arc<Mutex<Vec<AgentMessage>>> | Written by steer(), drained by agent loop before each tool execution check |
follow_up_queue | Arc<Mutex<Vec<AgentMessage>>> | Written by follow_up(), drained when agent loop would stop |
steering_mode | QueueMode | Default: OneAtATime |
follow_up_mode | QueueMode | Default: OneAtATime |
context_config | Option<ContextConfig> | If None, context compaction is disabled |
execution_limits | Option<ExecutionLimits> | If None, no hard limits enforced |
cache_config | CacheConfig | Prompt caching hints (Anthropic) |
tool_execution | ToolExecutionStrategy | Parallel (default), Sequential, or Batched |
retry_config | RetryConfig | Backoff for RateLimited/Network errors |
before_turn | Option<BeforeTurnFn> | Signature: fn(&[AgentMessage], turn_number: usize) -> bool; return false to abort |
after_turn | Option<AfterTurnFn> | Signature: fn(&[AgentMessage], &Usage) |
on_error | Option<OnErrorFn> | Signature: fn(&str) |
input_filters | Vec<Arc<dyn InputFilter>> | Applied in order before LLM call |
| (compaction strategies) | (moved to ContextConfig.compaction) | in_memory_strategy and block_strategy fields on CompactionConfig (G5) |
cancel | Option<CancellationToken> | Created when prompt() starts, consumed by abort() |
is_streaming | bool | Set true on prompt() entry, false on exit |
agent_id | String | UUID v4 generated once at Agent::new(); stable for the Agent's lifetime. Injected into every AgentContext built by this agent. |
session_id | String | UUID v4 generated once at Agent::new(); groups all loops under one session. Stable for the Agent's lifetime. |
loop_counters | HashMap<String, usize> | Per-"{session_id}.{config_id}" monotonic counter; incremented by next_loop_id() to produce the N component of loop_id. |
last_loop_id | Option<String> | loop_id of the most recently started loop; set after each prompt_* or continue_loop_* call. Becomes parent_loop_id on the next continuation. |
before_loop | Option<BeforeLoopFn> | Hook called once before AgentStart. Signature: fn(&[AgentMessage], loop_index: usize) -> bool; return false to abort before AgentStart. |
after_loop | Option<AfterLoopFn> | Hook called once after AgentEnd. Signature: fn(&[AgentMessage], &Usage). |
before_tool_execution | Option<BeforeToolExecutionFn> | Hook called before each ToolExecutionStart. Signature: fn(&str, &str, &JSON) -> bool (tool_name, call_id, args); return false to skip. |
after_tool_execution | Option<AfterToolExecutionFn> | Hook called after each ToolExecutionEnd. Signature: fn(&str, &str, bool) (tool_name, call_id, is_error). |
before_tool_execution_update | Option<BeforeToolExecutionUpdateFn> | Hook called before each ToolExecutionUpdate. Signature: fn(&str, &str, &str) -> bool (tool_name, call_id, text); return false to suppress the event. |
after_tool_execution_update | Option<AfterToolExecutionUpdateFn> | Hook called after each ToolExecutionUpdate (only when not suppressed). Signature: fn(&str, &str, &str). |
Invariants:
assert!(!self.is_streaming)fires ifprompt()is called while already running — callers must usesteer()orfollow_up()during active runscancelis alwaysSomewhileis_streamingis truemessagesmust not end in anAssistantmessage beforeagent_loop_continue()is calledagent_idandsession_idare alwaysSomein anyAgentContextbuilt byAgent; direct callers ofagent_loop_continuemust also set them
AgentContext (per-run, passed into agent loop)
| State Element | Type | Description |
|---|---|---|
system_prompt | String | Immutable for the duration of the run |
messages | Vec<AgentMessage> | Mutated in-place: prompts appended, assistant messages appended, tool results appended; may be replaced by compaction |
tools | &[Box<dyn AgentTool>] | Immutable for the duration of the run |
agent_id | Option<String> | Stable agent instance ID. Set by Agent::prompt_*; also written back by agent_loop when None. Required (non-None) for agent_loop_continue. |
session_id | Option<String> | Stable session ID. Same lifecycle as agent_id. |
loop_id | Option<String> | Per-call identifier of the form "{session_id}.{config_id}.{N}". Set by Agent before calling agent_loop/agent_loop_continue; falls back to UUID if None at loop entry. |
parent_loop_id | Option<String> | loop_id of the loop this call continues from. None for origin calls. Set by Agent::continue_loop_with_sender to Agent.last_loop_id. |
continuation_kind | Option<ContinuationKind> | How this call relates to prior loops. None for origin; Some(Default|Rerun|Branch) for continuations. |
ExecutionTracker (per-run)
| State | Initial | Transitions |
|---|---|---|
turns | 0 | Incremented after each LLM call |
tokens_used | 0 | Incremented by token count of each LLM response |
started_at | Instant::now() | Immutable; compared against max_duration on each check |
Steering/Follow-up Queue Modes
QueueMode::OneAtATime(default for both queues): on each read, lock mutex, pop the first message only, return asVecof 1QueueMode::All: on each read, lock mutex, drain all queued messages, return the full vec
Both queues are passed to AgentLoopConfig as closures (get_steering_messages, get_follow_up_messages) that capture the Arc<Mutex<>> pointer, enabling external callers to enqueue messages while the agent loop is running on another task.
Event Hook Ordering
All hooks fire in a guaranteed strict order relative to their paired events. This ordering is enforced at runtime and is an invariant of the system:
before_loop → AgentStart
before_turn → TurnStart
[MessageStart/End for initial prompts — first turn of agent_loop() only]
[MessageStart/End for injected steering messages]
[LLM: MessageStart → MessageUpdate* → MessageEnd]
[per tool call:]
before_tool_execution → ToolExecutionStart
(before_tool_execution_update → ToolExecutionUpdate → after_tool_execution_update)*
ToolExecutionEnd → after_tool_execution
TurnEnd → after_turn
(repeat inner block for each follow-up / steering-triggered turn)
AgentEnd → after_loop
Short-circuit rules — hook returns false:
| Hook | When false is returned | Behaviour |
|---|---|---|
before_loop | Before AgentStart | Loop is aborted; AgentEnd { messages: [] } is emitted; function returns immediately |
before_turn | Before TurnStart | Turn is skipped; TurnStart/TurnEnd are not emitted; AgentEnd is not guaranteed |
before_tool_execution | Before ToolExecutionStart | Tool call is skipped; ToolExecutionStart/End are not emitted; a skipped error ToolResult is returned to the LLM |
before_tool_execution_update | Before ToolExecutionUpdate | Event is suppressed; after_tool_execution_update is not called; tool keeps running and final ToolResult is unaffected |
7. Error Handling Strategy
Provider Errors (ProviderError)
| Error | Retryable | Handling |
|---|---|---|
RateLimited { retry_after_ms } | Yes | Exponential backoff; respects Retry-After header if present |
Network(msg) | Yes | Exponential backoff |
Auth(msg) | No | Propagated immediately as StopReason::Error message |
Api(msg) | No | Propagated as StopReason::Error message |
ContextOverflow { msg } | No | Detected on HTTP 400/413; triggers compaction on next turn (see below) |
Cancelled | No | Loop exits cleanly, AgentEnd emitted |
Other(msg) | No | Propagated as StopReason::Error message |
Context Overflow Recovery
- Provider returns HTTP 400/413 matching any of 15+ known overflow phrases.
ProviderError::classify()returnsContextOverflow.- The overflow may arrive as an HTTP error (caught in retry loop) or as a streaming error event (
StreamEvent::Errorwith matching message), caught byMessage::is_context_overflow(). - On the next turn, if
context_configis set,compact_messages()is called before the LLM call. - If no
context_configis set, the error message is included in conversation history and the loop continues — the LLM may self-recover or the next turn will also fail.
Tool Errors (ToolError)
Cancelled: Tool execution skipped;ToolResultcontent = "Skipped due to queued user message." withis_error: trueFailed(msg): Converted toToolResultwith error text;is_error: true; always returned to LLM so it can self-correctInvalidArgs(msg): Same as Failed; LLM can retry with corrected parametersNotFound(msg): Produced when tool name inToolCallhas no matchingAgentTool; same handling as Failed
Input Filter Errors
Reject(reason): EmitsAgentEvent::InputRejected, immediately emitsAgentEvent::AgentEnd { messages: [] }, returns empty message listWarn(msg): Warning text appended to last user message content; loop continues
Execution Limit Exhaustion
- When any limit is exceeded, a synthetic user message
[Agent stopped: {reason}]is appended to context and emitted as events. - Loop returns immediately after appending the message.
- No error is thrown;
AgentEndis emitted normally.
Before-Turn Abort
- If
before_turncallback returnsfalse, the loop returns immediately with noAgentEndemitted. - This is the only path where
AgentEndis not guaranteed.
Error Propagation Across Components
Provider → ProviderError → stream_assistant_response() → Message{stop_reason: Error}
↓
on_error callback invoked
↓
AgentEvent::TurnEnd emitted
↓
agent loop returns