The Agent Loop
The agent loop is the core of phi-core. It implements the fundamental cycle:
User prompt → LLM call → Tool execution → LLM call → ... → Final response
The agent_loop module contains the core loop logic in mod.rs and the evaluation sub-module for evaluational parallelism strategies.
How It Works
```
┌──────────────────────────────────────────────┐
│  agent_loop()                                │
│                                              │
│  1. Add prompts to context                   │
│  2. Emit AgentStart + TurnStart              │
│                                              │
│  ┌─────────── Inner Loop ──────────────┐     │
│  │ • Check steering messages           │     │
│  │ • Check execution limits            │     │
│  │ • Compact context (if configured)   │     │
│  │ • Stream LLM response               │     │
│  │ • Extract tool calls                │     │
│  │ • Execute tools (with steering)     │     │
│  │ • Emit TurnEnd                      │     │
│  │ • Continue if tool_calls or steer   │     │
│  └─────────────────────────────────────┘     │
│                                              │
│  3. Check follow-up messages                 │
│  4. If follow-ups exist, loop again          │
│  5. Emit AgentEnd                            │
└──────────────────────────────────────────────┘
```
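The control flow in the diagram can be sketched as two nested loops. This is a simplified, synchronous simulation and not phi-core's implementation: `ScriptedTurn` and `simulate_loop` are hypothetical names, steering, limits and compaction are left out, and the log entries other than `AgentStart`, `TurnStart`, `TurnEnd` and `AgentEnd` are made up for illustration.

```rust
use std::collections::VecDeque;

/// Stand-in for one scripted LLM turn: how many tool calls the reply contains.
/// (Hypothetical type; the real loop streams a provider response.)
pub struct ScriptedTurn {
    pub tool_calls: usize,
}

/// Simulates the cycle in the diagram and returns the log entries it would
/// produce, in order.
pub fn simulate_loop(
    mut turns: VecDeque<ScriptedTurn>,
    mut follow_ups: VecDeque<&'static str>,
) -> Vec<String> {
    let mut log = vec!["AgentStart".to_string()];
    loop {
        // Inner loop: keep calling the model while it requests tools.
        loop {
            log.push("TurnStart".to_string());
            // When the script runs out, treat it as a reply with no tool calls.
            let turn = turns.pop_front().unwrap_or(ScriptedTurn { tool_calls: 0 });
            for i in 0..turn.tool_calls {
                log.push(format!("tool call {i}"));
            }
            log.push("TurnEnd".to_string());
            if turn.tool_calls == 0 {
                break; // nothing left to execute, so the agent would stop here
            }
        }
        // Outer loop: a queued follow-up restarts the inner loop.
        match follow_ups.pop_front() {
            Some(msg) => log.push(format!("follow-up: {msg}")),
            None => break,
        }
    }
    log.push("AgentEnd".to_string());
    log
}
```

A script of one turn with two tool calls followed by a plain reply yields two `TurnStart`/`TurnEnd` pairs between a single `AgentStart` and `AgentEnd`.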
Entry Points
agent_loop()
Starts a new agent run with prompt messages:
```rust
pub async fn agent_loop(
    prompts: Vec<AgentMessage>,
    context: &mut AgentContext,
    config: &AgentLoopConfig,
    tx: mpsc::UnboundedSender<AgentEvent>,
    cancel: CancellationToken,
) -> Vec<AgentMessage>
```
The prompts are added to context, then the loop runs. Returns all new messages generated during the run.
agent_loop_continue()
Resumes from existing context (e.g., after an error, retry, or branch):
```rust
pub async fn agent_loop_continue(
    context: &mut AgentContext,
    config: &AgentLoopConfig,
    tx: mpsc::UnboundedSender<AgentEvent>,
    cancel: CancellationToken,
) -> Vec<AgentMessage>
```
Preconditions: context.agent_id and context.session_id must be Some — the function panics with a descriptive message otherwise. In practice, any context that passed through agent_loop() at least once already has these set. When constructing a context manually (e.g., from a persisted snapshot), set them explicitly before calling this function.
The last message in context must also not be an assistant message.
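Because a violated precondition panics, it can help to validate a manually restored context before resuming. The sketch below uses stand-in types (`Role`, `ContextSketch`) rather than phi-core's `AgentContext`, whose exact fields are assumed here; only the three checks themselves come from the text above.

```rust
/// Minimal stand-ins for the real types (assumed shapes, not phi-core's definitions).
#[derive(Debug, Clone, PartialEq)]
pub enum Role {
    User,
    Assistant,
    ToolResult,
}

#[derive(Debug, Default)]
pub struct ContextSketch {
    pub agent_id: Option<String>,
    pub session_id: Option<String>,
    /// Role of each message in the context, oldest first.
    pub roles: Vec<Role>,
}

/// Returns `Err` with a description instead of panicking, so a caller can
/// reject a bad snapshot before handing it to the loop.
pub fn check_continue_preconditions(ctx: &ContextSketch) -> Result<(), String> {
    if ctx.agent_id.is_none() {
        return Err("context.agent_id must be Some".to_string());
    }
    if ctx.session_id.is_none() {
        return Err("context.session_id must be Some".to_string());
    }
    if ctx.roles.last() == Some(&Role::Assistant) {
        return Err("last message must not be an assistant message".to_string());
    }
    Ok(())
}
```

A context restored from a snapshot would set both ids and end on a user or tool-result message before this check passes.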
AgentLoopConfig
```rust
pub struct AgentLoopConfig {
    /// REQUIRED: complete provider identity (model id, api_key, base_url, protocol, cost rates).
    pub model_config: ModelConfig,
    /// Optional override: bypasses ProviderRegistry, used for MockProvider in tests.
    pub provider_override: Option<Arc<dyn StreamProvider>>,
    pub config_id: Option<String>,
    pub thinking_level: ThinkingLevel,
    pub max_tokens: Option<u32>,
    pub temperature: Option<f32>,
    pub convert_to_llm: Option<ConvertToLlmFn>,
    pub transform_context: Option<TransformContextFn>,
    pub get_steering_messages: Option<GetMessagesFn>,
    pub get_follow_up_messages: Option<GetMessagesFn>,
    pub context_config: Option<ContextConfig>,
    pub execution_limits: Option<ExecutionLimits>,
    pub cache_config: CacheConfig,
    pub tool_execution: ToolExecutionStrategy,
    pub retry_config: RetryConfig,
    pub before_loop: Option<BeforeLoopFn>,
    pub after_loop: Option<AfterLoopFn>,
    pub before_turn: Option<BeforeTurnFn>,
    pub after_turn: Option<AfterTurnFn>,
    pub on_error: Option<OnErrorFn>,
    pub before_tool_execution: Option<BeforeToolExecutionFn>,
    pub after_tool_execution: Option<AfterToolExecutionFn>,
    pub before_tool_execution_update: Option<BeforeToolExecutionUpdateFn>,
    pub after_tool_execution_update: Option<AfterToolExecutionUpdateFn>,
    pub before_compaction_start: Option<BeforeCompactionStartFn>,
    pub after_compaction_end: Option<AfterCompactionEndFn>,
    pub input_filters: Vec<Arc<dyn InputFilter>>,
    pub first_turn_trigger: TurnTrigger,
    pub context_translation: Option<Arc<dyn ContextTranslationStrategy>>,
    pub prun_pending: Option<Arc<Mutex<Vec<PrunRequest>>>>,
}
```
| Field | Purpose |
|---|---|
| `model_config` | Required. Complete provider identity: model id, api_key, base_url, api protocol, cost rates, compat flags. The provider is resolved from `model_config.api` via `ProviderRegistry`. |
| `provider_override` | Custom `Arc<dyn StreamProvider>`; bypasses the registry when `Some`. Used for `MockProvider` in tests or fully custom backends. |
| `config_id` | Optional stable identity for this config; auto-derived as `"{provider_id}.{model_slug}[.thinking]"` when `None`. Used as the middle segment of `loop_id`. |
| `thinking_level` | `Off`, `Minimal`, `Low`, `Medium`, `High` |
| `convert_to_llm` | Custom `AgentMessage[]` → `Message[]` conversion |
| `transform_context` | Pre-processing hook for context pruning |
| `get_steering_messages` | Returns user interruptions during tool execution |
| `get_follow_up_messages` | Returns queued work after the agent would stop |
| `context_config` | Token budget and compaction settings |
| `execution_limits` | Max turns, tokens, duration |
| `cache_config` | Prompt caching behavior (see Prompt Caching) |
| `tool_execution` | `Parallel`, `Sequential`, or `Batched` (see Tools) |
| `retry_config` | Retry behavior for transient errors (see Retry) |
| `before_loop` | Called once before `AgentStart`; return `false` to abort the entire run (see Callbacks) |
| `after_loop` | Called once after `AgentEnd` with all new messages and accumulated usage (see Callbacks) |
| `before_turn` | Called before each LLM call; return `false` to abort (see Callbacks) |
| `after_turn` | Called after each turn with messages and usage (see Callbacks) |
| `on_error` | Called on `StopReason::Error` with the error string (see Callbacks) |
| `before_tool_execution` | Called before each tool call; return `false` to skip it (see Callbacks) |
| `after_tool_execution` | Called after each tool call completes (see Callbacks) |
| `before_tool_execution_update` | Called before each streaming tool update; return `false` to suppress the event (see Callbacks) |
| `after_tool_execution_update` | Called after each streaming tool update event (see Callbacks) |
| `before_compaction_start` | Called before compaction starts with `(estimated_tokens, message_count)`; return `false` to skip compaction for this cycle (see Callbacks) |
| `after_compaction_end` | Called after compaction completes with `(messages_before, messages_after, tokens_before, tokens_after)` (see Callbacks) |
| `input_filters` | Input filters applied to user messages before the LLM call (see Tools) |
| `first_turn_trigger` | The `TurnTrigger` for the first `TurnStart` event; defaults to `TurnTrigger::User`, set to `SubAgent` by sub-agent callers |
| `context_translation` | Optional `ContextTranslationStrategy` for cross-provider compatibility; translates content types (e.g., `Content::Thinking`) when targeting a different provider (G8) |
| `prun_pending` | Shared state for `PrunTool` to communicate pruning requests to the loop; set automatically by `with_prun_tool()` |
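The `config_id` fallback in the table can be read as a small formatting rule. The sketch below is one interpretation, not the library's code: `resolve_config_id` is a hypothetical helper, the model slug is taken as already computed, and the `.thinking` suffix is assumed to be appended whenever the thinking level is anything other than `Off`.

```rust
/// Mirrors the variants listed in the table; defined locally for the sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ThinkingLevel {
    Off,
    Minimal,
    Low,
    Medium,
    High,
}

/// Returns the explicit id when one is set, otherwise derives
/// "{provider_id}.{model_slug}[.thinking]".
/// ASSUMPTION: the suffix is added for any level other than `Off`.
pub fn resolve_config_id(
    explicit: Option<&str>,
    provider_id: &str,
    model_slug: &str,
    thinking: ThinkingLevel,
) -> String {
    if let Some(id) = explicit {
        return id.to_string();
    }
    let mut id = format!("{provider_id}.{model_slug}");
    if thinking != ThinkingLevel::Off {
        id.push_str(".thinking");
    }
    id
}
```

Under these assumptions, a provider `acme` with slug `model-x` gives `acme.model-x`, or `acme.model-x.thinking` once a thinking level is enabled; both names are placeholders.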
0.9.0: async lifecycle hooks.
`BeforeLoopFn`, `AfterLoopFn`, `BeforeTurnFn`, `AfterTurnFn`, `OnErrorFn`, `BeforeToolExecutionFn`, `AfterToolExecutionFn`, `BeforeCompactionStartFn`, and `AfterCompactionEndFn` are now async: their function bodies return `Pin<Box<dyn Future<Output = T> + Send>>` (alias: `HookFuture<'_, T>`). Sync closure bodies migrate by wrapping in `Box::pin(async move { ... })`. Closures can now `.await` LLM calls and other async work directly without a `tokio::task::block_in_place` bridge.
Pre-existing-behaviour preservation note (phi-core 0.9.0): tool-update hooks stay sync.
`BeforeToolExecutionUpdateFn` and `AfterToolExecutionUpdateFn` remain `Arc<dyn Fn(&str, &str, &str) -> bool + Send + Sync>` and `Arc<dyn Fn(&str, &str, &str) + Send + Sync>` respectively. Async-ifying them would cascade into the `ToolUpdateFn` callback type and every `AgentTool::execute` body that invokes `ctx.on_update(...)`, a materially wider migration than the 0.9.0 scope. The veto decision in `BeforeToolExecutionUpdateFn` is synchronous so the surrounding emit gate works without an `.await` at every streamed tool update; consumers that need async work at update time should dispatch via `tokio::spawn(...)` inside the sync closure body. Tracked under the CHANGELOG `[Unreleased]` "Forward markers" section for a future release.
`InputFilter::filter()` is also now an `async fn` via `#[async_trait]`; see Tools and the per-turn debug-capture surface at `debugging.md`.
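The `Box::pin(async move { ... })` migration can be illustrated with std-only types. In this sketch `HookFuture` follows the alias shape quoted above (the lifetime bound is an assumption), while `TurnGateFn`, `max_turns_gate` and `block_on` are hypothetical names invented for the example; the real hook types take phi-core arguments and are awaited by the loop itself.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Assumed shape of the alias named in the note above.
pub type HookFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

/// Hypothetical hook type for illustration only.
pub type TurnGateFn = Arc<dyn Fn(usize) -> HookFuture<'static, bool> + Send + Sync>;

/// A formerly synchronous predicate (`turn < limit`) migrated to the async
/// shape by wrapping its body in `Box::pin(async move { ... })`.
pub fn max_turns_gate(limit: usize) -> TurnGateFn {
    Arc::new(move |turn: usize| -> HookFuture<'static, bool> {
        Box::pin(async move { turn < limit })
    })
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Tiny busy-polling executor so the sketch runs without an async runtime.
/// Real code would `.await` the hook inside the async loop instead.
pub fn block_on<T>(mut fut: HookFuture<'_, T>) -> T {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

The explicit return type on the closure lets the boxed async block coerce to the trait-object future without relying on inference.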
Steering & Follow-Ups
Steering
Steering messages interrupt the agent between tool executions. When the agent is executing multiple tool calls from a single LLM response, steering is checked after each tool completes. If a steering message is found:
- The current tool finishes normally
- All remaining tool calls are skipped with `is_error: true` and "Skipped due to queued user message"
- The steering message is injected into context
- The loop continues with a new LLM call that sees the interruption
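The skip behavior in the list above can be sketched as a sequential executor that polls for steering after every tool. `ToolResultSketch` and `run_tools_with_steering` are hypothetical names, tools are represented by their names only, and the "execution" is a placeholder string; only the skip rule and its message come from the text.

```rust
#[derive(Debug, PartialEq)]
pub struct ToolResultSketch {
    pub tool: String,
    pub is_error: bool,
    pub output: String,
}

/// Runs tool calls in order and polls for steering after each one.
/// Returns the tool results plus any steering messages that were found.
pub fn run_tools_with_steering(
    tool_calls: &[&str],
    mut poll_steering: impl FnMut() -> Vec<String>,
) -> (Vec<ToolResultSketch>, Vec<String>) {
    let mut results = Vec::new();
    let mut steering = Vec::new();
    for (i, tool) in tool_calls.iter().enumerate() {
        // The current tool always finishes normally.
        results.push(ToolResultSketch {
            tool: tool.to_string(),
            is_error: false,
            output: format!("{tool} ok"),
        });
        steering = poll_steering();
        if !steering.is_empty() {
            // Every remaining call is reported as skipped, not executed.
            for skipped in &tool_calls[i + 1..] {
                results.push(ToolResultSketch {
                    tool: skipped.to_string(),
                    is_error: true,
                    output: "Skipped due to queued user message".to_string(),
                });
            }
            break;
        }
    }
    (results, steering)
}
```

The caller would then append the returned steering messages to the context and start the next LLM call, so the model sees both the skipped results and the interruption.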
```rust
// While agent is running tools, redirect it:
agent.steer(AgentMessage::Llm(Message::user("Stop that. Instead, explain what you found.")));
```
Follow-Ups
Follow-up messages are checked after the agent would normally stop (no more tool calls, no steering). If follow-ups exist, the loop continues with them as new input — the agent doesn't need to be re-prompted.
```rust
// Queue work for after the agent finishes its current task:
agent.follow_up(AgentMessage::Llm(Message::user("Now run the tests.")));
agent.follow_up(AgentMessage::Llm(Message::user("Then commit the changes.")));
```
Queue Modes
Both queues support two delivery modes:
| Mode | Behavior |
|---|---|
| `QueueMode::OneAtATime` | Delivers one message per turn (default) |
| `QueueMode::All` | Delivers all queued messages at once |
```rust
agent.set_steering_mode(QueueMode::All);
agent.set_follow_up_mode(QueueMode::OneAtATime);
```
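The difference between the two modes comes down to how much of the queue is drained per check. A minimal sketch, assuming the queue is a plain FIFO of strings; the local `QueueMode` enum mirrors the documented variants and `take_queued` is a hypothetical helper, not part of the agent API.

```rust
use std::collections::VecDeque;

/// Local mirror of the two documented delivery modes.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum QueueMode {
    OneAtATime,
    All,
}

/// Removes and returns the messages that one check would deliver.
pub fn take_queued(queue: &mut VecDeque<String>, mode: QueueMode) -> Vec<String> {
    match mode {
        // Oldest message only; the rest wait for later checks.
        QueueMode::OneAtATime => queue.pop_front().into_iter().collect(),
        // Everything currently queued, in arrival order.
        QueueMode::All => queue.drain(..).collect(),
    }
}
```

With three queued messages, `OneAtATime` spreads them over three checks, while `All` hands them to the model together in a single turn.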
Queue Management
```rust
agent.clear_steering_queue();   // Drop all pending steers
agent.clear_follow_up_queue();  // Drop all pending follow-ups
agent.clear_all_queues();       // Drop everything
```
Low-Level API
When using agent_loop() directly, steering and follow-ups are provided via callback functions:
```rust
let config = AgentLoopConfig {
    get_steering_messages: Some(Box::new(|| {
        // Return Vec<AgentMessage>; checked between tool calls
        vec![]
    })),
    get_follow_up_messages: Some(Box::new(|| {
        // Return Vec<AgentMessage>; checked when agent would stop
        vec![]
    })),
    // ...
};
```
Custom Compaction
By default, when context exceeds the token budget in ContextConfig, phi-core runs a 3-level compaction strategy: truncate tool outputs → summarize old turns → drop middle messages (legacy in-memory path via compact_messages()). When a Session is available, the modern system uses non-destructive CompactionBlock overlays — see compaction. You can replace this with your own CompactionStrategy.
CompactionStrategy vs BlockCompactionStrategy
- `CompactionStrategy`: Legacy in-memory approach. Destructive: it mutates the message list directly. Used when `AgentContext.session` is `None` (no session persistence).
- `BlockCompactionStrategy`: New overlay approach. Non-destructive: it creates a `CompactionBlock` on the `LoopRecord` rather than altering the original messages. Used when `AgentContext.session` is `Some` (session-backed execution). Original messages remain authoritative for replay and branching.
Example of a custom CompactionStrategy:
```rust
use phi_core::context::{CompactionStrategy, ContextConfig, CompactionConfig, compact_messages};
use phi_core::types::*;
use std::sync::Arc;

struct MyCompaction;

impl CompactionStrategy for MyCompaction {
    fn compact(
        &self,
        messages: Vec<AgentMessage>,
        config: &ContextConfig,
    ) -> Vec<AgentMessage> {
        // Your logic here, then optionally delegate to the default:
        compact_messages(messages, config)
    }
}

// Modern pattern: set strategies via ContextConfig.compaction
let context_config = ContextConfig {
    compaction: CompactionConfig {
        // in_memory_strategy: used when AgentContext.session is None (sub-agents, tests)
        in_memory_strategy: Some(Arc::new(MyCompaction)),
        // block_strategy: used when AgentContext.session is Some (session-backed execution)
        // block_strategy: Some(Arc::new(MyBlockCompaction)),
        ..CompactionConfig::default()
    },
    ..ContextConfig::default()
};

let agent = BasicAgent::new(model_config)
    .with_context_config(context_config);
```
The in-memory strategy is called once per turn, right before the LLM call, whenever context_config is Some and AgentContext.session is None. When in_memory_strategy is None, DefaultCompaction (which wraps compact_messages()) is used automatically. When a session is present, block_strategy is used instead (defaulting to DefaultBlockCompaction).
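The selection rules in the paragraph above amount to a small decision table. The sketch below encodes them with plain booleans; `Chosen` and `choose_strategy` are hypothetical names, and it is an assumption that the session-backed path also requires `context_config` to be `Some`.

```rust
/// Which compaction implementation would run for a given turn.
#[derive(Debug, PartialEq)]
pub enum Chosen {
    CustomInMemory,
    DefaultCompaction,
    CustomBlock,
    DefaultBlockCompaction,
}

/// Returns `None` when no compaction is configured at all.
/// ASSUMPTION: without a `context_config`, neither path runs.
pub fn choose_strategy(
    has_context_config: bool,
    has_session: bool,
    has_in_memory_strategy: bool,
    has_block_strategy: bool,
) -> Option<Chosen> {
    if !has_context_config {
        return None;
    }
    Some(match (has_session, has_in_memory_strategy, has_block_strategy) {
        // No session: in-memory path, custom strategy first.
        (false, true, _) => Chosen::CustomInMemory,
        (false, false, _) => Chosen::DefaultCompaction,
        // Session present: block path, custom strategy first.
        (true, _, true) => Chosen::CustomBlock,
        (true, _, false) => Chosen::DefaultBlockCompaction,
    })
}
```

Note that a custom `in_memory_strategy` has no effect once a session is attached; in that case only `block_strategy` is consulted.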
Use Cases
Memory-aware compaction — Index messages into a vector store before they're dropped, so the agent can recall them later via a search tool:
```rust
struct MemoryAwareCompaction {
    memory: Arc<dyn MemoryStore>,
}

impl CompactionStrategy for MemoryAwareCompaction {
    fn compact(
        &self,
        messages: Vec<AgentMessage>,
        config: &ContextConfig,
    ) -> Vec<AgentMessage> {
        let compacted = compact_messages(messages.clone(), config);

        // Index what was dropped
        let dropped: Vec<_> = messages.iter()
            .filter(|m| !compacted.contains(m))
            .collect();
        if !dropped.is_empty() {
            self.memory.index(dropped);
        }

        compacted
    }
}
```
Semantic pointer compaction — Replace dropped messages with a marker so the agent knows context was lost:
```rust
struct SemanticPointerCompaction;

impl CompactionStrategy for SemanticPointerCompaction {
    fn compact(
        &self,
        messages: Vec<AgentMessage>,
        config: &ContextConfig,
    ) -> Vec<AgentMessage> {
        let compacted = compact_messages(messages.clone(), config);
        let dropped_count = messages.len() - compacted.len();
        if dropped_count == 0 {
            return compacted;
        }

        // Insert a marker after the first kept messages
        let mut result = compacted;
        let insert_at = config.compaction.keep_first_turns.min(result.len());
        result.insert(insert_at, AgentMessage::Extension(
            ExtensionMessage::new("compaction_marker", serde_json::json!({
                "dropped": dropped_count,
                "note": format!("{} earlier messages were compacted", dropped_count),
            }))
        ));
        result
    }
}
```
Priority-preserving compaction — Never drop messages containing important keywords:
```rust
struct PriorityPreservingCompaction {
    preserve_keywords: Vec<String>,
}

impl CompactionStrategy for PriorityPreservingCompaction {
    fn compact(
        &self,
        messages: Vec<AgentMessage>,
        config: &ContextConfig,
    ) -> Vec<AgentMessage> {
        // `is_priority` is a helper you write yourself (not shown): it should
        // return true when a message contains one of `preserve_keywords`.
        let (priority, normal): (Vec<_>, Vec<_>) = messages.into_iter()
            .partition(|m| self.is_priority(m));

        let mut compacted = compact_messages(normal, config);

        // Re-insert priority messages; they're never dropped.
        // Note: this appends them after the compacted messages, so their
        // original position in the conversation is not preserved.
        for msg in priority {
            compacted.push(msg);
        }
        compacted
    }
}
```
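If conversation order matters, one option is to tag each message with its original index before partitioning and sort on that index after merging. The sketch below demonstrates the idea on plain strings: `keep_newest` is a stand-in for `compact_messages()` (it simply keeps the newest items), and `compact_preserving_priority` is a hypothetical function, not a phi-core API.

```rust
/// Stand-in for `compact_messages()`: keeps only the newest `keep_last` items.
fn keep_newest<T>(items: Vec<T>, keep_last: usize) -> Vec<T> {
    let skip = items.len().saturating_sub(keep_last);
    items.into_iter().skip(skip).collect()
}

/// Compacts non-priority messages while keeping priority ones in place.
pub fn compact_preserving_priority(
    messages: Vec<String>,
    keywords: &[&str],
    keep_last: usize,
) -> Vec<String> {
    // Tag each message with its original position before splitting.
    let (priority, normal): (Vec<_>, Vec<_>) = messages
        .into_iter()
        .enumerate()
        .partition(|(_, m)| keywords.iter().any(|k| m.contains(*k)));

    // Only the non-priority messages are subject to compaction.
    let mut merged = keep_newest(normal, keep_last);
    merged.extend(priority);

    // Restore conversation order using the saved positions.
    merged.sort_by_key(|(index, _)| *index);
    merged.into_iter().map(|(_, m)| m).collect()
}
```

The same index-tagging approach should carry over to `Vec<AgentMessage>`, provided the delegate compaction does not rewrite or merge the messages it keeps.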
Evaluational Parallelism
agent_loop_parallel runs the same prompt through multiple AgentLoopConfigs concurrently, evaluates the results with a pluggable EvaluationStrategy, and returns the winning branch. This is useful for multi-model comparison, A/B prompt testing, and selecting the best response among different reasoning approaches.
```rust
use phi_core::{agent_loop_parallel, PickFirstEvaluation, AgentContext, AgentLoopConfig};
use std::sync::Arc;

let result = agent_loop_parallel(
    prompts,
    base_context,             // cloned per branch; Arc tools shared
    vec![config_a, config_b],
    Arc::new(PickFirstEvaluation),
    tx,
    cancel,
).await;

// result.selected_context feeds directly into agent_loop_continue()
// result.selected_messages is the winning branch's output
```
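The run-then-select pattern can be sketched with OS threads and a small trait. Everything here is hypothetical and std-only: `BranchResult`, `EvaluationSketch`, `PickFirst`, `PickLongest` and `run_parallel` are illustration names, the "model call" is a formatted string, and the real `EvaluationStrategy` trait may have a different signature.

```rust
use std::thread;

#[derive(Debug, Clone, PartialEq)]
pub struct BranchResult {
    pub config_name: String,
    pub response: String,
}

/// Hypothetical strategy shape: return the index of the winning branch.
pub trait EvaluationSketch: Send + Sync {
    fn select(&self, branches: &[BranchResult]) -> usize;
}

/// Always picks the first branch.
pub struct PickFirst;

impl EvaluationSketch for PickFirst {
    fn select(&self, _branches: &[BranchResult]) -> usize {
        0
    }
}

/// Picks the branch with the longest response (a toy scoring rule).
pub struct PickLongest;

impl EvaluationSketch for PickLongest {
    fn select(&self, branches: &[BranchResult]) -> usize {
        branches
            .iter()
            .enumerate()
            .max_by_key(|(_, b)| b.response.len())
            .map(|(i, _)| i)
            .unwrap_or(0)
    }
}

/// Runs one fake "branch" per config on its own thread, then asks the
/// strategy which result wins. Returns `None` when there are no configs.
pub fn run_parallel(
    prompt: &str,
    configs: &[&str],
    strategy: &dyn EvaluationSketch,
) -> Option<BranchResult> {
    let handles: Vec<_> = configs
        .iter()
        .map(|name| {
            let name = name.to_string();
            let prompt = prompt.to_string();
            thread::spawn(move || BranchResult {
                response: format!("[{name}] reply to: {prompt}"),
                config_name: name,
            })
        })
        .collect();

    // Joining in spawn order keeps branch indices aligned with `configs`.
    let branches: Vec<BranchResult> = handles
        .into_iter()
        .map(|h| h.join().expect("branch thread panicked"))
        .collect();
    if branches.is_empty() {
        return None;
    }
    let winner = strategy.select(&branches);
    branches.into_iter().nth(winner)
}
```

Swapping `PickFirst` for `PickLongest` changes the winner without touching the branch runner, which is the point of keeping the evaluation step pluggable.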
See Evaluational Parallelism for the full guide including built-in strategies, the LLM judge, and session continuity.