The Agent Loop

The agent loop is the core of phi-core. It implements the fundamental cycle:

User prompt → LLM call → Tool execution → LLM call → ... → Final response

The agent_loop module keeps the core loop logic in mod.rs; its evaluation sub-module holds the strategies used for evaluational parallelism.

How It Works

┌──────────────────────────────────────────────┐
│                  agent_loop()                │
│                                              │
│  1. Add prompts to context                   │
│  2. Emit AgentStart + TurnStart              │
│                                              │
│  ┌─────────── Inner Loop ──────────────┐     │
│  │  • Check steering messages          │     │
│  │  • Check execution limits           │     │
│  │  • Compact context (if configured)  │     │
│  │  • Stream LLM response              │     │
│  │  • Extract tool calls               │     │
│  │  • Execute tools (with steering)    │     │
│  │  • Emit TurnEnd                     │     │
│  │  • Continue if tool_calls or steer  │     │
│  └─────────────────────────────────────┘     │
│                                              │
│  3. Check follow-up messages                 │
│  4. If follow-ups exist, loop again          │
│  5. Emit AgentEnd                            │
└──────────────────────────────────────────────┘
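The control flow in the diagram can be sketched as two nested loops. The following is a toy model under stated assumptions, not phi-core's implementation: `ScriptedModel`, `Event`, and `run_toy_loop` are hypothetical names, the "LLM" is a scripted list of tool-call counts, and steering and compaction are left out to keep the shape visible.

```rust
use std::collections::VecDeque;

/// Events recorded by the toy loop, named after the ones in the diagram.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    AgentStart,
    TurnStart,
    TurnEnd,
    AgentEnd,
}

/// Hypothetical stand-in for an LLM: each entry is the number of tool
/// calls the "model" requests on that turn (0 means a final response).
struct ScriptedModel {
    tool_calls_per_turn: VecDeque<usize>,
}

impl ScriptedModel {
    fn next_turn(&mut self) -> usize {
        self.tool_calls_per_turn.pop_front().unwrap_or(0)
    }
}

/// Runs the outer (follow-up) loop around the inner (turn) loop.
fn run_toy_loop(
    model: &mut ScriptedModel,
    follow_ups: &mut VecDeque<String>,
    max_turns: usize,
) -> Vec<Event> {
    let mut events = vec![Event::AgentStart];
    let mut turns = 0;

    'outer: loop {
        // Inner loop: one iteration per simulated LLM call.
        loop {
            if turns >= max_turns {
                break 'outer; // execution limit reached
            }
            turns += 1;
            events.push(Event::TurnStart);
            let tool_calls = model.next_turn();
            // Tool execution would happen here.
            events.push(Event::TurnEnd);
            if tool_calls == 0 {
                break; // no tool calls: the agent would stop
            }
        }
        // Outer loop: a queued follow-up restarts the inner loop.
        if follow_ups.pop_front().is_none() {
            break;
        }
    }

    events.push(Event::AgentEnd);
    events
}
```

With a script of `[2, 0, 0]` and one queued follow-up, the toy loop runs three turns: one that requests tools, one that ends the first task, and one triggered by the follow-up.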

Entry Points

`agent_loop()`

Starts a new agent run with prompt messages:

pub async fn agent_loop(
    prompts: Vec<AgentMessage>,
    context: &mut AgentContext,
    config: &AgentLoopConfig,
    tx: mpsc::UnboundedSender<AgentEvent>,
    cancel: CancellationToken,
) -> Vec<AgentMessage>

The prompts are added to context, then the loop runs. Returns all new messages generated during the run.

`agent_loop_continue()`

Resumes from existing context (e.g., after an error, retry, or branch):

pub async fn agent_loop_continue(
    context: &mut AgentContext,
    config: &AgentLoopConfig,
    tx: mpsc::UnboundedSender<AgentEvent>,
    cancel: CancellationToken,
) -> Vec<AgentMessage>

Preconditions: context.agent_id and context.session_id must both be Some; otherwise the function panics with a descriptive message. In practice, any context that has passed through agent_loop() at least once already has these set. When constructing a context manually (e.g., from a persisted snapshot), set them explicitly before calling this function.

In addition, the last message in context must not be an assistant message.
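A caller restoring a context from a snapshot may want to validate these preconditions up front instead of hitting the panic. The sketch below uses toy stand-ins (`ToyContext`, `Role`, and `check_continue_preconditions` are hypothetical and not part of phi-core); the real `AgentContext` and `AgentMessage` types have a different shape.

```rust
/// Toy message role; the real message types live in phi-core.
#[derive(Debug, Clone, PartialEq)]
enum Role {
    User,
    Assistant,
    ToolResult,
}

/// Toy context holding only the fields the preconditions mention.
struct ToyContext {
    agent_id: Option<String>,
    session_id: Option<String>,
    messages: Vec<Role>,
}

/// Returns Ok(()) when the documented preconditions hold, or a
/// description of the first violation found.
fn check_continue_preconditions(ctx: &ToyContext) -> Result<(), String> {
    if ctx.agent_id.is_none() {
        return Err("agent_id is None".to_string());
    }
    if ctx.session_id.is_none() {
        return Err("session_id is None".to_string());
    }
    if ctx.messages.last() == Some(&Role::Assistant) {
        return Err("last message is an assistant message".to_string());
    }
    Ok(())
}
```

Returning a `Result` lets the caller decide how to recover, for example by appending a user message before resuming.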

AgentLoopConfig

pub struct AgentLoopConfig {
    /// REQUIRED. Complete provider identity: model id, api_key, base_url, protocol, cost rates.
    pub model_config: ModelConfig,
    /// Optional override: bypasses ProviderRegistry; used for MockProvider in tests.
    pub provider_override: Option<Arc<dyn StreamProvider>>,
    pub config_id: Option<String>,
    pub thinking_level: ThinkingLevel,
    pub max_tokens: Option<u32>,
    pub temperature: Option<f32>,
    pub convert_to_llm: Option<ConvertToLlmFn>,
    pub transform_context: Option<TransformContextFn>,
    pub get_steering_messages: Option<GetMessagesFn>,
    pub get_follow_up_messages: Option<GetMessagesFn>,
    pub context_config: Option<ContextConfig>,
    pub execution_limits: Option<ExecutionLimits>,
    pub cache_config: CacheConfig,
    pub tool_execution: ToolExecutionStrategy,
    pub retry_config: RetryConfig,
    pub before_loop: Option<BeforeLoopFn>,
    pub after_loop: Option<AfterLoopFn>,
    pub before_turn: Option<BeforeTurnFn>,
    pub after_turn: Option<AfterTurnFn>,
    pub on_error: Option<OnErrorFn>,
    pub before_tool_execution: Option<BeforeToolExecutionFn>,
    pub after_tool_execution: Option<AfterToolExecutionFn>,
    pub before_tool_execution_update: Option<BeforeToolExecutionUpdateFn>,
    pub after_tool_execution_update: Option<AfterToolExecutionUpdateFn>,
    pub before_compaction_start: Option<BeforeCompactionStartFn>,
    pub after_compaction_end: Option<AfterCompactionEndFn>,
    pub input_filters: Vec<Arc<dyn InputFilter>>,
    pub first_turn_trigger: TurnTrigger,
    pub context_translation: Option<Arc<dyn ContextTranslationStrategy>>,
    pub prun_pending: Option<Arc<Mutex<Vec<PrunRequest>>>>,
}

Field	Purpose
`model_config`	Required. Complete provider identity: model id, api_key, base_url, api protocol, cost rates, compat flags. The provider is resolved from `model_config.api` via `ProviderRegistry`.
`provider_override`	Custom `Arc<dyn StreamProvider>` — bypasses registry when `Some`. Used for `MockProvider` in tests or fully custom backends.
`config_id`	Optional stable identity for this config; auto-derived as `"{provider_id}.{model_slug}[.thinking]"` when `None`. Used as the middle segment of `loop_id`.
`thinking_level`	`Off`, `Minimal`, `Low`, `Medium`, `High`
`convert_to_llm`	Custom `AgentMessage[] → Message[]` conversion
`transform_context`	Pre-processing hook for context pruning
`get_steering_messages`	Returns user interruptions during tool execution
`get_follow_up_messages`	Returns queued work after agent would stop
`context_config`	Token budget and compaction settings
`execution_limits`	Max turns, tokens, duration
`cache_config`	Prompt caching behavior (see Prompt Caching)
`tool_execution`	Parallel, Sequential, or Batched (see Tools)
`retry_config`	Retry behavior for transient errors (see Retry)
`before_loop`	Called once before `AgentStart`; return `false` to abort the entire run (see Callbacks)
`after_loop`	Called once after `AgentEnd` with all new messages and accumulated usage (see Callbacks)
`before_turn`	Called before each LLM call; return `false` to abort (see Callbacks)
`after_turn`	Called after each turn with messages and usage (see Callbacks)
`on_error`	Called on `StopReason::Error` with the error string (see Callbacks)
`before_tool_execution`	Called before each tool call; return `false` to skip it (see Callbacks)
`after_tool_execution`	Called after each tool call completes (see Callbacks)
`before_tool_execution_update`	Called before each streaming tool update; return `false` to suppress the event (see Callbacks)
`after_tool_execution_update`	Called after each streaming tool update event (see Callbacks)
`before_compaction_start`	Called before compaction starts with `(estimated_tokens, message_count)`; return `false` to skip compaction for this cycle (see Callbacks)
`after_compaction_end`	Called after compaction completes with `(messages_before, messages_after, tokens_before, tokens_after)` (see Callbacks)
`input_filters`	Input filters applied to user messages before the LLM call (see Tools)
`first_turn_trigger`	The `TurnTrigger` for the first `TurnStart` event; defaults to `TurnTrigger::User`, set to `SubAgent` by sub-agent callers
`context_translation`	Optional `ContextTranslationStrategy` for cross-provider compatibility — translates content types (e.g., `Content::Thinking`) when targeting a different provider (G8)
`prun_pending`	Shared state for `PrunTool` to communicate pruning requests to the loop; set automatically by `with_prun_tool()`

0.9.0: async lifecycle hooks. BeforeLoopFn, AfterLoopFn, BeforeTurnFn, AfterTurnFn, OnErrorFn, BeforeToolExecutionFn, AfterToolExecutionFn, BeforeCompactionStartFn, and AfterCompactionEndFn are now async: each returns Pin<Box<dyn Future<Output = T> + Send>> (alias: HookFuture<'_, T>). To migrate a sync closure, wrap its body in Box::pin(async move { ... }). Closures can now .await LLM calls and other async work directly, without a tokio::task::block_in_place bridge.
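The Box::pin migration can be illustrated with a self-contained sketch. Everything here is an assumption made for illustration: the local `HookFuture` alias only mirrors the shape described above, `ToyBeforeTurnFn` is a hypothetical hook type taking a turn index (the real hook signatures are not shown in this section), and the tiny `block_on` exists only so the example runs without an async runtime.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Local alias shaped like the one described above (assumed form).
type HookFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

/// Hypothetical hook type: takes a turn index, resolves to "proceed?".
type ToyBeforeTurnFn = Arc<dyn Fn(usize) -> HookFuture<'static, bool> + Send + Sync>;

/// The original synchronous logic, unchanged by the migration.
fn allow_turn(turn_index: usize) -> bool {
    turn_index < 3
}

/// Migrated hook: the sync body is wrapped in Box::pin(async move { ... }).
fn make_hook() -> ToyBeforeTurnFn {
    Arc::new(|turn_index: usize| -> HookFuture<'static, bool> {
        Box::pin(async move { allow_turn(turn_index) })
    })
}

/// Waker that does nothing; enough for futures that are ready at once.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal polling executor, used here instead of a real runtime.
fn block_on<T>(mut fut: HookFuture<'_, T>) -> T {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

In an application the future would be awaited by the runtime that drives the agent loop; the busy-polling `block_on` is only suitable for futures that complete immediately, as this one does.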

Behaviour-preservation note (phi-core 0.9.0): tool-update hooks stay sync. BeforeToolExecutionUpdateFn remains Arc<dyn Fn(&str, &str, &str) -> bool + Send + Sync>, and AfterToolExecutionUpdateFn remains Arc<dyn Fn(&str, &str, &str) + Send + Sync>. Making them async would cascade into the ToolUpdateFn callback type and every AgentTool::execute body that invokes ctx.on_update(...), a materially wider migration than the 0.9.0 scope. Because the veto decision in BeforeToolExecutionUpdateFn is synchronous, the surrounding emit gate works without an .await at every streamed tool update; consumers that need async work at update time should dispatch it via tokio::spawn(...) inside the sync closure body. This change is tracked under the CHANGELOG [Unreleased] "Forward markers" section for a future release.

InputFilter::filter() is also now an async fn via #[async_trait]; see Tools and the per-turn debug-capture surface at debugging.md.

Steering & Follow-Ups

Steering

Steering messages interrupt the agent between tool executions. When the agent is executing multiple tool calls from a single LLM response, steering is checked after each tool completes. If a steering message is found:

1. The current tool finishes normally.
2. All remaining tool calls are skipped with is_error: true and the message "Skipped due to queued user message".
3. The steering message is injected into context.
4. The loop continues with a new LLM call that sees the interruption.

// While agent is running tools, redirect it:
agent.steer(AgentMessage::Llm(Message::user("Stop that. Instead, explain what you found.")));
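The skip behaviour described above can be modelled in a few lines. This is a simplified sketch, not phi-core's code: `ToolResult` and `run_tools_with_steering` are hypothetical names, tools are plain strings that always succeed, and the steering queue is represented by a closure polled after each tool.

```rust
/// Toy tool result carrying only the fields discussed above.
#[derive(Debug, PartialEq)]
struct ToolResult {
    tool: String,
    is_error: bool,
    output: String,
}

/// Executes `tool_calls` in order and polls for steering after each one.
/// Returns the results plus the steering message, if one arrived.
fn run_tools_with_steering(
    tool_calls: &[&str],
    mut poll_steering: impl FnMut(usize) -> Option<String>,
) -> (Vec<ToolResult>, Option<String>) {
    let mut results = Vec::new();
    for (index, tool) in tool_calls.iter().enumerate() {
        // The current tool always finishes normally.
        results.push(ToolResult {
            tool: tool.to_string(),
            is_error: false,
            output: format!("{} ok", tool),
        });
        // Steering is checked after each tool completes.
        if let Some(message) = poll_steering(index) {
            // Every remaining call is skipped with an error result.
            for skipped in &tool_calls[index + 1..] {
                results.push(ToolResult {
                    tool: skipped.to_string(),
                    is_error: true,
                    output: "Skipped due to queued user message".to_string(),
                });
            }
            return (results, Some(message));
        }
    }
    (results, None)
}
```

The returned steering message is what the loop would inject into context before the next LLM call, so the model sees both the skipped results and the interruption.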

Follow-Ups

Follow-up messages are checked after the agent would normally stop (no more tool calls, no steering). If follow-ups exist, the loop continues with them as new input — the agent doesn't need to be re-prompted.

// Queue work for after the agent finishes its current task:
agent.follow_up(AgentMessage::Llm(Message::user("Now run the tests.")));
agent.follow_up(AgentMessage::Llm(Message::user("Then commit the changes.")));

Queue Modes

Both queues support two delivery modes:

Mode	Behavior
`QueueMode::OneAtATime`	Delivers one message per turn (default)
`QueueMode::All`	Delivers all queued messages at once

agent.set_steering_mode(QueueMode::All);
agent.set_follow_up_mode(QueueMode::OneAtATime);
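The difference between the two modes comes down to how many messages are taken from the queue on each check. A minimal sketch, assuming a plain FIFO queue of strings (`ToyQueueMode` and `take_messages` are hypothetical; phi-core's internal queue type is not shown in this section):

```rust
use std::collections::VecDeque;

/// Toy mirror of the two delivery modes in the table above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ToyQueueMode {
    OneAtATime,
    All,
}

/// Removes and returns the messages to deliver on this check.
fn take_messages(queue: &mut VecDeque<String>, mode: ToyQueueMode) -> Vec<String> {
    match mode {
        // Deliver at most one message; the rest wait for later checks.
        ToyQueueMode::OneAtATime => queue.pop_front().into_iter().collect(),
        // Deliver everything that is queued right now.
        ToyQueueMode::All => queue.drain(..).collect(),
    }
}
```

With `OneAtATime`, three queued follow-ups lead to three separate continuations of the loop; with `All`, the model sees them together in a single turn.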

Queue Management

agent.clear_steering_queue();   // Drop all pending steers
agent.clear_follow_up_queue();  // Drop all pending follow-ups
agent.clear_all_queues();       // Drop everything

Low-Level API

When using agent_loop() directly, steering and follow-ups are provided via callback functions:

let config = AgentLoopConfig {
    get_steering_messages: Some(Box::new(|| {
        // Return Vec<AgentMessage>; checked between tool calls
        vec![]
    })),
    get_follow_up_messages: Some(Box::new(|| {
        // Return Vec<AgentMessage>; checked when agent would stop
        vec![]
    })),
    // ...
};

Custom Compaction

By default, when context exceeds the token budget in ContextConfig, phi-core runs a three-level compaction strategy: truncate tool outputs → summarize old turns → drop middle messages. This is the legacy in-memory path, implemented by compact_messages(). When a Session is available, the modern system uses non-destructive CompactionBlock overlays instead (see compaction). You can replace either path with your own strategy: CompactionStrategy for the in-memory path, BlockCompactionStrategy for the session-backed one.

CompactionStrategy vs BlockCompactionStrategy

CompactionStrategy — Legacy in-memory approach. Destructive: it mutates the message list directly. Used when AgentContext.session is None (no session persistence).

BlockCompactionStrategy — New overlay approach. Non-destructive: it creates a CompactionBlock on the LoopRecord rather than altering the original messages. Used when AgentContext.session is Some (session-backed execution). Original messages remain authoritative for replay and branching.

Example of a custom CompactionStrategy:

use phi_core::context::{CompactionStrategy, ContextConfig, CompactionConfig, compact_messages};
use phi_core::types::*;
use std::sync::Arc;

struct MyCompaction;

impl CompactionStrategy for MyCompaction {
    fn compact(
        &self,
        messages: Vec<AgentMessage>,
        config: &ContextConfig,
    ) -> Vec<AgentMessage> {
        // Your logic here, then optionally delegate to the default:
        compact_messages(messages, config)
    }
}

// Modern pattern: set strategies via ContextConfig.compaction
let context_config = ContextConfig {
    compaction: CompactionConfig {
        // in_memory_strategy: used when AgentContext.session is None (sub-agents, tests)
        in_memory_strategy: Some(Arc::new(MyCompaction)),
        // block_strategy: used when AgentContext.session is Some (session-backed execution)
        // block_strategy: Some(Arc::new(MyBlockCompaction)),
        ..CompactionConfig::default()
    },
    ..ContextConfig::default()
};

let agent = BasicAgent::new(model_config)
    .with_context_config(context_config);

The in-memory strategy is called once per turn, right before the LLM call, whenever context_config is Some and AgentContext.session is None. When in_memory_strategy is None, DefaultCompaction (which wraps compact_messages()) is used automatically. When a session is present, block_strategy is used instead (defaulting to DefaultBlockCompaction).
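The selection rules in the paragraph above amount to a small decision table. The sketch below encodes them with toy types: `ToyCompactionConfig` and `select_strategy` are hypothetical, strategies are represented by name only, and treating a missing context config as "no compaction" for the session-backed path as well is an assumption.

```rust
/// Toy configuration: strategies are represented by name only.
struct ToyCompactionConfig {
    in_memory_strategy: Option<String>,
    block_strategy: Option<String>,
}

/// Returns the name of the strategy that would run, or None when no
/// context config is set (assumed to mean compaction does not run).
fn select_strategy(
    context_config: Option<&ToyCompactionConfig>,
    session_present: bool,
) -> Option<String> {
    let config = context_config?;
    let chosen = if session_present {
        // Session-backed execution: block strategy, with its default.
        config
            .block_strategy
            .clone()
            .unwrap_or_else(|| "DefaultBlockCompaction".to_string())
    } else {
        // No session: in-memory strategy, with its default.
        config
            .in_memory_strategy
            .clone()
            .unwrap_or_else(|| "DefaultCompaction".to_string())
    };
    Some(chosen)
}
```

Note that a custom in-memory strategy has no effect once a session is attached; a session-backed agent needs a block strategy to customise compaction.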

Use Cases

Memory-aware compaction — Index messages into a vector store before they're dropped, so the agent can recall them later via a search tool:

struct MemoryAwareCompaction {
    memory: Arc<dyn MemoryStore>,
}

impl CompactionStrategy for MemoryAwareCompaction {
    fn compact(
        &self,
        messages: Vec<AgentMessage>,
        config: &ContextConfig,
    ) -> Vec<AgentMessage> {
        let compacted = compact_messages(messages.clone(), config);

        // Index what was dropped
        let dropped: Vec<_> = messages.iter()
            .filter(|m| !compacted.contains(m))
            .collect();
        if !dropped.is_empty() {
            self.memory.index(dropped);
        }

        compacted
    }
}

Semantic pointer compaction — Replace dropped messages with a marker so the agent knows context was lost:

struct SemanticPointerCompaction;

impl CompactionStrategy for SemanticPointerCompaction {
    fn compact(
        &self,
        messages: Vec<AgentMessage>,
        config: &ContextConfig,
    ) -> Vec<AgentMessage> {
        // Record the length up front so the list can be moved, not cloned.
        let original_len = messages.len();
        let compacted = compact_messages(messages, config);
        // saturating_sub avoids an underflow panic if compaction returns
        // more messages than it received.
        let dropped_count = original_len.saturating_sub(compacted.len());

        if dropped_count == 0 {
            return compacted;
        }

        // Insert a marker after the first kept messages
        let mut result = compacted;
        let insert_at = config.compaction.keep_first_turns.min(result.len());
        result.insert(insert_at, AgentMessage::Extension(
            ExtensionMessage::new("compaction_marker", serde_json::json!({
                "dropped": dropped_count,
                "note": format!("{} earlier messages were compacted", dropped_count),
            }))
        ));
        result
    }
}

Priority-preserving compaction — Never drop messages containing important keywords:

struct PriorityPreservingCompaction {
    preserve_keywords: Vec<String>,
}

impl CompactionStrategy for PriorityPreservingCompaction {
    fn compact(
        &self,
        messages: Vec<AgentMessage>,
        config: &ContextConfig,
    ) -> Vec<AgentMessage> {
        // is_priority() is a helper you define on this struct (not shown);
        // it would check a message against preserve_keywords.
        let (priority, normal): (Vec<_>, Vec<_>) = messages.into_iter()
            .partition(|m| self.is_priority(m));

        let mut compacted = compact_messages(normal, config);

        // Re-insert priority messages so they are never dropped.
        // Note: appending moves them to the end of the list; track original
        // positions if chronological order matters.
        compacted.extend(priority);
        compacted
    }
}
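The example above appends priority messages to the end of the list, which changes their position in the conversation. One way to keep chronological order is to tag each message with its original index before partitioning and sort afterwards. The sketch below demonstrates the idea with toy types only: `ToyMessage`, `toy_compact` (a stand-in that keeps the last `keep` messages), and `ToyPriorityCompaction` are hypothetical and do not use phi-core's API.

```rust
#[derive(Debug, Clone, PartialEq)]
struct ToyMessage {
    text: String,
}

/// Hypothetical stand-in for compact_messages(): keeps the last `keep` items.
fn toy_compact(messages: Vec<(usize, ToyMessage)>, keep: usize) -> Vec<(usize, ToyMessage)> {
    let skip = messages.len().saturating_sub(keep);
    messages.into_iter().skip(skip).collect()
}

struct ToyPriorityCompaction {
    preserve_keywords: Vec<String>,
}

impl ToyPriorityCompaction {
    /// Case-insensitive keyword match against the message text.
    fn is_priority(&self, message: &ToyMessage) -> bool {
        let text = message.text.to_lowercase();
        self.preserve_keywords
            .iter()
            .any(|keyword| text.contains(&keyword.to_lowercase()))
    }

    fn compact(&self, messages: Vec<ToyMessage>, keep: usize) -> Vec<ToyMessage> {
        // Tag each message with its original position, then split.
        let (priority, normal): (Vec<_>, Vec<_>) = messages
            .into_iter()
            .enumerate()
            .partition(|(_, m)| self.is_priority(m));

        // Compact only the normal messages, then merge priority ones back.
        let mut merged = toy_compact(normal, keep);
        merged.extend(priority);

        // Restore chronological order using the saved indices.
        merged.sort_by_key(|(index, _)| *index);
        merged.into_iter().map(|(_, m)| m).collect()
    }
}
```

The index tagging costs one extra `usize` per message and keeps the strategy independent of how the underlying compaction chooses what to drop.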

Evaluational Parallelism

agent_loop_parallel runs the same prompt through multiple AgentLoopConfigs concurrently, evaluates the results with a pluggable EvaluationStrategy, and returns the winning branch. This is useful for multi-model comparison, A/B prompt testing, and selecting the best response among different reasoning approaches.

use phi_core::{agent_loop_parallel, PickFirstEvaluation, AgentContext, AgentLoopConfig};
use std::sync::Arc;

let result = agent_loop_parallel(
    prompts,
    base_context,           // cloned per branch; Arc tools shared
    vec![config_a, config_b],
    Arc::new(PickFirstEvaluation),
    tx,
    cancel,
).await;

// result.selected_context feeds directly into agent_loop_continue()
// result.selected_messages is the winning branch's output

See Evaluational Parallelism for the full guide including built-in strategies, the LLM judge, and session continuity.
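The fan-out, evaluate, select pattern can be sketched with standard-library threads. This is a conceptual model only: `BranchResult`, `ToyEvaluation`, `PickFirst`, `PickLongest`, `run_branch`, and `toy_parallel` are hypothetical names, each "branch" fabricates a reply instead of calling a model, and OS threads stand in for the async tasks a real implementation would use.

```rust
use std::thread;

/// Output of one branch in the toy model.
#[derive(Debug, Clone, PartialEq)]
struct BranchResult {
    config_name: String,
    messages: Vec<String>,
}

/// Pluggable selection rule: returns the index of the winning branch.
trait ToyEvaluation {
    fn select(&self, branches: &[BranchResult]) -> usize;
}

/// Always picks the first branch.
struct PickFirst;

impl ToyEvaluation for PickFirst {
    fn select(&self, _branches: &[BranchResult]) -> usize {
        0
    }
}

/// Picks the branch with the most output text.
struct PickLongest;

impl ToyEvaluation for PickLongest {
    fn select(&self, branches: &[BranchResult]) -> usize {
        branches
            .iter()
            .enumerate()
            .max_by_key(|(_, b)| b.messages.iter().map(|m| m.len()).sum::<usize>())
            .map(|(index, _)| index)
            .unwrap_or(0)
    }
}

/// Stand-in for running one agent loop with one config.
fn run_branch(config_name: &str, prompt: &str) -> BranchResult {
    BranchResult {
        config_name: config_name.to_string(),
        messages: vec![format!("[{}] reply to: {}", config_name, prompt)],
    }
}

/// Runs every config on its own thread, then lets `eval` pick the winner.
fn toy_parallel(prompt: &str, configs: &[&str], eval: &dyn ToyEvaluation) -> Option<BranchResult> {
    let branches: Vec<BranchResult> = thread::scope(|scope| {
        let handles: Vec<_> = configs
            .iter()
            .map(|name| scope.spawn(move || run_branch(name, prompt)))
            .collect();
        handles
            .into_iter()
            .map(|handle| handle.join().expect("branch panicked"))
            .collect()
    });
    let winner = eval.select(&branches);
    branches.get(winner).cloned()
}
```

Because results are collected in config order before evaluation, the selection is deterministic regardless of which thread finishes first.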
