> ## Documentation Index
> Fetch the complete documentation index at: https://polos.dev/docs/llms.txt
> Use this file to discover all available pages before exploring further.

# Agents

Agents use LLMs to reason about tasks and autonomously decide which actions to take. In Polos, agents are **durable** - they survive failures and resume exactly where they stopped.

<CodeGroup>
  ```python theme={null}
  import asyncio
  from polos import PolosClient, Agent, tool, WorkflowContext
  from pydantic import BaseModel

  class WeatherInput(BaseModel):
      city: str

  @tool(description="Get weather for a city")
  async def get_weather(ctx: WorkflowContext, input: WeatherInput):
      return await weather_api.get(input.city)

  weather_agent = Agent(
      id="weather-agent",
      provider="openai",
      model="gpt-4o",
      system_prompt="You are a helpful weather assistant.",
      tools=[get_weather]
  )

  async def main():
      client = PolosClient()
      # Run the agent
      response = await weather_agent.run(client, "What's the weather in NYC?")
      print(response.result)

  if __name__ == "__main__":
      asyncio.run(main())
  ```

  ```typescript theme={null}
  import { PolosClient, defineAgent, defineTool } from "@polos/sdk";
  import { openai } from "@ai-sdk/openai";
  import { z } from "zod";

  const weatherInputSchema = z.object({
    city: z.string(),
  });

  const getWeather = defineTool(
    {
      id: "get-weather",
      description: "Get weather for a city",
      inputSchema: weatherInputSchema,
    },
    async (ctx, input) => {
      return await weatherApi.get(input.city);
    }
  );

  const weatherAgent = defineAgent({
    id: "weather-agent",
    model: openai("gpt-4o"),
    systemPrompt: "You are a helpful weather assistant.",
    tools: [getWeather],
  });

  async function main() {
    const client = PolosClient.fromEnv();
    // Run the agent
    const response = await weatherAgent.run(client, "What's the weather in NYC?");
    console.log(response.result);
  }

  main();
  ```
</CodeGroup>

That's it. Your agent automatically:

* Calls tools when needed
* Survives crashes and resumes mid-reasoning
* Maintains conversation history
* Prevents duplicate API calls

## How agents work

When you run an agent:

1. **LLM reasons about the task** - The agent analyzes your request and decides what to do
2. **Calls tools if needed** - If the agent needs information or wants to take action, it calls the appropriate tools
3. **Iterates until complete** - The agent continues reasoning and calling tools until it has a final answer or hits a stop condition
4. **Returns the result** - You get the final response

Agents are durable. If your agent crashes mid-execution (say, after calling the weather API but before responding), Polos automatically resumes it from where it stopped. No duplicate API calls, saving you tokens and cost.

Under the hood, agents are built on Polos [workflows](/fundamentals/workflows) with automatic state persistence. Learn more about [how durability works here](/fundamentals/durable-execution).

## Running agents

### Direct execution

Use `agent.run()` to generate complete response from LLM.

<CodeGroup>
  ```python theme={null}
  response = await weather_agent.run(
      client,
      "Compare the weather in NYC and London",
      reasoning={"effort": "medium"}
  )

  print(response.result)
  ```

  ```typescript theme={null}
  const response = await weatherAgent.run(
    client,
    "Compare the weather in NYC and London",
    { reasoning: { effort: "medium" } }
  );

  console.log(response.result);
  ```
</CodeGroup>

The agent:

* Calls LLM with the user input
* Executes tool calls suggested by the LLM - in this case, `get_weather` for NYC and `get_weather` for London
* Calls LLM with the results (or errors) of the tool calls
* Returns the final LLM response if no more tool calls are needed

### Streaming responses

Stream responses for real-time user experience:

<CodeGroup>
  ```python theme={null}
  result = await weather_agent.stream(client, "What's the weather in Tokyo?")

  # Stream text chunks as they arrive
  async for chunk in result.text_chunks:
      print(chunk, end="", flush=True)
  ```

  ```typescript theme={null}
  const result = await weatherAgent.stream(client, "What's the weather in Tokyo?");

  // Stream text chunks as they arrive
  for await (const chunk of result.textChunks) {
    process.stdout.write(chunk);
  }
  ```
</CodeGroup>

## Tools

Tools give agents the ability to take actions. Define them with the `@tool` decorator:

<CodeGroup>
  ```python theme={null}
  from polos import tool, WorkflowContext
  from pydantic import BaseModel
  from typing import List

  class SearchInput(BaseModel):
      query: str

  class SearchOutput(BaseModel):
      results: List[str]

  class EmailInput(BaseModel):
      to: str
      subject: str
      body: str

  class EmailOutput(BaseModel):
      status: str

  @tool(description="Search the web")
  async def search_web(ctx: WorkflowContext, input: SearchInput) -> SearchOutput:
      results = await search_api.query(input.query)
      return SearchOutput(results=results[:5])

  @tool(description="Send an email")
  async def send_email(ctx: WorkflowContext, input: EmailInput) -> EmailOutput:
      await email_service.send(
          to=input.to,
          subject=input.subject,
          body=input.body
      )
      return EmailOutput(status="sent")

  research_agent = Agent(
      id="research-agent",
      provider="anthropic",
      model="claude-sonnet-4",
      system_prompt="You are a research assistant. Search for information and email summaries.",
      tools=[search_web, send_email]
  )
  ```

  ```typescript theme={null}
  import { defineAgent, defineTool } from "@polos/sdk";
  import { anthropic } from "@ai-sdk/anthropic";
  import { z } from "zod";

  const searchInputSchema = z.object({
    query: z.string(),
  });

  const searchOutputSchema = z.object({
    results: z.array(z.string()),
  });

  const emailInputSchema = z.object({
    to: z.string(),
    subject: z.string(),
    body: z.string(),
  });

  const emailOutputSchema = z.object({
    status: z.string(),
  });

  const searchWeb = defineTool(
    {
      id: "search-web",
      description: "Search the web",
      inputSchema: searchInputSchema,
      outputSchema: searchOutputSchema,
    },
    async (ctx, input) => {
      const results = await searchApi.query(input.query);
      return { results: results.slice(0, 5) };
    }
  );

  const sendEmail = defineTool(
    {
      id: "send-email",
      description: "Send an email",
      inputSchema: emailInputSchema,
      outputSchema: emailOutputSchema,
    },
    async (ctx, input) => {
      await emailService.send({
        to: input.to,
        subject: input.subject,
        body: input.body,
      });
      return { status: "sent" };
    }
  );

  const researchAgent = defineAgent({
    id: "research-agent",
    model: anthropic("claude-sonnet-4"),
    systemPrompt: "You are a research assistant. Search for information and email summaries.",
    tools: [searchWeb, sendEmail],
  });
  ```
</CodeGroup>

The LLM sees each tool's description and function signature, then decides when to call them based on the user's request.

Tools are durable (under the hood, they are workflows) - if an agent crashes after calling a tool, the tool result is cached. On resume, the agent doesn't re-execute the tool; it uses the cached result.

## Sandbox tools

Give agents the ability to write code, run shell commands, and explore a codebase inside an isolated environment. A single `sandboxTools()` call creates six tools (`exec`, `read`, `write`, `edit`, `glob`, `grep`):

<CodeGroup>
  ```python theme={null}
  from polos import Agent, sandbox_tools, SandboxToolsConfig, DockerEnvironmentConfig

  tools = sandbox_tools(SandboxToolsConfig(
      env="docker",
      scope="session",
      docker=DockerEnvironmentConfig(image="node:20-slim"),
  ))

  coding_agent = Agent(
      id="coding_agent",
      provider="anthropic",
      model="claude-opus-4-5",
      system_prompt="You are a coding assistant.",
      tools=tools,
  )
  ```

  ```typescript theme={null}
  import { defineAgent, sandboxTools } from '@polos/sdk';
  import { anthropic } from '@ai-sdk/anthropic';

  const tools = sandboxTools({
    env: 'docker',
    scope: 'session',
    docker: { image: 'node:20-slim' },
  });

  const codingAgent = defineAgent({
    id: 'coding_agent',
    model: anthropic('claude-opus-4-5'),
    systemPrompt: 'You are a coding assistant.',
    tools: [...tools],
  });
  ```
</CodeGroup>

Use `env: 'docker'` for isolated container execution, or `env: 'local'` to run directly on the host (with approval-based security by default). See [Sandbox](/agents/sandbox) for the full reference.

## Triggering agents from Slack

Agents can be triggered directly from Slack by @mentioning your bot. The output streams back to the originating thread:

```
@polos @coding_agent Build a REST API with Express and SQLite
```

When the agent suspends for approval (e.g., before running a shell command), you'll see the approval message in the same Slack thread. See [Slack Integration](/agents/slack-integration) for setup instructions.

## Structured outputs

Instead of natural language, agents can return structured data:

<CodeGroup>
  ```python theme={null}
  from pydantic import BaseModel, Field
  from polos import PolosClient, Agent

  class PersonInfo(BaseModel):
      name: str = Field(description="Full name")
      age: int = Field(description="Age in years", ge=0, le=130)
      email: str = Field(description="Email address")
      location: str = Field(description="City")

  person_extractor = Agent(
      id="person-extractor",
      provider="openai",
      model="gpt-4o",
      system_prompt="Extract person information from text.",
      output_schema=PersonInfo
  )

  async def main():
      client = PolosClient()
      response = await person_extractor.run(
          client, "Hi, I'm Alice, 28 years old, living in SF. Email: alice@example.com"
      )

      # response.result is a PersonInfo object
      print(response.result.name)      # "Alice"
      print(response.result.age)       # 28
      print(response.result.location)  # "SF"

  if __name__ == "__main__":
      asyncio.run(main())
  ```

  ```typescript theme={null}
  import { PolosClient, defineAgent } from "@polos/sdk";
  import { openai } from "@ai-sdk/openai";
  import { z } from "zod";

  const personInfoSchema = z.object({
    name: z.string().describe("Full name"),
    age: z.number().int().min(0).max(130).describe("Age in years"),
    email: z.string().describe("Email address"),
    location: z.string().describe("City"),
  });

  const personExtractor = defineAgent({
    id: "person-extractor",
    model: openai("gpt-4o"),
    systemPrompt: "Extract person information from text.",
    outputSchema: personInfoSchema,
  });

  async function main() {
    const client = PolosClient.fromEnv();
    const response = await personExtractor.run(
      client,
      "Hi, I'm Alice, 28 years old, living in SF. Email: alice@example.com"
    );

    // response.result is typed from the Zod schema
    console.log(response.result.name);     // "Alice"
    console.log(response.result.age);      // 28
    console.log(response.result.location); // "SF"
  }

  main();
  ```
</CodeGroup>

Perfect for data extraction, form processing, or building structured APIs.

## Stop conditions

Control when an agent stops executing to prevent runaway costs or infinite loops:

<CodeGroup>
  ```python theme={null}
  from polos import max_steps, max_tokens, MaxStepsConfig, MaxTokensConfig

  research_agent = Agent(
      id="research-agent",
      provider="openai",
      model="gpt-4o",
      system_prompt="Research topics thoroughly.",
      tools=[search_web, read_article],
      stop_conditions=[
          max_steps(MaxStepsConfig(limit=15)),        # Stop after 15 LLM calls
          max_tokens(MaxTokensConfig(limit=50000)),   # Stop if tokens exceed 50k
      ]
  )
  ```

  ```typescript theme={null}
  import { defineAgent, maxSteps, maxTokens } from "@polos/sdk";
  import { openai } from "@ai-sdk/openai";

  const researchAgent = defineAgent({
    id: "research-agent",
    model: openai("gpt-4o"),
    systemPrompt: "Research topics thoroughly.",
    tools: [searchWeb, readArticle],
    stopConditions: [
      maxSteps({ count: 15 }),    // Stop after 15 LLM calls
      maxTokens({ limit: 50000 }), // Stop if tokens exceed 50k
    ],
  });
  ```
</CodeGroup>

Here, we are using **built-in stop conditions:**

* `max_steps` - Limit reasoning iterations
* `max_tokens` - Cap total token usage (input + output)

You can also create custom stop conditions for specific needs (e.g., stop when certain tools are called, or when specific keywords appear).

## Conversational memory

Agents automatically maintain conversation history:

<CodeGroup>
  ```python theme={null}
  conversation_id = uuid.uuid4()

  # First message
  response1 = await chat_agent.run(
      client, "What's the weather in NYC?", conversation_id=conversation_id
  )

  # Follow-up (agent remembers context)
  response2 = await chat_agent.run(
      client, "How about tomorrow?", conversation_id=conversation_id
  )
  # Agent knows we're still talking about NYC
  ```

  ```typescript theme={null}
  import { v4 as uuidv4 } from "uuid";

  const conversationId = uuidv4();

  // First message
  const response1 = await chatAgent.run(
    client,
    "What's the weather in NYC?",
    { conversationId }
  );

  // Follow-up (agent remembers context)
  const response2 = await chatAgent.run(
    client,
    "How about tomorrow?",
    { conversationId }
  );
  // Agent knows we're still talking about NYC
  ```
</CodeGroup>

Conversation history is durable - if the agent crashes, it resumes with complete context.

## Using agents in workflows

Agents are workflows, so you can compose them with other workflows:

<CodeGroup>
  ```python theme={null}
  from polos import workflow, WorkflowContext

  @workflow
  async def customer_support(ctx: WorkflowContext, input: CustomerSupportInput):
      # Agent handles the customer query
      response = await ctx.step.agent_invoke_and_wait(
          "customer_support_agent", # step key
          customer_support_agent.with_input(input.question)
      )

      # Update your customer support software with the interaction
      await ctx.step.run("log", log_interaction, response)

      # Send follow-up email
      await ctx.step.run("email", send_followup, input.customer_email, response)

      return response
  ```

  ```typescript theme={null}
  import { defineWorkflow } from "@polos/sdk";

  const customerSupport = defineWorkflow<CustomerSupportInput, void, string>(
    { id: "customer-support" },
    async (ctx, input) => {
      // Agent handles the customer query
      const response = await ctx.step.agentInvokeAndWait(
        "customer_support_agent", // step key
        customerSupportAgent.withInput(input.question)
      );

      // Update your customer support software with the interaction
      await ctx.step.run("log", logInteraction, response);

      // Send follow-up email
      await ctx.step.run("email", sendFollowup, input.customerEmail, response);

      return response;
    }
  );
  ```
</CodeGroup>

## Human-in-the-loop

Combine agents with approval gates for sensitive operations:

<CodeGroup>
  ```python theme={null}
  @workflow
  async def approval_workflow(ctx: WorkflowContext, input: dict):
      # Agent generates a plan
      plan = await ctx.step.agent_invoke_and_wait(
          "generate_plan",
          planning_agent.with_input(input.task)
      )

      # Suspend the workflow and wait for human approval
      resume_data = await ctx.step.suspend(
          "suspend_step",
          data={"plan": plan}
      )

      # Resumes here when the decision is received
      decision = resume_data.get("data", {})
      if decision.get("approved"):
          # Execute the approved plan
          result = await ctx.step.agent_invoke_and_wait(
              "execute", executor_agent.with_input(plan)
          )
          return result
      else:
          return None
  ```

  ```typescript theme={null}
  import { defineWorkflow } from "@polos/sdk";

  const approvalWorkflow = defineWorkflow<{ task: string }, void, string | null>(
    { id: "approval-workflow" },
    async (ctx, input) => {
      // Agent generates a plan
      const plan = await ctx.step.agentInvokeAndWait(
        "generate_plan",
        planningAgent.withInput(input.task)
      );

      // Suspend the workflow and wait for human approval
      const resumeData = await ctx.step.suspend(
        "suspend_step",
        { data: { plan } }
      );

      // Resumes here when the decision is received
      const decision = resumeData?.data ?? {};
      if (decision.approved) {
        // Execute the approved plan
        const result = await ctx.step.agentInvokeAndWait(
          "execute",
          executorAgent.withInput(plan)
        );
        return result;
      } else {
        return null;
      }
    }
  );
  ```
</CodeGroup>

## Key takeaways

* Agents handle LLM reasoning automatically - you just define tools and let them work
* Run with `agent.run()` or stream with `agent.stream()`
* Tools (defined with `@tool`) give agents the ability to act
* Agents are durable - they survive crashes and resume from the last completed step
* Use structured outputs for reliable data extraction
* Stop conditions control execution and prevent runaway costs
* Conversational memory maintained automatically
* Compose agents in workflows for complex multi-step tasks
* Sandbox tools let agents write and execute code in isolated environments
* Trigger agents from Slack with @mentions

## Learn more

* **[Agent Guide](/agents/overview)** – Advanced agent patterns and techniques
* **[Sandbox](/agents/sandbox)** – Isolated execution environments
* **[Slack Integration](/agents/slack-integration)** – Trigger agents from Slack
* **[Examples](/guides/cookbook-examples)** – Real-world agent implementations