Tools give agents the ability to take actions - calling APIs, querying databases, sending emails, or any other operation your agent needs to perform.

Defining tools

Tools are defined using the @tool decorator. Each tool needs:
  • A description (guides the LLM on when to use it)
  • Pydantic models for input and output
  • An async function that performs the action
from polos import tool, WorkflowContext
from pydantic import BaseModel, Field

class WeatherInput(BaseModel):
    city: str = Field(description="City name")
    unit: str = Field(default="celsius", description="Temperature unit")

class WeatherOutput(BaseModel):
    city: str
    temperature: float
    condition: str
    humidity: int

@tool(description="Get current weather for a city")
async def get_weather(ctx: WorkflowContext, input: WeatherInput) -> WeatherOutput:
    # Call the weather API (weather_api is a placeholder for your client or SDK)
    response = await weather_api.get(
        city=input.city,
        unit=input.unit
    )

    return WeatherOutput(
        city=input.city,
        temperature=response.temp,
        condition=response.conditions,
        humidity=response.humidity
    )

Why Pydantic models?

Input models:
  • Define the schema the LLM sees
  • Validate tool inputs automatically
  • Provide clear descriptions via Field(description=...)
Output models:
  • Ensure type safety
  • Make outputs JSON-serializable for durability
  • Document what the tool returns
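
You can inspect the JSON schema generated from an input model with Pydantic's model_json_schema() - a useful sanity check, since this is the shape the LLM is shown (exactly how Polos renders it to the provider is framework-specific):
# Pydantic v2: inspect the JSON schema derived from the input model above
print(WeatherInput.model_json_schema())
# -> {'properties': {'city': {'description': 'City name', ...}, 'unit': {...}}, 'required': ['city'], ...}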

Using tools with agents

Add tools to an agent’s tools parameter:
from polos import Agent, PolosClient

weather_agent = Agent(
    id="weather-agent",
    provider="openai",
    model="gpt-4o",
    system_prompt="You are a weather assistant. Use tools to answer weather questions.",
    tools=[get_weather]
)

client = PolosClient()
response = await weather_agent.run(client, "What's the weather in Tokyo?")
What happens:
  1. LLM analyzes the request
  2. LLM decides to call get_weather with city="Tokyo"
  3. Agent automatically executes the tool
  4. Agent feeds the tool output back to the LLM
  5. LLM generates a natural language response
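
Conceptually, the agent runs a loop like the sketch below (illustrative only - llm, reply.tool_calls, and the message format are stand-ins, not Polos internals):
# Illustrative sketch of a tool-calling loop; not Polos's actual implementation
async def tool_loop(llm, tools, messages):
    while True:
        reply = await llm.chat(messages, tools=tools)  # hypothetical LLM client
        if not reply.tool_calls:
            return reply.text  # no tool requested: this is the final answer
        for call in reply.tool_calls:
            result = await tools[call.name](call.args)  # execute the requested tool
            messages.append({"role": "tool", "name": call.name, "content": result})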

Tools are workflows

Under the hood, tools are workflows. This means:
  • Tools are durable - If a tool fails mid-execution, it resumes from the last completed step
  • Tool results are cached - On agent replay, completed tools return cached results (no re-execution)
  • Tools can use workflow features - Call other workflows, wait for events, use steps
@tool(description="Process order and send confirmation")
async def process_order(ctx: WorkflowContext, input: OrderInput) -> OrderOutput:
    # Tool can use workflow steps
    order = await ctx.step.run("create_order", create_order, input)

    # Tool can call other workflows
    payment = await ctx.step.invoke_and_wait(
        "process_payment",
        payment_workflow,
        {"order_id": order.id}
    )

    # Tool can wait
    await ctx.step.wait_for("delay", seconds=5)

    return OrderOutput(order_id=order.id, status="completed")

Tool approval

Tools can require human approval before executing. When a tool has approval="always", Polos suspends the workflow before running the tool, presents an approval form to the user, and only proceeds if the user approves. If the user rejects, the agent receives an error with optional feedback and can adjust its approach.
@tool(description="Send an email", approval="always")
async def send_email(ctx: WorkflowContext, input: EmailInput) -> EmailOutput:
    await email_service.send(to=input.to, subject=input.subject, body=input.body)
    return EmailOutput(status="sent")

Approval values

Value               Behavior
"always"            Suspends before every call. The user sees the tool name and input, and can approve or reject with feedback.
"none"              No approval required. The tool runs immediately.
Not set (default)   Same as "none" - no approval required.

What happens during approval

When a tool with approval="always" is called:
  1. The workflow suspends with a _form containing the tool name and input
  2. The user is prompted to approve or reject the call
  3. If approved, the tool executes normally
  4. If rejected, the agent receives an error (Tool "send_email" was rejected by the user.) along with any feedback the user provided
  5. The agent can read the feedback and try a different approach

Sandbox tool approval

Sandbox tools have additional approval mechanisms beyond the approval parameter:
  • Exec security controls command execution via security mode ("approval-always", "allowlist", or "allow-always"). See Exec Security.
  • File approval controls write/edit operations via file_approval / fileApproval ("always" or "none"). Defaults to "always" in local mode.
  • Path restriction controls read-only tools (read, glob, grep) accessing files outside the workspace - these suspend for approval automatically.
See Sandbox Tools and Local Sandbox for details.

Error handling

When tools encounter errors, the agent feeds the error back to the LLM so it can correct mistakes or try alternative approaches.
class CalculatorInput(BaseModel):
    expression: str

class CalculatorOutput(BaseModel):
    result: float
    error: str | None = None

@tool(description="Evaluate mathematical expressions")
async def calculator(ctx: WorkflowContext, input: CalculatorInput) -> CalculatorOutput:
    # eval raises on invalid expressions (e.g. ZeroDivisionError); the agent
    # catches the exception and feeds it back to the LLM. Note that eval is
    # unsafe on untrusted input - use a real expression parser in production.
    result = eval(input.expression)
    return CalculatorOutput(result=result, error=None)
Example flow:
User: "What's 10 divided by zero?"

LLM → calculator(expression="10 / 0")
Tool → raises "ZeroDivisionError: division by zero"
Agent → sends error back to the LLM
LLM → "I apologize, but division by zero is undefined in mathematics..."
The LLM automatically:
  • Sees the error message
  • Understands what went wrong
  • Can retry with corrected inputs
  • Or explain the limitation to the user
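
Note that CalculatorOutput also has an error field, so you can return failures as structured data instead of raising - a minimal sketch:
@tool(description="Evaluate mathematical expressions")
async def calculator(ctx: WorkflowContext, input: CalculatorInput) -> CalculatorOutput:
    try:
        result = eval(input.expression)
    except Exception as e:
        # Return the error as data; the LLM sees it either way
        return CalculatorOutput(result=0.0, error=str(e))
    return CalculatorOutput(result=result, error=None)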

Multiple tools

Agents can use multiple tools in a single conversation:
class SearchInput(BaseModel):
    query: str

class SearchOutput(BaseModel):
    results: list[str]

class EmailInput(BaseModel):
    to: str
    subject: str
    body: str

class EmailOutput(BaseModel):
    status: str

@tool(description="Search the web")
async def search_web(ctx: WorkflowContext, input: SearchInput) -> SearchOutput:
    results = await search_api.query(input.query)
    return SearchOutput(results=results[:5])

@tool(description="Send an email")
async def send_email(ctx: WorkflowContext, input: EmailInput) -> EmailOutput:
    await email_service.send(
        to=input.to,
        subject=input.subject,
        body=input.body
    )
    return EmailOutput(status="sent")

assistant = Agent(
    id="assistant",
    provider="anthropic",
    model="claude-sonnet-4-5",
    system_prompt="You are a helpful assistant. Search for information and send emails when needed.",
    tools=[search_web, send_email]
)

client = PolosClient()
response = await assistant.run(
    client,
    "Search for Python tutorials and email the top 3 results to alice@example.com"
)
The agent will:
  1. Call search_web("Python tutorials")
  2. Process the results
  3. Call send_email(to="alice@example.com", subject="...", body="...")
  4. Return confirmation

Tool descriptions

Write clear tool descriptions to help the LLM understand when to use each tool:
# ❌ BAD: Vague description
@tool(description="Does stuff with data")
async def process_data(ctx, input):
    ...

# ✅ GOOD: Clear, specific description
@tool(description="Analyze CSV data and return summary statistics including mean, median, and standard deviation")
async def analyze_csv(ctx: WorkflowContext, input: AnalyzeInput) -> AnalyzeOutput:
    ...

# ✅ GOOD: Includes usage guidance
@tool(description="Send an email. Use this when the user explicitly asks to send an email or notify someone.")
async def send_email(ctx: WorkflowContext, input: EmailInput) -> EmailOutput:
    ...
Best practices:
  • Describe what the tool does
  • Mention when it should be used
  • Include key parameters in the description
  • Use active voice (“Get weather data” not “This gets weather data”)

Field descriptions

Use Pydantic Field to document input parameters:
from pydantic import BaseModel, Field

class SearchInput(BaseModel):
    query: str = Field(description="The search query string")
    max_results: int = Field(
        default=10,
        description="Maximum number of results to return (1-100)"
    )
    filter_by_date: bool = Field(
        default=False,
        description="If true, only return results from the last 30 days"
    )

@tool(description="Search the web for information")
async def search(ctx: WorkflowContext, input: SearchInput) -> SearchOutput:
    ...
The LLM sees these descriptions and uses them to construct correct tool calls.
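
To enforce a documented range rather than just describing it, Pydantic's numeric constraints (ge/le) reject out-of-range values at validation time - a sketch extending the example above:
class SearchInput(BaseModel):
    query: str = Field(description="The search query string")
    max_results: int = Field(
        default=10,
        ge=1,   # reject values below 1...
        le=100, # ...or above 100 before the tool runs
        description="Maximum number of results to return (1-100)"
    )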

Tool durability in practice

Because tools are workflows, they benefit from durability:
@tool(description="Process large dataset")
async def process_dataset(ctx: WorkflowContext, input: DatasetInput) -> DatasetOutput:
    # Step 1: Download data (durable)
    data = await ctx.step.run("download", download_data, input.url)

    # Step 2: Transform data (durable)
    transformed = await ctx.step.run("transform", transform_data, data)

    # Step 3: Upload results (durable)
    url = await ctx.step.run("upload", upload_results, transformed)

    return DatasetOutput(result_url=url)
If this tool crashes after downloading but before transforming:
  • On retry, download returns cached data (no re-download)
  • transform executes for the first time
  • Agent continues without losing progress

Key takeaways

  • Tools are workflows - They’re durable and can use all workflow features
  • Use Pydantic models for input and output
  • Errors are feedback - Tools return errors to the LLM, which can correct and retry
  • Write clear descriptions - Help the LLM understand when and how to use tools
  • Use approval="always" for sensitive operations - Require human approval before execution