AI agents are useful. They’re also not free. Every task an agent performs has a cost, split across three categories: LLM tokens, tool calls, and infrastructure. Understanding these costs is the first step to controlling them.

Here’s what agent workloads cost in early 2026, with concrete numbers and practical strategies for keeping spending under control.

LLM Token Costs

The LLM is usually the largest cost component. Every prompt, every tool result that goes into context, and every generated response is billed in tokens.

Current pricing (approximate, per million tokens):

| Model | Input | Output |
| --- | --- | --- |
| Claude Sonnet 4 | $3 | $15 |
| Claude Opus 4 | $15 | $75 |
| GPT-4.1 | $2 | $8 |
| GPT-4.1 mini | $0.40 | $1.60 |
| Gemini 2.5 Pro | $1.25-2.50 | $10-15 |

A typical agent task (reason about a problem, call a few tools, read the results, generate a final response) uses between 5,000 and 50,000 tokens in total, depending on its complexity and the number of tool calls.

Example: A research task. The agent receives a 200-token prompt, calls 3 tools that each return 1,000 tokens of results, reasons about the results (2,000 tokens of internal context), and generates a 500-token response. Total: roughly 6,000 tokens. At Claude Sonnet 4 pricing, that’s about $0.03.

Example: A complex multi-step task. The agent processes a long document (10,000 tokens), calls 8 tools with results totaling 8,000 tokens, does multiple rounds of reasoning (15,000 tokens), and generates a detailed 2,000-token report. Total: roughly 35,000 tokens. At Sonnet 4 pricing, about $0.15. At Opus 4 pricing, about $0.75.

The model you choose matters more than almost any other cost decision.
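
The arithmetic behind these estimates can be sketched in a few lines of Python. This is a minimal sketch, not a billing tool: the dictionary keys are illustrative labels rather than official API model IDs, and the split of the research example’s 2,000 reasoning tokens between input and output is an assumption (the low estimate bills them as input, the high estimate as output).

```python
# Prices in USD per million tokens (input, output), copied from the table above.
# The keys are illustrative labels, not official API model identifiers.
PRICES = {
    "claude-sonnet-4": (3.00, 15.00),
    "claude-opus-4": (15.00, 75.00),
    "gpt-4.1": (2.00, 8.00),
    "gpt-4.1-mini": (0.40, 1.60),
}


def llm_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one task's token usage for the given model."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Research example: 200 prompt tokens + 3,000 tokens of tool results + 500 response tokens.
# Assumption: the 2,000 reasoning tokens count as input (low) or as output (high).
low = llm_cost("claude-sonnet-4", input_tokens=5_200, output_tokens=500)
high = llm_cost("claude-sonnet-4", input_tokens=3_200, output_tokens=2_500)
print(f"research task on Sonnet 4: ${low:.3f} to ${high:.3f}")

# Same token counts on Opus 4, to show how much the model choice moves the total.
print(f"same task on Opus 4: ${llm_cost('claude-opus-4', 5_200, 500):.3f}")
```

The article’s $0.03 figure falls inside the low-to-high range. Because output tokens cost five times as much as input tokens on these models, how much of the total is generated text matters almost as much as the total itself.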

Tool Call Costs

Every external tool call has a price. This varies widely depending on the tool and platform.

Typical per-call pricing on tool platforms:

| Tool Type | Price Range | Typical Price |
| --- | --- | --- |
| Web search | $0.003 - $0.01 | $0.005 |
| News search | $0.005 - $0.015 | $0.0075 |
| Image generation (fast) | $0.005 - $0.02 | $0.006 |
| Image generation (high quality) | $0.03 - $0.10 | $0.06 |
| Email send | $0.005 - $0.02 | $0.01 |
| Data API (government, financial) | $0.003 - $0.01 | $0.005 |
| PDF extraction | $0.005 - $0.01 | $0.005 |

Example: A research task with tools. The agent makes 5 Google searches ($0.025), 2 news lookups ($0.015), and generates 1 image ($0.006). Total tool cost: $0.046. Add $0.03 in LLM tokens, and the full task costs about $0.08.

Example: A lead generation task. The agent searches for companies (3 searches at $0.005), looks up LinkedIn jobs (2 calls at $0.0075), checks company websites via web scraping (3 calls at $0.005), and sends 5 emails ($0.05). Total tool cost: $0.095, or about $0.10. Add $0.15 in LLM tokens for the reasoning, and the full task is about $0.25.

Tool costs are usually smaller than LLM costs for simple tasks, but they can dominate for tool-heavy workflows.
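
The two tool-cost examples above reduce to a sum of price times call count. A minimal sketch, using the per-call prices quoted in the examples (the tuple layout and the call labels are illustrative choices, not any platform’s format):

```python
def tool_cost(calls: list[tuple[str, float, int]]) -> float:
    """Sum the cost of a task's tool calls.

    Each entry is (label, price per call in USD, number of calls).
    """
    return sum(price * count for _label, price, count in calls)


research = [
    ("web search", 0.005, 5),
    ("news search", 0.0075, 2),
    ("image generation (fast)", 0.006, 1),
]

lead_gen = [
    ("company search", 0.005, 3),
    ("job listing lookup", 0.0075, 2),
    ("web scrape", 0.005, 3),
    ("email send", 0.01, 5),
]

print(f"research tools: ${tool_cost(research):.3f}")
print(f"lead gen tools: ${tool_cost(lead_gen):.3f}")
```

Keeping the call list explicit like this makes it easy to see which tool dominates: in the lead generation example, the five emails account for more than half of the tool spend.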

Infrastructure Costs

The agent needs somewhere to run, and that runtime has costs.

Hosting. If your agent runs as a serverless function (Cloudflare Workers, AWS Lambda, Vercel), you pay per invocation. Costs are typically negligible compared to LLM and tool costs, often under $0.001 per task.

Persistent infrastructure. If your agent needs a database, message queue, or other always-on services, those have monthly costs independent of usage. A basic setup (small database, KV store, worker runtime) runs $20 to $100 per month on most cloud platforms.

Monitoring and logging. Observability tools cost money at scale. Log storage, dashboards, and alerting add $10 to $50 per month for a small deployment.

Infrastructure costs are mostly fixed. They matter when you’re running at low volume (a $50/month database adds roughly $0.17 per task if you’re handling 10 tasks per day) but become negligible per task at scale.
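
To see how quickly a fixed cost fades with volume, divide the monthly bill by the number of tasks in the month. A minimal sketch, assuming a 30-day month and an even daily load (both are simplifications):

```python
def infra_cost_per_task(
    monthly_fixed_usd: float,
    tasks_per_day: float,
    days_per_month: int = 30,  # assumption: a 30-day month
) -> float:
    """Spread a fixed monthly infrastructure bill across the month's tasks."""
    if tasks_per_day <= 0:
        raise ValueError("tasks_per_day must be positive")
    return monthly_fixed_usd / (tasks_per_day * days_per_month)


for volume in (10, 100, 1_000):
    per_task = infra_cost_per_task(50.0, volume)
    print(f"{volume:>5} tasks/day: ${per_task:.4f} per task")
```

At 10 tasks per day, a $50 database costs more per task than the LLM and tool calls in most of the examples here. At 1,000 tasks per day, it is a fraction of a cent.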

Putting It All Together

Here’s what different agent workloads cost per task:

| Workload | LLM Cost | Tool Cost | Total |
| --- | --- | --- | --- |
| Simple Q&A with one search | $0.02 | $0.005 | $0.025 |
| Research (5 searches, 2 news) | $0.05 | $0.04 | $0.09 |
| Lead gen (search + email) | $0.15 | $0.10 | $0.25 |
| Content creation with image | $0.10 | $0.07 | $0.17 |
| Complex analysis (many tools) | $0.30 | $0.15 | $0.45 |

At 100 tasks per day, the research workload costs about $9/day or $270/month. At 1,000 tasks per day, it’s $90/day or $2,700/month.
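
The same projection works for any row of the table. A minimal sketch that multiplies the per-task total by daily volume, assuming a 30-day month and ignoring fixed infrastructure costs:

```python
# Per-task costs in USD as (LLM, tools), copied from the workload table above.
WORKLOADS = {
    "Simple Q&A with one search": (0.02, 0.005),
    "Research (5 searches, 2 news)": (0.05, 0.04),
    "Lead gen (search + email)": (0.15, 0.10),
    "Content creation with image": (0.10, 0.07),
    "Complex analysis (many tools)": (0.30, 0.15),
}


def monthly_cost(workload: str, tasks_per_day: int, days_per_month: int = 30) -> float:
    """Project monthly variable spend for one workload at a given daily volume."""
    llm, tools = WORKLOADS[workload]
    return (llm + tools) * tasks_per_day * days_per_month


for name in WORKLOADS:
    print(f"{name}: ${monthly_cost(name, 100):,.0f}/month at 100 tasks/day")
```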

How to Control Costs

Choose the Right Model for Each Step

Don’t use Opus for everything. Use a fast, cheap model (GPT-4.1 mini, Claude Haiku) for simple decisions like routing and tool selection. Reserve the expensive models for complex reasoning where quality matters. This alone can cut LLM costs by 50-80%.
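
One way to express this is a small routing function that picks a model per step. The sketch below is illustrative only: the step kinds, the model labels, and the token counts are all made up for the example, and the actual saving depends entirely on how your tokens split between simple and complex steps.

```python
# USD per million tokens (input, output), from the pricing table earlier in the article.
# The keys are illustrative labels, not official API model identifiers.
PRICES = {
    "gpt-4.1-mini": (0.40, 1.60),
    "claude-opus-4": (15.00, 75.00),
}

# Hypothetical step kinds that a cheap model is assumed to handle well.
SIMPLE_STEPS = {"routing", "tool_selection"}


def pick_model(step_kind: str) -> str:
    """Send simple steps to the cheap model and everything else to the strong one."""
    return "gpt-4.1-mini" if step_kind in SIMPLE_STEPS else "claude-opus-4"


def step_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Made-up token counts per step: (kind, input tokens, output tokens).
STEPS = [
    ("routing", 3_000, 100),
    ("tool_selection", 6_000, 300),
    ("analysis", 3_000, 600),
]

routed = sum(step_cost(pick_model(kind), i, o) for kind, i, o in STEPS)
all_opus = sum(step_cost("claude-opus-4", i, o) for _kind, i, o in STEPS)
print(f"routed: ${routed:.3f}, all Opus: ${all_opus:.3f}")
```

The design choice worth noting is that routing happens per step, not per task, so a single task can mix cheap and expensive calls.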

Cache Tool Results

If your agent frequently searches for the same queries, cache the results. A Google search result from 5 minutes ago is probably still valid. Caching eliminates redundant tool calls without sacrificing quality.
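
A time-to-live cache is enough for this. The sketch below is a minimal in-memory version: `TTLCache`, `get_or_call`, and the placeholder `search` function are hypothetical names for illustration, and a production setup would more likely use a shared store so that all workers benefit.

```python
import time
from typing import Callable


class TTLCache:
    """In-memory cache whose entries expire after a fixed number of seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: dict[str, tuple[float, object]] = {}

    def get_or_call(self, key: str, fetch: Callable[[], object]) -> object:
        """Return the cached value for key, or call fetch() and cache its result."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: no paid call
        value = fetch()
        self._entries[key] = (now, value)
        return value


def search(query: str) -> str:
    # Placeholder for a paid search call; a real agent would hit its tool API here.
    print(f"paid call for {query!r}")
    return f"results for {query}"


cache = TTLCache(ttl_seconds=300)  # 5 minutes, matching the example above
first = cache.get_or_call("agent pricing", lambda: search("agent pricing"))
second = cache.get_or_call("agent pricing", lambda: search("agent pricing"))  # cache hit
```

The right TTL depends on the tool: search results can tolerate minutes, while stock quotes or inbox checks may not tolerate any caching at all.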

Set Per-Task Budgets

Limit how much any single task can spend. A runaway agent loop that keeps calling tools is the most common cause of unexpected costs. Set a hard cap (say, $1 per task) and fail gracefully when it’s hit.
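
A hard cap can be a small object that every paid call must charge before it runs. This is a minimal sketch with hypothetical names (`TaskBudget`, `BudgetExceeded`, `run_task`); it assumes you can estimate each step’s cost before making the call.

```python
class BudgetExceeded(Exception):
    """Raised when a charge would push a task past its spending cap."""


class TaskBudget:
    def __init__(self, limit_usd: float):
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def charge(self, amount_usd: float) -> None:
        """Record a cost, refusing it if the task would exceed its limit."""
        if self.spent_usd + amount_usd > self.limit_usd:
            raise BudgetExceeded(
                f"charge of ${amount_usd:.3f} would exceed the ${self.limit_usd:.2f} cap"
            )
        self.spent_usd += amount_usd


def run_task(budget: TaskBudget, step_costs: list[float]) -> str:
    """Run steps until done or until the budget refuses the next charge."""
    completed = 0
    try:
        for cost in step_costs:
            budget.charge(cost)  # check before spending, not after
            completed += 1
    except BudgetExceeded:
        return f"stopped after {completed} steps: budget reached"
    return f"finished {completed} steps"


# A runaway loop of $0.30 steps against a $1 cap.
print(run_task(TaskBudget(limit_usd=1.00), [0.30] * 5))
```

Charging before the call, not after it, is what makes the cap hard: the step that would break the budget never runs.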

Monitor and Alert

Track cost per task, cost per user, and total daily spend. Set alerts at 50%, 75%, and 90% of your daily budget. Catch runaway spending before it becomes a problem.
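
The threshold logic is simple enough to sketch directly. The function below is a hypothetical helper, not part of any monitoring product: it reports which of the 50%, 75%, and 90% marks were crossed by the latest spend update, so each alert fires once rather than on every update.

```python
ALERT_THRESHOLDS = (0.50, 0.75, 0.90)  # fractions of the daily budget


def crossed_thresholds(
    previous_spend: float, current_spend: float, daily_budget: float
) -> list[float]:
    """Return the thresholds that lie between the previous and current spend."""
    return [
        t
        for t in ALERT_THRESHOLDS
        if previous_spend < t * daily_budget <= current_spend
    ]


# Spend jumps from $40 to $80 against a $100 daily budget.
for threshold in crossed_thresholds(40.0, 80.0, 100.0):
    print(f"alert: daily spend passed {threshold:.0%} of budget")
```

Comparing against the previous total is the piece that prevents alert spam: once spend sits above a threshold, later updates no longer report it.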

Use Credits-Based Platforms

Platforms like AgentPatch use a prepaid credits model. You load a balance, and tool calls deduct from it. This creates a natural spending cap: the agent can’t spend more than what’s been prepaid. It also bundles small transactions, avoiding per-call credit card fees that would make micropayments uneconomical.

Batch When Possible

If your agent needs to make many similar tool calls, check whether the tool supports batch operations. Searching for 10 queries in a single call is usually cheaper than 10 separate calls.
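
On the agent side, batching mostly means grouping pending queries before dispatching them. A minimal sketch, assuming a tool that accepts up to 10 queries per call (the limit and the `chunked` helper are illustrative, so check what your tool actually supports):

```python
def chunked(items: list[str], size: int) -> list[list[str]]:
    """Split items into consecutive batches of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i : i + size] for i in range(0, len(items), size)]


queries = [f"company {n} funding news" for n in range(25)]
batches = chunked(queries, size=10)  # assumed per-call limit of 10 queries
print(f"{len(queries)} queries sent as {len(batches)} batch calls")
```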

The Trend

Costs are falling. LLM pricing has dropped by roughly 5 to 10x over the past year, and tool call pricing is following a similar trend as competition increases and infrastructure costs decrease.

The tasks that cost $0.50 today might cost $0.05 in a year. The agents that are borderline economical now will be clearly cost-effective soon. Build the cost monitoring infrastructure now, so you can track the improvement and expand your agent’s capabilities as costs come down.