AI agents are useful. They’re also not free. Every task an agent performs has a cost, split across three categories: LLM tokens, tool calls, and infrastructure. Understanding these costs is the first step to controlling them.
Here’s what things actually cost in early 2026, with real numbers and practical strategies for keeping spending under control.
LLM Token Costs
The LLM is usually the largest cost component. Every prompt, every tool result that goes into context, and every generated response costs tokens.
Current pricing (approximate, per million tokens):
| Model | Input | Output |
|---|---|---|
| Claude Sonnet 4 | $3 | $15 |
| Claude Opus 4 | $15 | $75 |
| GPT-4.1 | $2 | $8 |
| GPT-4.1 mini | $0.40 | $1.60 |
| Gemini 2.5 Pro | $1.25-2.50 | $10-15 |
A typical agent task, one that involves reasoning about a problem, calling a few tools, reading the results, and generating a final response, uses somewhere between 5,000 and 50,000 tokens total, depending on the complexity and number of tool calls.
Example: A research task. The agent receives a 200-token prompt, calls 3 tools that each return 1,000 tokens of results, reasons about the results (2,000 tokens of internal context), and generates a 500-token response. Total: roughly 6,000 tokens. At Claude Sonnet 4 pricing, that’s about $0.03. The exact figure depends on how those tokens split between input and output, since output tokens cost five times as much as input tokens.
Example: A complex multi-step task. The agent processes a long document (10,000 tokens), calls 8 tools with results totaling 8,000 tokens, does multiple rounds of reasoning (15,000 tokens), and generates a detailed 2,000-token report. Total: roughly 35,000 tokens. At Sonnet 4 pricing, about $0.15. At Opus 4 pricing, about $0.75.
The model you choose matters more than almost any other cost decision.
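The arithmetic behind these examples can be sketched in a few lines. This is a minimal illustration, not a billing tool: the prices are copied from the table above, the dictionary keys are made-up labels rather than official API model IDs, and the split between input and output tokens is an assumption, since the examples give only totals.

```python
# USD per million tokens (input, output), copied from the pricing table above.
# The dictionary keys are illustrative labels, not official API model IDs.
PRICES = {
    "claude-sonnet-4": (3.00, 15.00),
    "claude-opus-4": (15.00, 75.00),
    "gpt-4.1": (2.00, 8.00),
    "gpt-4.1-mini": (0.40, 1.60),
}


def llm_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a task from its input and output token counts."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Research example, lower bound (assumed split): only the 500-token
# response is billed as output; everything else counts as input.
low = llm_cost("claude-sonnet-4", input_tokens=5_200, output_tokens=500)

# Upper bound (assumed split): the 2,000 reasoning tokens are billed
# as output as well.
high = llm_cost("claude-sonnet-4", input_tokens=3_200, output_tokens=2_500)

print(f"research task: ${low:.4f} to ${high:.4f}")
```

Under these assumptions the research task lands between roughly $0.023 and $0.047, which brackets the $0.03 estimate. Real agent loops often resend earlier context on every call, so measured input counts can run higher than a simple sum suggests.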
Tool Call Costs
Every external tool call has a price. This varies widely depending on the tool and platform.
Typical per-call pricing on tool platforms:
| Tool Type | Price Range | Typical Price |
|---|---|---|
| Web search | $0.003 - $0.01 | $0.005 |
| News search | $0.005 - $0.015 | $0.0075 |
| Image generation (fast) | $0.005 - $0.02 | $0.006 |
| Image generation (high quality) | $0.03 - $0.10 | $0.06 |
| Email send | $0.005 - $0.02 | $0.01 |
| Data API (government, financial) | $0.003 - $0.01 | $0.005 |
| PDF extraction | $0.005 - $0.01 | $0.005 |
Example: A research task with tools. The agent makes 5 Google searches ($0.025), 2 news lookups ($0.015), and generates 1 image ($0.006). Total tool cost: $0.046. Add $0.03 in LLM tokens, and the full task costs about $0.08.
Example: A lead generation task. The agent searches for companies (3 searches at $0.005), looks up LinkedIn jobs (2 calls at $0.0075), checks company websites via web scraping (3 calls at $0.005), and sends 5 emails ($0.05). Total tool cost: $0.095, or about $0.10. Add $0.15 in LLM tokens for the reasoning, and the full task is about $0.25.
Tool costs are usually smaller than LLM costs for simple tasks, but they can dominate for tool-heavy workflows.
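Tallying tool spend for a task is a simple weighted sum. The sketch below reproduces the research example using the "typical price" column from the table above; the tool names are hypothetical keys chosen for illustration, not identifiers from any particular platform.

```python
# Typical per-call prices in USD, copied from the tool pricing table above.
# The keys are hypothetical names used only for this example.
TOOL_PRICES = {
    "web_search": 0.005,
    "news_search": 0.0075,
    "image_fast": 0.006,
    "image_hq": 0.06,
    "email_send": 0.01,
    "data_api": 0.005,
    "pdf_extract": 0.005,
}


def tool_cost(calls: dict[str, int]) -> float:
    """Return the total USD cost for a mapping of tool name to call count."""
    return sum(TOOL_PRICES[name] * count for name, count in calls.items())


# Research example: 5 web searches, 2 news lookups, 1 fast image.
research_tools = tool_cost({"web_search": 5, "news_search": 2, "image_fast": 1})
print(f"tool cost: ${research_tools:.3f}")
```

An unknown tool name raises `KeyError` here, which is a deliberate choice: silently pricing an unrecognized tool at zero would hide spending.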
Infrastructure Costs
The agent needs somewhere to run, and that runtime has costs.
Hosting. If your agent runs as a serverless function (Cloudflare Workers, AWS Lambda, Vercel), you pay per invocation. Costs are typically negligible compared to LLM and tool costs, often under $0.001 per task.
Persistent infrastructure. If your agent needs a database, message queue, or other always-on services, those have monthly costs independent of usage. A basic setup (small database, KV store, worker runtime) runs $20 to $100 per month on most cloud platforms.
Monitoring and logging. Observability tools cost money at scale. Log storage, dashboards, and alerting add $10 to $50 per month for a small deployment.
Infrastructure costs are mostly fixed, so their per-task share depends on volume. A $50/month database adds about $0.17 to every task if you’re handling 10 tasks per day (roughly 300 per month), but less than $0.002 per task at 1,000 tasks per day.
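To see how a fixed monthly bill spreads across tasks, divide it by monthly volume. This is a rough sketch that assumes a 30-day month and steady daily traffic; the function name is made up for illustration.

```python
def infra_cost_per_task(monthly_cost_usd: float, tasks_per_day: float,
                        days_per_month: int = 30) -> float:
    """Spread a fixed monthly cost evenly over the tasks handled that month."""
    tasks_per_month = tasks_per_day * days_per_month
    if tasks_per_month <= 0:
        raise ValueError("tasks_per_day and days_per_month must be positive")
    return monthly_cost_usd / tasks_per_month


# The $50/month database from the text at two different volumes.
low_volume = infra_cost_per_task(50, tasks_per_day=10)
high_volume = infra_cost_per_task(50, tasks_per_day=1_000)

print(f"10 tasks/day:    ${low_volume:.4f} per task")
print(f"1,000 tasks/day: ${high_volume:.4f} per task")
```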
Putting It All Together
Here’s what different agent workloads cost per task:
| Workload | LLM Cost | Tool Cost | Total |
|---|---|---|---|
| Simple Q&A with one search | $0.02 | $0.005 | $0.025 |
| Research (5 searches, 2 news) | $0.05 | $0.04 | $0.09 |
| Lead gen (search + email) | $0.15 | $0.10 | $0.25 |
| Content creation with image | $0.10 | $0.07 | $0.17 |
| Complex analysis (many tools) | $0.30 | $0.15 | $0.45 |
At 100 tasks per day, the research workload ($0.09 per task) costs about $9/day, or $270 over a 30-day month. At 1,000 tasks per day, it’s $90/day, or $2,700/month.
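The same projection works for any row in the table. The sketch below uses the per-task totals above with hypothetical workload keys, assumes a 30-day month, and ignores fixed infrastructure costs.

```python
# Per-task totals in USD from the workload table above (keys are illustrative).
WORKLOADS = {
    "simple_qa": 0.025,
    "research": 0.09,
    "lead_gen": 0.25,
    "content_with_image": 0.17,
    "complex_analysis": 0.45,
}


def projected_spend(cost_per_task: float, tasks_per_day: int,
                    days_per_month: int = 30) -> tuple[float, float]:
    """Return (daily, monthly) spend in USD for a steady task volume."""
    daily = cost_per_task * tasks_per_day
    return daily, daily * days_per_month


for name, cost in WORKLOADS.items():
    daily, monthly = projected_spend(cost, tasks_per_day=100)
    print(f"{name:20s} ${daily:8.2f}/day  ${monthly:10.2f}/month")
```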
How to Control Costs
Choose the Right Model for Each Step
Don’t use Opus for everything. Use a fast, cheap model (GPT-4.1 mini, Claude Haiku) for simple decisions like routing and tool selection. Reserve the expensive models for complex reasoning where quality matters. When simple steps make up a large share of the workload, this alone can cut LLM costs by 50-80%.
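A routing rule can be as small as a lookup on the kind of step. The sketch below is one possible shape, not a recommended architecture: the step kinds, token counts, and model labels are all made up for illustration, and only the prices come from the table earlier in the article.

```python
# USD per million tokens (input, output) from the pricing table.
PRICES = {
    "gpt-4.1-mini": (0.40, 1.60),
    "claude-opus-4": (15.00, 75.00),
}
CHEAP_MODEL = "gpt-4.1-mini"
STRONG_MODEL = "claude-opus-4"

# Hypothetical step kinds that are simple enough for the cheap model.
SIMPLE_STEPS = {"routing", "tool_selection"}


def pick_model(step_kind: str) -> str:
    """Send simple steps to the cheap model, everything else to the strong one."""
    return CHEAP_MODEL if step_kind in SIMPLE_STEPS else STRONG_MODEL


def step_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def plan_cost(steps: list[tuple[str, int, int]], route: bool) -> float:
    """Cost of a list of (kind, input_tokens, output_tokens) steps."""
    total = 0.0
    for kind, input_tokens, output_tokens in steps:
        model = pick_model(kind) if route else STRONG_MODEL
        total += step_cost(model, input_tokens, output_tokens)
    return total


# Made-up token counts for one task.
STEPS = [
    ("routing", 1_000, 50),
    ("tool_selection", 2_000, 100),
    ("tool_selection", 3_000, 100),
    ("final_report", 3_000, 500),
]

all_strong = plan_cost(STEPS, route=False)
routed = plan_cost(STEPS, route=True)
savings = 1 - routed / all_strong
print(f"all strong: ${all_strong:.4f}, routed: ${routed:.4f}, saved: {savings:.0%}")
```

With these invented numbers the routed plan comes out about 55% cheaper. The saving shrinks as the final reasoning step grows relative to the simple steps, so it is worth measuring on real traffic.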
Cache Tool Results
If your agent frequently searches for the same queries, cache the results. A Google search result from 5 minutes ago is probably still valid. Caching eliminates redundant tool calls without sacrificing quality.
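A time-to-live (TTL) cache is one simple way to do this. The sketch below is an in-memory version for a single process; the class name, the 5-minute TTL, and the `fake_search` stand-in are assumptions for illustration, and a production setup would more likely use a shared store.

```python
import time
from typing import Callable


class TTLCache:
    """Cache results by key and reuse them until they are older than the TTL."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so tests can control time
        self._store: dict[str, tuple[float, object]] = {}

    def get_or_call(self, key: str, fn: Callable[[], object]) -> object:
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]  # fresh hit: no tool call, no charge
        value = fn()  # miss or expired: pay for one real call
        self._store[key] = (now, value)
        return value


# Stand-in for a paid search tool; it records every call it receives.
paid_calls: list[str] = []


def fake_search(query: str) -> str:
    paid_calls.append(query)
    return f"results for {query}"


cache = TTLCache(ttl_seconds=300)
for _ in range(3):
    cache.get_or_call("agent pricing", lambda: fake_search("agent pricing"))

print(f"paid calls: {len(paid_calls)}")
```

Three identical lookups trigger a single paid call here. The right TTL depends on the tool: a few minutes may suit web search, while breaking news or live prices may need far less.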
Set Per-Task Budgets
Limit how much any single task can spend. A runaway agent loop that keeps calling tools is the most common cause of unexpected costs. Set a hard cap (say, $1 per task) and fail gracefully when it’s hit.
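One way to enforce a hard cap is to reserve the cost of each step before running it and stop the loop when a reservation would exceed the limit. The sketch below is a minimal illustration with hypothetical names; it tracks money in integer micro-dollars (millionths of a dollar) so that repeated small charges do not accumulate floating-point error.

```python
class BudgetExceeded(Exception):
    """Raised when a charge would push a task past its spending cap."""


class TaskBudget:
    def __init__(self, limit_micros: int):
        self.limit_micros = limit_micros
        self.spent_micros = 0

    def charge(self, amount_micros: int) -> None:
        """Record a charge, or raise if it would exceed the cap."""
        if self.spent_micros + amount_micros > self.limit_micros:
            raise BudgetExceeded(
                f"cap of {self.limit_micros} micro-USD would be exceeded"
            )
        self.spent_micros += amount_micros


def run_agent_loop(budget: TaskBudget, step_cost_micros: int,
                   max_steps: int) -> dict:
    """Simulate an agent that keeps calling a tool until something stops it."""
    steps = 0
    try:
        for _ in range(max_steps):
            budget.charge(step_cost_micros)  # reserve before the real call
            steps += 1  # a real agent would call the tool here
    except BudgetExceeded:
        # Fail gracefully: report what was done instead of crashing.
        return {"status": "stopped_at_budget", "steps": steps,
                "spent_usd": budget.spent_micros / 1_000_000}
    return {"status": "completed", "steps": steps,
            "spent_usd": budget.spent_micros / 1_000_000}


# A runaway loop of $0.05 steps against a $1.00 cap.
result = run_agent_loop(TaskBudget(1_000_000), step_cost_micros=50_000,
                        max_steps=1_000)
print(result)
```

The runaway loop stops after 20 steps, having spent exactly the $1.00 cap. Charging before the call rather than after means the cap is never overshot, at the price of needing a cost estimate up front.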
Monitor and Alert
Track cost per task, cost per user, and total daily spend. Set alerts at 50%, 75%, and 90% of your daily budget. Catch runaway spending before it becomes a problem.
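Threshold alerts can be computed by comparing two consecutive spend readings, so each threshold fires once when it is crossed rather than on every check. This is a small sketch with a made-up function name and a hypothetical $100 daily budget; delivering the alert (email, chat, pager) is left out.

```python
ALERT_THRESHOLDS = (0.50, 0.75, 0.90)  # fractions of the daily budget


def crossed_thresholds(previous_spend: float, current_spend: float,
                       daily_budget: float) -> list[float]:
    """Return the thresholds crossed between two spend readings."""
    return [
        t for t in ALERT_THRESHOLDS
        if previous_spend < t * daily_budget <= current_spend
    ]


# Hypothetical readings against a $100 daily budget.
daily_budget = 100.0
readings = [20.0, 40.0, 80.0, 95.0, 99.0]

previous = 0.0
for spend in readings:
    for threshold in crossed_thresholds(previous, spend, daily_budget):
        print(f"ALERT: spend ${spend:.2f} passed {threshold:.0%} of daily budget")
    previous = spend
```

A jump from $40 to $80 reports both the 50% and 75% thresholds at once, which is the behavior you want when a runaway task burns through budget between checks.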
Use Credits-Based Platforms
Platforms like AgentPatch use a prepaid credits model. You load a balance, and tool calls deduct from it. This creates a natural spending cap: the agent can’t spend more than what’s been prepaid. It also bundles small transactions, avoiding per-call credit card fees that would make micropayments uneconomical.
Batch When Possible
If your agent needs to make many similar tool calls, check whether the tool supports batch operations. Searching for 10 queries in a single call is usually cheaper than 10 separate calls.
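On the agent side, batching mostly comes down to grouping pending queries before sending them. The helper below is a generic sketch: whether a given tool accepts batches, and the maximum batch size of 10 used here, are assumptions to check against that tool's documentation.

```python
def chunk(items: list[str], size: int) -> list[list[str]]:
    """Split a list into consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


queries = [f"query {i}" for i in range(23)]
batches = chunk(queries, size=10)  # assumed batch limit of 10

print(f"{len(queries)} queries sent as {len(batches)} requests")
```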
The Trend
Costs are falling. LLM pricing has dropped 5 to 10x over the past year, and tool call pricing follows a similar trend as competition increases and infrastructure costs decrease.
The tasks that cost $0.50 today might cost $0.05 in a year. The agents that are borderline economical now will be clearly cost-effective soon. Build the cost monitoring infrastructure now, so you can track the improvement and expand your agent’s capabilities as costs come down.