Cost + token accounting

Set a hard dollar ceiling per run and track token usage with either a zero-dep heuristic or provider-accurate counters.

A long-running agent with tools can make dozens of LLM calls before you see the invoice. costGuard lets you set a ceiling and choose what happens when it is hit (stop cleanly, throw, or warn), so runaway runs are caught before they turn into production-sized bills.

#costGuard

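import { costGuard } from '@agentskit/observability'

// createRuntime and adapter come from your existing runtime setup.
const runtime = createRuntime({
  adapter,
  // Cap the run at $0.50 and throw once the ceiling is crossed.
  observers: [costGuard({ maxUsd: 0.50, onExceed: 'throw' })],
})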

onExceed accepts 'throw', 'stop', or 'warn': throw an error, stop the run cleanly, or only warn.
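To make the accounting concrete, here is a minimal sketch of how a dollar ceiling can be enforced from per-call token usage. It is not costGuard's implementation: BudgetTracker, Usage, Pricing, and the prices used are all hypothetical names and numbers chosen for illustration.

```typescript
// Hypothetical shapes; real usage and pricing data depend on your provider.
interface Usage { inputTokens: number; outputTokens: number }
interface Pricing { inputPerMTok: number; outputPerMTok: number } // USD per million tokens

type OnExceed = 'throw' | 'stop' | 'warn'

class BudgetTracker {
  private spentUsd = 0
  private readonly maxUsd: number
  private readonly pricing: Pricing
  private readonly onExceed: OnExceed

  constructor(maxUsd: number, pricing: Pricing, onExceed: OnExceed) {
    this.maxUsd = maxUsd
    this.pricing = pricing
    this.onExceed = onExceed
  }

  /** Record one LLM call. Returns false when the run should stop. */
  record(usage: Usage): boolean {
    this.spentUsd +=
      (usage.inputTokens * this.pricing.inputPerMTok +
        usage.outputTokens * this.pricing.outputPerMTok) / 1_000_000

    if (this.spentUsd <= this.maxUsd) return true

    const message = `budget exceeded: $${this.spentUsd.toFixed(4)} > $${this.maxUsd.toFixed(2)}`
    if (this.onExceed === 'throw') throw new Error(message)
    if (this.onExceed === 'warn') {
      console.warn(message)
      return true // keep going, but leave a trace
    }
    return false // 'stop': the caller ends the run cleanly
  }

  get totalUsd(): number {
    return this.spentUsd
  }
}
```

The three branches mirror the three onExceed modes: 'throw' surfaces the overage as an error, 'stop' hands control back to the caller, and 'warn' only logs.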

#multiTenantCostGuard

The same accounting, partitioned by tenant for SaaS deployments.

import { multiTenantCostGuard } from '@agentskit/observability'

const guard = multiTenantCostGuard({
  budgets: { 'acme-co': 5, 'startup-co': 1 },
  defaultBudgetUsd: 0.10,         // unlisted tenants
  onExceeded: ({ tenant, costUsd, budgetUsd }) => {
    metrics.increment('agent.budget.exceeded', { tenant })
    // Reject the next request at the gateway, log+drop, etc.
  },
})

createRuntime({ adapter, observers: [guard] })

// Bind the tenant to the request scope: use AsyncLocalStorage, or call setTenant before each run
guard.setTenant(req.tenant)
await runtime.run(task)

Why no auto-abort. SaaS multi-tenant deployments typically reject the inbound request at the gateway rather than killing a run midway. If you do want to abort mid-run, wire the abort to the controller you already track per request.
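One way to track a controller per request is sketched below, using the standard AbortController. The registry and the function names (trackRequest, releaseRequest, abortTenant) are hypothetical and not part of @agentskit/observability. How the signal reaches your run is not covered on this page, so that hand-off is left out.

```typescript
// Hypothetical registry of in-flight requests, keyed by tenant.
const inFlight = new Map<string, Set<AbortController>>()

// Call when a request starts; pass controller.signal to whatever work supports cancellation.
function trackRequest(tenant: string): AbortController {
  const controller = new AbortController()
  const controllers = inFlight.get(tenant) ?? new Set<AbortController>()
  controllers.add(controller)
  inFlight.set(tenant, controllers)
  return controller
}

// Call when a request finishes, successfully or not.
function releaseRequest(tenant: string, controller: AbortController): void {
  const controllers = inFlight.get(tenant)
  if (!controllers) return
  controllers.delete(controller)
  if (controllers.size === 0) inFlight.delete(tenant)
}

// Accepts an object with a tenant field, like the onExceeded payload shown above.
function abortTenant({ tenant }: { tenant: string }): void {
  for (const controller of inFlight.get(tenant) ?? []) {
    controller.abort(new Error(`budget exceeded for ${tenant}`))
  }
}
```

Calling abortTenant from inside onExceeded would cancel only that tenant's requests and leave other tenants untouched.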

#Token counters

import { approximateCounter, createProviderCounter } from '@agentskit/observability'

// Zero-dep heuristic
const fast = approximateCounter()

// Provider-accurate (uses adapter-reported usage when available)
const exact = createProviderCounter({ adapter })
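To illustrate the trade-off between the two counters, the sketch below shows what a zero-dependency heuristic might look like, with a fallback from provider-reported usage to the estimate. It assumes the common rule of thumb of about four characters per token for English text; estimateTokens and countTokens are hypothetical names, and this is not how approximateCounter or createProviderCounter are implemented.

```typescript
// Assumption: roughly four characters per token, a rule of thumb for English text.
const CHARS_PER_TOKEN = 4

// Cheap estimate with no tokenizer dependency; accuracy varies by language and model.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN)
}

// Prefer the provider-reported count when available, otherwise fall back to the estimate.
function countTokens(text: string, reportedTokens?: number): number {
  return reportedTokens ?? estimateTokens(text)
}
```

The heuristic is good enough for a rough budget check; use provider-reported counts whenever the number has to match the invoice.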
