
Speculative execution

Run the same request across N adapters in parallel, keep the winner, abort the losers.

Latency matters. So does quality. speculate lets you have both: kick off a cheap+fast adapter and a slow+accurate one together, take the first to finish, and cancel the loser before it burns tokens.

Install

Built into @agentskit/runtime — nothing extra to install.

Quick start — fastest wins

import { speculate } from '@agentskit/runtime'
import { anthropic, openai } from '@agentskit/adapters'

const result = await speculate({
  candidates: [
    { id: 'haiku', adapter: anthropic({ apiKey: ..., model: 'claude-haiku-4-5' }) },
    { id: 'sonnet', adapter: anthropic({ apiKey: ..., model: 'claude-sonnet-4-6' }) },
  ],
  request: {
    messages: [{ id: '1', role: 'user', content: 'Summarize this.', status: 'complete', createdAt: new Date() }],
  },
})

console.log(result.winner.id, result.winner.text)
console.log('loser latency:', result.losers.map(l => l.latencyMs))

The loser is aborted as soon as the winner settles.
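
To make the "first wins, loser is aborted" behaviour concrete, here is a minimal, library-free sketch of the pattern using the standard AbortController. It is not the implementation inside @agentskit/runtime. `fakeRunner`, `firstFulfilled`, and `raceFirst` are hypothetical names, and the timers stand in for real adapter calls.

```typescript
// Hypothetical stand-in for an adapter call: resolves with `text` after `ms`,
// or rejects early when its AbortSignal fires.
type Runner = (signal: AbortSignal) => Promise<string>

function fakeRunner(text: string, ms: number): Runner {
  return signal =>
    new Promise<string>((resolve, reject) => {
      if (signal.aborted) {
        reject(new Error('aborted'))
        return
      }
      const timer = setTimeout(() => resolve(text), ms)
      signal.addEventListener('abort', () => {
        clearTimeout(timer) // stop the pending work
        reject(new Error('aborted'))
      }, { once: true })
    })
}

// Resolves with the first promise to fulfil; rejects only if all of them reject.
function firstFulfilled<T>(promises: Promise<T>[]): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    let failures = 0
    for (const p of promises) {
      p.then(resolve, err => {
        failures += 1
        if (failures === promises.length) reject(err)
      })
    }
  })
}

// Start every candidate at once, keep the first success, abort the rest.
async function raceFirst(candidates: { id: string; run: Runner }[]) {
  const started = candidates.map(({ id, run }) => {
    const controller = new AbortController()
    const done = run(controller.signal).then(text => ({ id, text }))
    return { id, controller, done }
  })

  const winner = await firstFulfilled(started.map(s => s.done))

  const abortedIds: string[] = []
  for (const s of started) {
    if (s.id !== winner.id) {
      s.controller.abort()
      abortedIds.push(s.id)
    }
  }
  return { winnerId: winner.id, text: winner.text, abortedIds }
}
```

Giving each candidate its own AbortController is what lets the race cancel losers individually while leaving the winner untouched.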

Picker strategies

pick                 Behavior
'first' (default)    First candidate to finish without error
'longest'            Candidate with the most output text
function             Custom picker: receives all results, returns winner id

await speculate({
  candidates: [...],
  request,
  pick: results => {
    // Prefer the first candidate whose output starts like a JSON object ('{').
    const parsed = results.find(r => r.text.trim().startsWith('{'))
    return parsed?.id ?? results[0].id
  },
})
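
A custom picker is a plain function from results to a winner id, so it can be written and unit-tested on its own. The sketch below is one possible stricter picker: it parses the text instead of checking the first character, and falls back to the longest output. `CandidateResult` is an assumed minimal shape with only the fields used here, and treating a missing `error` as success is also an assumption.

```typescript
// Assumed minimal shape of one candidate result (hypothetical; only the fields used here).
interface CandidateResult {
  id: string
  text: string
  error?: unknown
}

type Picker = (results: CandidateResult[]) => string

// True only when the text parses as a JSON object (not an array or a scalar).
function isJsonObject(text: string): boolean {
  try {
    const value: unknown = JSON.parse(text)
    return typeof value === 'object' && value !== null && !Array.isArray(value)
  } catch {
    return false
  }
}

// Prefer valid JSON; otherwise take the longest output among results without an error.
// Assumes `results` contains at least one entry.
const preferJsonThenLongest: Picker = results => {
  const ok = results.filter(r => r.error === undefined)
  const pool = ok.length > 0 ? ok : results
  const json = pool.find(r => isJsonObject(r.text))
  if (json) return json.id
  return pool.reduce((best, r) => (r.text.length > best.text.length ? r : best)).id
}
```

A function like this could be passed as `pick`, assuming the results handed to the picker carry at least `id` and `text` as the result shape suggests.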

Timeout

Bound each candidate with timeoutMs. A candidate that times out is aborted and marked with an error, but doesn't fail the whole run as long as another candidate succeeds.

await speculate({
  candidates: [...],
  request,
  timeoutMs: 5_000,
})
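
The timeout behaviour described above (abort the candidate, record an error, keep the run alive) can be sketched without the library as follows. This only illustrates the pattern, not the runtime's actual code. `runWithTimeout`, `delayedTask`, and the `Settled` shape are hypothetical.

```typescript
// Hypothetical outcome record: a timeout becomes data rather than a thrown error.
interface Settled {
  id: string
  text?: string
  error?: Error
  aborted: boolean
}

type Task = (signal: AbortSignal) => Promise<string>

// Hypothetical stand-in for an adapter call that honours its AbortSignal.
function delayedTask(text: string, ms: number): Task {
  return signal =>
    new Promise<string>((resolve, reject) => {
      const timer = setTimeout(() => resolve(text), ms)
      signal.addEventListener('abort', () => {
        clearTimeout(timer)
        reject(new Error('aborted'))
      }, { once: true })
    })
}

// Runs one task against a deadline. On timeout the task is aborted and the
// returned record carries the error, so other candidates can still win.
async function runWithTimeout(id: string, task: Task, timeoutMs: number): Promise<Settled> {
  const controller = new AbortController()
  let timer: ReturnType<typeof setTimeout> | undefined
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      // Reject first so the timeout error, not the abort error, is reported.
      reject(new Error(`${id} timed out after ${timeoutMs}ms`))
      controller.abort()
    }, timeoutMs)
  })
  try {
    const text = await Promise.race([task(controller.signal), deadline])
    return { id, text, aborted: false }
  } catch (err) {
    const error = err instanceof Error ? err : new Error(String(err))
    return { id, error, aborted: controller.signal.aborted }
  } finally {
    clearTimeout(timer)
  }
}
```

Because the function never throws, a caller can collect one `Settled` record per candidate and fail the run only when every record has an error.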

Opt out of aborting a loser

abortOnLoser: false keeps a candidate running to completion even after it's declared the loser — useful when you want to record all variants for offline analysis.

{ id: 'sonnet', adapter: sonnet, abortOnLoser: false }
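
One way to picture the per-candidate flag: once a winner is known, only losers that have not opted out get their abort signal fired. The sketch below is illustrative and is not the runtime's code. The `Entry` shape and `abortLosers` are hypothetical, and treating an omitted flag as `true` is an assumption inferred from this being an opt-out.

```typescript
// Hypothetical bookkeeping for one running candidate.
interface Entry {
  id: string
  controller: AbortController
  abortOnLoser?: boolean // assumed to behave as true when omitted
}

// Aborts every loser that has not opted out and returns the ids that were aborted.
function abortLosers(entries: Entry[], winnerId: string): string[] {
  const aborted: string[] = []
  for (const entry of entries) {
    if (entry.id === winnerId) continue // never abort the winner
    if (entry.abortOnLoser === false) continue // opted out: let it run to completion
    entry.controller.abort()
    aborted.push(entry.id)
  }
  return aborted
}
```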

Result shape

{
  winner: { id, text, chunks, latencyMs, error?, aborted? },
  losers: SpeculativeResult[],
  all: SpeculativeResult[],
}
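
For readers who want types to work against, the shape above could be written as TypeScript interfaces roughly like this. The field names come from the shape shown, but the field types are assumptions, as are the `SpeculateOutcome` name and the `describeOutcome` helper.

```typescript
// Field names follow the documented shape; the types are assumed, not confirmed.
interface SpeculativeResult {
  id: string
  text: string
  chunks: unknown[] // chunk type is not specified in the source
  latencyMs: number
  error?: unknown
  aborted?: boolean
}

// Hypothetical name for the object returned by a speculative run.
interface SpeculateOutcome {
  winner: SpeculativeResult
  losers: SpeculativeResult[]
  all: SpeculativeResult[]
}

// Builds one human-readable line per candidate, e.g. for logging.
function describeOutcome(outcome: SpeculateOutcome): string[] {
  return outcome.all.map(r => {
    let status = 'finished'
    if (r.id === outcome.winner.id) status = 'winner'
    else if (r.aborted) status = 'aborted'
    else if (r.error !== undefined) status = 'failed'
    return `${r.id}: ${status} in ${r.latencyMs}ms`
  })
}
```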
