Deterministic replay

Record LLM responses to a cassette file, then replay them in CI without network calls for fast, deterministic tests.

Non-determinism and API latency make agent tests slow and flaky. createRecordingAdapter captures every response to a cassette file on first run; createReplayAdapter replays those responses in subsequent runs: same output, no network, sub-millisecond per call.
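To make the record-then-replay idea concrete, here is a minimal, self-contained sketch of the pattern. It is not the implementation behind createRecordingAdapter or createReplayAdapter: the Adapter type, the Entry shape, the in-memory cassette array, and the strict in-order prompt matching are all assumptions made for illustration.

```typescript
// Hypothetical adapter shape for illustration; not AgentsKit's real interface.
type Adapter = { complete(prompt: string): Promise<string> }

// Assumed cassette entry: one JSON object per line (JSONL).
type Entry = { prompt: string; output: string }

// Wraps an adapter and appends every prompt/response pair to `lines`.
function recording(inner: Adapter, lines: string[]): Adapter {
  return {
    async complete(prompt: string): Promise<string> {
      const output = await inner.complete(prompt)
      const entry: Entry = { prompt, output }
      lines.push(JSON.stringify(entry))
      return output
    },
  }
}

// Serves recorded outputs in call order and never invokes a model.
function replaying(lines: readonly string[]): Adapter {
  const entries = lines.map((line) => JSON.parse(line) as Entry)
  let cursor = 0
  return {
    async complete(prompt: string): Promise<string> {
      const entry = entries[cursor]
      if (entry === undefined) throw new Error(`cassette exhausted at call ${cursor}`)
      if (entry.prompt !== prompt) throw new Error(`prompt mismatch at call ${cursor}`)
      cursor += 1
      return entry.output
    },
  }
}

// Records one call against a fake model, then replays it without the model.
async function demo(): Promise<{ replayed: string; modelCalls: number }> {
  let modelCalls = 0
  const fakeModel: Adapter = {
    async complete(prompt: string): Promise<string> {
      modelCalls += 1
      return `echo:${prompt}`
    },
  }
  const cassette: string[] = [] // stands in for the .jsonl file on disk
  await recording(fakeModel, cassette).complete('hello')
  const replayed = await replaying(cassette).complete('hello')
  return { replayed, modelCalls }
}
```

Failing loudly on a prompt mismatch is one possible design choice: it surfaces a stale cassette as a test failure instead of silently returning an answer to a different question.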

#Record

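import { createRecordingAdapter } from '@agentskit/eval'

// `openai` (the provider adapter factory) and `apiKey` are assumed to be in scope.
const rec = createRecordingAdapter({
  inner: openai({ apiKey }), // the live adapter whose responses are recorded
  cassettePath: '.agentskit/cassettes/triage.jsonl', // where responses are written
})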

Run your suite once with rec: every call captured.

#Replay

import { createReplayAdapter } from '@agentskit/eval'

const replay = createReplayAdapter({
  cassettePath: '.agentskit/cassettes/triage.jsonl',
})

Use replay in CI: zero network, deterministic.
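One way to decide between the two adapters in a test setup is a small mode switch. The helper below is a hypothetical sketch, not part of @agentskit/eval: pickMode and the RECORD_CASSETTES variable are invented names, and the rule that CI refuses to run without a cassette is an assumed policy.

```typescript
type Mode = 'record' | 'replay'

// Hypothetical helper: decides which adapter a test run should use.
// `RECORD_CASSETTES` is an invented variable name, not an AgentsKit setting.
function pickMode(env: Record<string, string | undefined>, cassetteExists: boolean): Mode {
  const ci = env['CI']
  const inCi = ci !== undefined && ci !== '' && ci !== 'false'
  if (inCi) {
    // CI must never reach the network, so a missing cassette is an error.
    if (!cassetteExists) throw new Error('CI requires a committed cassette')
    return 'replay'
  }
  // Locally, allow an explicit re-record; otherwise record only when nothing exists yet.
  if (env['RECORD_CASSETTES'] === '1') return 'record'
  return cassetteExists ? 'replay' : 'record'
}
```

The returned mode would then select between the rec and replay adapters shown above.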

#Time travel

import { createTimeTravelSession } from '@agentskit/eval'

// `cassettePath` and `step` are assumed to be defined by the caller.
const session = createTimeTravelSession({ cassettePath })
session.rewindTo(step)
session.override(step, { output: 'alternate response' })
const forked = session.fork()
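The snippet above does not spell out what rewinding, overriding, and forking do to the recorded steps. The following self-contained sketch shows one plausible reading, under stated assumptions: steps are a zero-indexed array, rewindTo only moves a cursor, override swaps the recorded output at one step, and fork returns an independent copy. createSketchSession and its types are hypothetical names and do not describe the behavior of createTimeTravelSession.

```typescript
type Step = { input: string; output: string }

// Hypothetical session shape, modeled loosely on the three calls shown above.
interface SketchSession {
  readonly steps: readonly Step[]
  readonly cursor: number
  rewindTo(step: number): void
  override(step: number, patch: { output: string }): void
  fork(): SketchSession
}

function createSketchSession(recorded: readonly Step[], cursor: number = 0): SketchSession {
  // Copy every step so this session shares no mutable state with its source.
  let steps: Step[] = recorded.map((s) => ({ ...s }))
  let position = cursor
  const assertInRange = (step: number): void => {
    if (!Number.isInteger(step) || step < 0 || step >= steps.length) {
      throw new RangeError(`step ${step} is outside 0..${steps.length - 1}`)
    }
  }
  return {
    get steps(): readonly Step[] { return steps },
    get cursor(): number { return position },
    rewindTo(step: number): void {
      assertInRange(step)
      position = step
    },
    override(step: number, patch: { output: string }): void {
      assertInRange(step)
      steps = steps.map((s, i) => (i === step ? { ...s, output: patch.output } : s))
    },
    fork(): SketchSession {
      return createSketchSession(steps, position)
    },
  }
}

// Rewind to the second step, fork, and change the fork only.
const original = createSketchSession([
  { input: 'classify ticket', output: 'billing' },
  { input: 'draft reply', output: 'Refund issued.' },
])
original.rewindTo(1)
const branch = original.fork()
branch.override(1, { output: 'alternate response' })
```

Because fork copies the steps, the override on branch leaves original untouched, which is what makes "what if the model had answered differently here" experiments safe to run side by side.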

#Replay against a different model

import { replayAgainst } from '@agentskit/eval'

const diff = await replayAgainst({
  cassettePath,
  adapter: anthropic(...),
})
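The shape of the diff returned by replayAgainst is not shown here. As an illustration of the underlying comparison only, the sketch below diffs recorded outputs against outputs collected from a candidate model for the same inputs. The Recorded and Change types and the diffOutputs function are hypothetical, and exact string equality is an assumed comparison rule; outputs from different models rarely match verbatim, so a real comparison may need something looser.

```typescript
type Recorded = { input: string; output: string }
type Change = { index: number; input: string; recorded: string; candidate: string }

// Compares recorded outputs with candidate outputs, position by position.
// Collecting `candidateOutputs` (one model call per recorded input) is left out.
function diffOutputs(entries: readonly Recorded[], candidateOutputs: readonly string[]): Change[] {
  if (entries.length !== candidateOutputs.length) {
    throw new Error(`expected ${entries.length} candidate outputs, got ${candidateOutputs.length}`)
  }
  const changes: Change[] = []
  entries.forEach((entry, index) => {
    const candidate = candidateOutputs[index]
    if (candidate !== undefined && candidate !== entry.output) {
      changes.push({ index, input: entry.input, recorded: entry.output, candidate })
    }
  })
  return changes
}

const changes = diffOutputs(
  [
    { input: 'classify ticket', output: 'billing' },
    { input: 'draft reply', output: 'Refund issued.' },
  ],
  ['billing', 'Your refund is on its way.'],
)
```

Here changes holds a single entry for index 1, since only the second output differs.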
