whisper

import { whisper } from '@agentskit/tools'

// createRuntime and adapter are assumed to come from your existing runtime setup
const runtime = createRuntime({
  adapter,
  tools: [...whisper({ apiKey: process.env.OPENAI_API_KEY! })],
})

#Sub-tools

| Name | Purpose |
| --- | --- |
| `whisperTranscribe` | Transcribe an audio buffer → text + segments |

Bundled: `whisper(config)` returns every sub-tool, ready to spread into `tools`.
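Since `whisperTranscribe` is described as returning text plus segments, a small sketch of consuming segments may help. The `Segment` and `Transcript` shapes below are assumptions modeled on the `start`/`end`/`text` fields of OpenAI's verbose transcription output; the tool's actual result type is not documented here and may differ. `formatTimestamp` and `toTimestampedLines` are hypothetical helpers.

```typescript
// Assumed result shape (not confirmed by this page): times are in seconds.
type Segment = { start: number; end: number; text: string }
type Transcript = { text: string; segments: Segment[] }

/** Format a non-negative number of seconds as mm:ss. */
function formatTimestamp(seconds: number): string {
  const total = Math.floor(seconds)
  const mm = String(Math.floor(total / 60)).padStart(2, '0')
  const ss = String(total % 60).padStart(2, '0')
  return `${mm}:${ss}`
}

/** Render one "[mm:ss] text" line per segment, keyed on the segment start time. */
function toTimestampedLines(transcript: Transcript): string[] {
  return transcript.segments.map(
    (segment) => `[${formatTimestamp(segment.start)}] ${segment.text.trim()}`,
  )
}
```

Timestamped lines like these are one convenient input when asking a model to draft notes that cite when something was said.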

#Config

type WhisperConfig = {
  apiKey: string
  model?: 'whisper-1' | 'gpt-4o-transcribe' | 'gpt-4o-mini-transcribe'
  defaultLanguage?: string    // ISO-639-1 hint
  fetch?: typeof fetch          // provider upload transport
  fetchUntrusted?: typeof fetch // explicit policy transport for audio URLs
}
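
As a rough illustration of the two transport fields, the sketch below fills in a config whose `fetch` is wrapped to record upload requests, while `fetchUntrusted` is left unset. `WhisperConfig` is copied from the type above so the snippet stands alone; `withRequestLog` is a hypothetical helper, not part of `@agentskit/tools`, and the sketch assumes a runtime with global `fetch` and `Request` (for example Node 18+).

```typescript
// Copied from the reference above so the sketch type-checks on its own.
type WhisperConfig = {
  apiKey: string
  model?: 'whisper-1' | 'gpt-4o-transcribe' | 'gpt-4o-mini-transcribe'
  defaultLanguage?: string
  fetch?: typeof fetch
  fetchUntrusted?: typeof fetch
}

// Hypothetical helper: wraps a fetch implementation and records each request URL
// before delegating to the wrapped implementation unchanged.
function withRequestLog(base: typeof fetch, seen: string[]): typeof fetch {
  return (input, init) => {
    seen.push(input instanceof Request ? input.url : String(input))
    return base(input, init)
  }
}

const uploads: string[] = []

const config: WhisperConfig = {
  apiKey: process.env.OPENAI_API_KEY ?? '',
  model: 'gpt-4o-mini-transcribe',
  defaultLanguage: 'en', // ISO-639-1 hint
  fetch: withRequestLog(fetch, uploads), // affects the provider upload only
  // fetchUntrusted is intentionally omitted so audio URLs keep the default transport
}
```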

Audio URLs are model-controlled, so they are fetched through `safeFetch` by default. Supplying `fetch` customizes only the provider upload; it does not disable the SSRF gate. Override `fetchUntrusted` only with a transport that enforces an equivalent egress policy.
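
To make "equivalent egress policy" more concrete, here is a minimal sketch of the kind of URL check such a transport might run before fetching. It is not the `safeFetch` implementation; `isPrivateIPv4` and `isAllowedAudioUrl` are hypothetical names, and the host allowlist is an assumption. It inspects only the URL literal, so a real policy would also need to cover DNS resolution, redirects, and IPv6.

```typescript
/** True when an IPv4 literal is in an unspecified, loopback, private, or link-local range. */
function isPrivateIPv4(host: string): boolean {
  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host)
  if (!match) return false // not an IPv4 literal
  const a = Number(match[1])
  const b = Number(match[2])
  return (
    a === 0 ||                           // 0.0.0.0/8
    a === 10 ||                          // 10.0.0.0/8
    a === 127 ||                         // loopback
    (a === 169 && b === 254) ||          // link-local, includes cloud metadata endpoints
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168)             // 192.168.0.0/16
  )
}

/** Decide whether a model-supplied audio URL may be fetched (hypothetical policy). */
function isAllowedAudioUrl(raw: string, allowedHosts: ReadonlySet<string>): boolean {
  let url: URL
  try {
    url = new URL(raw)
  } catch {
    return false // not a parseable absolute URL
  }
  if (url.protocol !== 'https:') return false
  if (isPrivateIPv4(url.hostname)) return false
  return allowedHosts.has(url.hostname)
}
```

A custom `fetchUntrusted` could call a check like this and refuse the request when it returns `false`; the allowlist keeps the model from steering downloads toward arbitrary hosts.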

#Example — meeting notes agent

// s3, client, and apiKey are assumed to be defined elsewhere in your setup
const runtime = createRuntime({
  adapter,
  tools: [
    ...s3({ client, bucket: 'recordings' }),
    ...whisper({ apiKey }),
  ],
})

await runtime.run('Transcribe the latest recording from S3 and draft meeting notes with action items.')

#Comparison

| Tool | Latency | Cost | Speaker diarization |
| --- | --- | --- | --- |
| whisper | medium | low | no |
| deepgram | low (realtime) | medium | yes |

Related: Integrations overview · elevenlabs (TTS pair).
