
whisper

OpenAI Whisper: speech-to-text for audio transcription. Supports 99 languages.

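import { whisper } from '@agentskit/tools'

// createRuntime and adapter are assumed to be imported and configured elsewhere.
const runtime = createRuntime({
  adapter,
  // whisper() returns its sub-tools, spread here into the runtime's tool list
  tools: [...whisper({ apiKey: process.env.OPENAI_API_KEY! })],
})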

Sub-tools

| Name | Purpose |
| --- | --- |
| whisperTranscribe | Transcribe an audio buffer → text + segments |

Bundled: whisper(config) returns every sub-tool, ready to spread into tools.
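
To make "audio buffer → text + segments" concrete, here is a minimal sketch of the kind of request such a tool could send to OpenAI's transcription endpoint. It is not the @agentskit/tools implementation: the names buildTranscriptionRequest, transcribe, TranscribeOptions, Segment and Transcript are hypothetical. It also assumes the verbose_json response format, which returns timestamped segments for whisper-1 but may not be available for the gpt-4o transcription models.

```typescript
// Hypothetical sketch, not the library's code: one way to turn an audio
// buffer into text + segments with OpenAI's transcription endpoint.
type TranscribeOptions = {
  apiKey: string
  model?: string
  language?: string       // ISO-639-1 hint, e.g. 'en'
  fetch?: typeof fetch    // injectable for tests or proxies
}

type Segment = { start: number; end: number; text: string }
type Transcript = { text: string; segments: Segment[] }

// Build the multipart request without sending it, so it can be inspected.
function buildTranscriptionRequest(
  audio: ArrayBuffer,
  filename: string,
  opts: TranscribeOptions,
) {
  const form = new FormData()
  form.append('file', new Blob([audio]), filename)
  form.append('model', opts.model ?? 'whisper-1')
  // Assumption: verbose_json is what yields the timestamped segments.
  form.append('response_format', 'verbose_json')
  if (opts.language) form.append('language', opts.language)
  return {
    url: 'https://api.openai.com/v1/audio/transcriptions',
    init: {
      method: 'POST',
      headers: { Authorization: `Bearer ${opts.apiKey}` },
      body: form,
    },
  }
}

// Send the request and normalise the response to { text, segments }.
async function transcribe(
  audio: ArrayBuffer,
  filename: string,
  opts: TranscribeOptions,
): Promise<Transcript> {
  const doFetch = opts.fetch ?? fetch
  const { url, init } = buildTranscriptionRequest(audio, filename, opts)
  const res = await doFetch(url, init)
  if (!res.ok) throw new Error(`Transcription failed with status ${res.status}`)
  const data = (await res.json()) as { text: string; segments?: Segment[] }
  // Responses without segments fall back to an empty list.
  return { text: data.text, segments: data.segments ?? [] }
}
```

Splitting request construction from the network call keeps the sketch easy to test: passing a stub through opts.fetch exercises transcribe without touching the network.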

Config

type WhisperConfig = {
  apiKey: string
  model?: 'whisper-1' | 'gpt-4o-transcribe' | 'gpt-4o-mini-transcribe'
  defaultLanguage?: string    // ISO-639-1 hint
  fetch?: typeof fetch
}
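
As an illustration of how these options might be combined, the sketch below fills in every field and uses the fetch option to apply a request timeout. The helpers fetchWithTimeout and makeConfig are hypothetical, and the WhisperConfig type is copied from above so the snippet stands alone. The sketch also assumes the tool routes all its HTTP calls through the supplied fetch.

```typescript
// Copied from the Config section so this sketch is self-contained.
type WhisperConfig = {
  apiKey: string
  model?: 'whisper-1' | 'gpt-4o-transcribe' | 'gpt-4o-mini-transcribe'
  defaultLanguage?: string    // ISO-639-1 hint
  fetch?: typeof fetch
}

// Hypothetical helper: wrap a fetch so every request aborts after `ms`.
// Note that it replaces any signal already present on the request init.
function fetchWithTimeout(ms: number, base: typeof fetch = fetch): typeof fetch {
  return (input, init) =>
    base(input, { ...init, signal: AbortSignal.timeout(ms) })
}

// Hypothetical helper: build a full config from an API key.
function makeConfig(apiKey: string, base: typeof fetch = fetch): WhisperConfig {
  return {
    apiKey,
    model: 'gpt-4o-mini-transcribe',
    defaultLanguage: 'en',              // hint only
    fetch: fetchWithTimeout(30_000, base),
  }
}
```

The resulting object can then be passed as whisper(makeConfig(apiKey)). The same hook is a natural place for a stub in tests or for a proxy-aware fetch.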

Example — meeting notes agent

// Assumes the s3 tool, an S3 client, and apiKey are already in scope.
const runtime = createRuntime({
  adapter,
  tools: [
    ...s3({ client, bucket: 'recordings' }),
    ...whisper({ apiKey }),
  ],
})

await runtime.run('Transcribe the latest recording from S3 and draft meeting notes with action items.')

Comparison

| Tool | Latency | Cost | Speaker diarization |
| --- | --- | --- | --- |
| whisper | medium | low | no |
| deepgram | low (realtime) | medium | yes |