groq

Groq: ultra-low-latency inference on custom LPU hardware. Serves Llama, Mixtral, and Gemma models.

import { groq } from '@agentskit/adapters'

const adapter = groq({
  apiKey: process.env.GROQ_API_KEY!,
  model: 'llama-3.3-70b-versatile',
})

#Options

Option	Type	Default
`apiKey`	`string`	required
`model`	`string`	`llama-3.3-70b-versatile`
`baseUrl`	`string`	`https://api.groq.com/openai/v1`
`retry`	`RetryOptions`	inherited
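As a rough illustration of how the defaults in the table combine with caller-supplied options, here is a minimal sketch. The names `GroqOptionsSketch`, `GROQ_DEFAULTS`, and `resolveGroqOptions` are hypothetical and are not part of `@agentskit/adapters`. `retry` is left out because the shape of `RetryOptions` is not described on this page.

```typescript
// Hypothetical mirror of the option table above; not the adapter's real types.
interface GroqOptionsSketch {
  apiKey: string
  model?: string
  baseUrl?: string
}

// Default values copied from the table.
const GROQ_DEFAULTS = {
  model: 'llama-3.3-70b-versatile',
  baseUrl: 'https://api.groq.com/openai/v1',
} as const

// Fill in any option the caller omitted; reject a missing or empty apiKey.
function resolveGroqOptions(options: GroqOptionsSketch): Required<GroqOptionsSketch> {
  if (!options.apiKey) {
    throw new Error('apiKey is required')
  }
  return {
    apiKey: options.apiKey,
    model: options.model ?? GROQ_DEFAULTS.model,
    baseUrl: options.baseUrl ?? GROQ_DEFAULTS.baseUrl,
  }
}
```

Under these assumptions, passing only `apiKey` yields the default model and base URL, while any explicitly supplied field takes precedence.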

#Capabilities

The adapter reports { streaming: true, tools: true, usage: true }. Groq exposes a strict OpenAI-compatible API surface, so requests take the same shape as those sent by openai({ baseUrl }).
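To make the OpenAI-compatible request shape concrete, the sketch below builds a chat-completions request without sending it. `buildChatRequest` and its types are hypothetical helpers written for this illustration, not the adapter's internals, and the API keys are placeholders. It assumes the standard OpenAI-style `/chat/completions` path under the base URL.

```typescript
interface ChatMessage {
  role: 'system' | 'user' | 'assistant'
  content: string
}

interface ChatRequest {
  url: string
  init: { method: 'POST'; headers: Record<string, string>; body: string }
}

// Hypothetical helper: describes an OpenAI-style chat request for any compatible host.
function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): ChatRequest {
  return {
    // Strip trailing slashes so the path joins cleanly.
    url: `${baseUrl.replace(/\/+$/, '')}/chat/completions`,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ model, messages, stream: true }),
    },
  }
}

const messages: ChatMessage[] = [{ role: 'user', content: 'Hello' }]

// Same builder, two hosts: only the base URL, key, and model differ.
const groqRequest = buildChatRequest(
  'https://api.groq.com/openai/v1',
  'gsk_example', // placeholder key
  'llama-3.3-70b-versatile',
  messages,
)
const openaiRequest = buildChatRequest(
  'https://api.openai.com/v1',
  'sk_example', // placeholder key
  'gpt-4o-mini',
  messages,
)
```

The point of the comparison is that swapping hosts changes the URL and credentials, while headers and body layout stay the same.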

#Why groq

Sub-100 ms first-token latency, which suits realtime voice and chat.
OpenAI-compatible API surface.

#Env

Var	Purpose
`GROQ_API_KEY`	API key
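The quick-start snippet uses a non-null assertion (`process.env.GROQ_API_KEY!`), which silences the type checker but does not catch a missing key at runtime. One possible alternative is a small guard that fails fast with a clear message. `requireEnv` is a hypothetical helper, not something the adapter package exports, and the environment object here is a stand-in for `process.env`.

```typescript
// Hypothetical helper: return the variable's value or throw a descriptive error.
function requireEnv(env: Record<string, string | undefined>, name: string): string {
  const value = env[name]
  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`)
  }
  return value
}

// Stand-in for process.env so the sketch runs anywhere; the key is a placeholder.
const exampleEnv: Record<string, string | undefined> = { GROQ_API_KEY: 'gsk_example' }
const apiKey = requireEnv(exampleEnv, 'GROQ_API_KEY')
```

In an application, `requireEnv(process.env, 'GROQ_API_KEY')` could then be passed as the `apiKey` option in place of the asserted value.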

#Related

Providers overview · voice mode component (issue #479)

Explore nearby

Providers: 25 native chat and embedder adapters, plus higher-order adapters that compose candidates. Separate from the 140-provider models catalog.
Choosing an adapter: capability decision table and rules of thumb for picking a chat adapter.
Hosted chat adapters: 17 managed-LLM adapters. Same contract; swap by changing one import.
