# Choosing an adapter

Capability decision table and rules of thumb for picking a chat adapter.
Every chat adapter implements the same `AdapterFactory` contract, so swapping one out is a one-line change. This page exists to answer the only question that actually matters: *which one should I start with?*
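To make "one-line change" concrete, here is a minimal sketch of the idea. The `ChatAdapter` shape, the stub factories, and their option objects below are illustrative stand-ins for the real `AdapterFactory` contract, not AgentsKit's actual signatures:

```typescript
// Illustrative sketch only: the real AdapterFactory signature may differ.
// The point is structural: every factory returns the same interface,
// so only the factory call site changes when you swap providers.
interface ChatAdapter {
  name: string;
  send(prompt: string): Promise<string>;
}

type AdapterFactory = (opts: { model: string }) => ChatAdapter;

// Stand-ins for the real `openai` / `anthropic` factories.
const openai: AdapterFactory = ({ model }) => ({
  name: `openai:${model}`,
  send: async (prompt) => `[openai:${model}] ${prompt}`,
});
const anthropic: AdapterFactory = ({ model }) => ({
  name: `anthropic:${model}`,
  send: async (prompt) => `[anthropic:${model}] ${prompt}`,
});

// Swapping providers is this one line:
const adapter = openai({ model: "gpt-4o-mini" });
// const adapter = anthropic({ model: "claude-sonnet-4-6" });
```

Because the rest of the agent only ever sees `ChatAdapter`, nothing downstream changes when the factory call does.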
## TL;DR

- Want the safest default? → `openai` with `gpt-4o-mini`.
- Want the strongest tool use? → `anthropic` with `claude-sonnet-4-6`.
- Want it free / local? → `ollama`.
- Already running on AWS / GCP / Azure? → use the corresponding adapter (`bedrock` once it ships, `vertex`, `azureOpenAI`).
- Want to defer the decision? → wrap candidates in `createRouter` or `createFallbackAdapter`.
## Capability matrix
| Adapter | Streaming | Tools | Multi-modal | Reasoning | Usage | Self-hosted | Cost tier |
|---|---|---|---|---|---|---|---|
| `openai` | ✅ | ✅ | ✅ (gpt-4 / o*) | ✅ (o1 / o3) | ✅ | ❌ | $$ |
| `anthropic` | ✅ | ✅ | ✅ | ✅ (sonnet / opus) | ✅ | ❌ | $$$ |
| `gemini` | ✅ | ✅ | ✅ | ⚠️ model-dep. | ✅ | ❌ | $$ |
| `grok` | ✅ | ✅ | ⚠️ | ❌ | ✅ | ❌ | $$ |
| `deepseek` | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | $ |
| `kimi` | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ | $ |
| `mistral` | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ | $$ |
| `cohere` | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ | $$ |
| `groq` | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ | $ |
| `together` | ✅ | ✅ | ⚠️ | ❌ | ✅ | ❌ | $ |
| `fireworks` | ✅ | ✅ | ⚠️ | ❌ | ✅ | ❌ | $ |
| `openrouter` | ✅ | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ❌ | $–$$$ |
| `huggingface` | ✅ | ✅ | ⚠️ | ❌ | ✅ | ❌ | $ |
| `ollama` | ✅ | ⚠️ model-dep. | ⚠️ (llava) | ❌ | ✅ | ✅ | free |
| `lmstudio` | ✅ | ⚠️ | ❌ | ❌ | ✅ | ✅ | free |
| `vllm` | ✅ | ⚠️ | ❌ | ❌ | ✅ | ✅ | free |
| `llamacpp` | ✅ | ⚠️ | ❌ | ❌ | ✅ | ✅ | free |
| `langchain` | ✅ | passthrough | passthrough | passthrough | passthrough | passthrough | passthrough |
| `langgraph` | ✅ | passthrough | passthrough | passthrough | passthrough | passthrough | passthrough |
| `vercelAI` | ✅ | passthrough | passthrough | passthrough | passthrough | passthrough | passthrough |
| `generic` | ✅ | bring-your-own | bring-your-own | bring-your-own | bring-your-own | bring-your-own | n/a |
Legend: ✅ supported · ⚠️ depends on model / config · ❌ not supported · passthrough = inherits whatever the wrapped runtime exposes. Cost tiers are relative ranges, not contractual prices; consult the provider for current rates.
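When the choice has to be made programmatically, the matrix can be transcribed into data and filtered. The record shape below is an assumption for illustration (not an AgentsKit type), and the rows copy a few entries from the table above:

```typescript
// Hypothetical capability record; the rows transcribe a few entries
// from the capability matrix on this page.
type Tier = "$" | "$$" | "$$$" | "free";
interface Capability {
  tools: boolean | "model-dep";
  multiModal: boolean | "model-dep";
  selfHosted: boolean;
  cost: Tier;
}

const matrix: Record<string, Capability> = {
  openai:    { tools: true,        multiModal: true,        selfHosted: false, cost: "$$" },
  anthropic: { tools: true,        multiModal: true,        selfHosted: false, cost: "$$$" },
  deepseek:  { tools: true,        multiModal: false,       selfHosted: false, cost: "$" },
  ollama:    { tools: "model-dep", multiModal: "model-dep", selfHosted: true,  cost: "free" },
};

// "Which adapters can run fully on my own hardware?"
const local = Object.keys(matrix).filter((k) => matrix[k].selfHosted);

// "Which adapters have unconditional tool support at the $ or $$ tier?"
const cheapTools = Object.keys(matrix).filter((k) => {
  const c = matrix[k];
  return c.tools === true && (c.cost === "$" || c.cost === "$$");
});
```

The same data shape feeds naturally into a router predicate (see the higher-order adapters at the end of this page).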
## When to pick which
### openai
The path of least resistance. Best reasoning models (o1, o3), best multi-modal coverage on gpt-4o, and the most stable tool-use semantics. Pick this first if you have no constraints.
### anthropic
Strongest tool use and the most useful reasoning trace today. Pick this when the agent has to chain non-trivial tools, or when output discipline (instruction-following on long prompts) matters more than raw speed.
### gemini
Cheapest first-class multi-modal: long context, native image / audio / video understanding. Pick this when the agent has to read long documents or non-text inputs.
### grok
Useful when you want the X-flavored knowledge graph or low-latency chat from xAI. Capabilities are narrower than OpenAI / Anthropic; verify your specific use case before committing.
### deepseek
Strong reasoning at a low price. Pick this when budget is the binding constraint and you can tolerate occasional latency spikes from the hosted endpoint.
### kimi
Long-context Chinese-leaning model from Moonshot. Pick this for Chinese-first agents or very long-context summarization.
### mistral
Balanced cost / quality with a European data-residency story. Pick this when residency matters more than raw capability.
### cohere
Strong on retrieval + RAG-flavored workloads. Pick this when paired with Cohere's rerankers, or when "command" models hit the right cost / quality point for your use case.
### groq
Fastest first-token latency on Llama / Mixtral, served on LPUs. Pick this when latency dominates UX: voice mode, autocomplete, anything sub-second perceived response.
### together / fireworks
OpenAI-compatible aggregators of open-weight models. Pick these when you want a wide menu of open models without running infra yourself.
### openrouter
Single key, hundreds of models. Pick this for prototyping or for fallback chains that span providers, but verify capabilities per model, since the long tail varies.
### huggingface
Hosted inference for the Hub. Pick this for niche or fine-tuned models that aren't on the big-name aggregators yet.
### ollama / lmstudio / vllm / llamacpp
Local / self-hosted runtimes. Pick these when data must not leave your hardware, when offline is a hard requirement, or for cost-zero development. Tool-use support varies by model β verify before committing.
### langchain / langgraph
Drop-in adapters for existing LangChain Runnable / LangGraph compiled graphs. Pick these when migrating an existing LangChain codebase incrementally rather than rewriting it.
### vercelAI
Bridge to a Vercel AI SDK route handler. Pick this when your existing app already streams via the Vercel AI SDK and you want AgentsKit on top without changing the route.
### generic
Bring your own ReadableStream. Pick this when you have a custom backend or a provider that doesn't have a first-party adapter yet.
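The "bring your own ReadableStream" idea can be sketched with the standard Web Streams API. The helper names below are hypothetical; with the real `generic` adapter you would hand the stream itself over rather than draining it yourself:

```typescript
// Sketch of the "bring your own ReadableStream" idea. `generic` is the
// adapter name from this page; the helpers here are hypothetical.
// Any backend that can produce a ReadableStream of text chunks works.
function streamFromChunks(chunks: string[]): ReadableStream<string> {
  return new ReadableStream({
    start(controller) {
      for (const c of chunks) controller.enqueue(c); // emit each chunk
      controller.close();                            // then end the stream
    },
  });
}

// Drain the stream into one string, the way a minimal consumer might.
async function collect(stream: ReadableStream<string>): Promise<string> {
  const reader = stream.getReader();
  let out = "";
  for (;;) {
    const { value, done } = await reader.read();
    if (done) return out;
    out += value;
  }
}
```

In practice the chunks would come from your custom backend's response body instead of an in-memory array; the stream contract is the same either way.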
## Higher-order: don't pick – combine

If "which one?" is hard to answer, you probably want to pick more than one:

- `createRouter` – auto-pick by cost / latency / tags / custom predicate.
- `createFallbackAdapter` – ordered try-next when a candidate fails.
- `createEnsembleAdapter` – fan-out and merge.
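The "ordered try-next" behavior is worth seeing in miniature. `createFallbackAdapter` is named on this page but its real signature is not shown here, so the sketch below re-implements the mechanism with plain functions:

```typescript
// Mechanism sketch only, not the real createFallbackAdapter API:
// try each candidate in order, return the first success, and surface
// the last error only if every candidate fails.
type Send = (prompt: string) => Promise<string>;

function fallback(candidates: Send[]): Send {
  return async (prompt) => {
    let lastError: unknown;
    for (const candidate of candidates) {
      try {
        return await candidate(prompt); // first success wins
      } catch (err) {
        lastError = err; // remember and try the next candidate
      }
    }
    throw lastError; // every candidate failed
  };
}

// A flaky primary and a stable backup:
const flaky: Send = async () => { throw new Error("rate limited"); };
const stable: Send = async (p) => `ok: ${p}`;

const send = fallback([flaky, stable]); // falls through to `stable`
```

A router works the same way except the predicate runs *before* the call (pick by cost / latency / tags), while fallback reacts *after* a failure; the two compose naturally.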