Providers
# llamacpp

Run GGUF models on CPU or GPU with minimal overhead via the llama.cpp server.
```typescript
import { llamacpp } from '@agentskit/adapters'

const adapter = llamacpp({
  url: 'http://localhost:8080',
})
```

## Options
| Option | Type | Default |
|---|---|---|
| `url` | `string` | `http://localhost:8080` |
| `fetch` | `typeof fetch` | global `fetch` |
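The `fetch` option lets you substitute your own `fetch`-compatible function, for example to enforce a request deadline when the local server is busy loading a large model. A minimal sketch of such a wrapper; `withTimeout` is a hypothetical helper, not part of `@agentskit/adapters`:

```typescript
// Wrap a fetch implementation so every request aborts after `ms` milliseconds.
// Hypothetical helper: not provided by @agentskit/adapters.
const withTimeout = (baseFetch: typeof fetch, ms: number): typeof fetch =>
  (input, init) =>
    // AbortSignal.timeout (Node 17.3+/modern browsers) aborts the request
    // once the deadline passes.
    baseFetch(input, { ...init, signal: AbortSignal.timeout(ms) })
```

The wrapped function could then be passed as the adapter's `fetch` option, e.g. `llamacpp({ url: 'http://localhost:8080', fetch: withTimeout(fetch, 30_000) })`.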
## Why llamacpp

- Runs everywhere, including Raspberry Pi and other embedded devices.
- Supports GGUF quantizations from 4-bit to 16-bit.
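The adapter assumes a llama.cpp server is already listening at `url`. One typical way to start it, using llama.cpp's bundled `llama-server` binary (the model path is illustrative, not a file this document ships):

```shell
# Start llama.cpp's HTTP server on the port the adapter defaults to.
# Replace the model path with your own GGUF file.
llama-server -m ./models/model.gguf --port 8080

# With a GPU build, -ngl offloads model layers to the GPU:
#   llama-server -m ./models/model.gguf --port 8080 -ngl 99
```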