
huggingface

Hugging Face Inference Endpoints + Serverless — run any HF-hosted chat model.

import { huggingface } from '@agentskit/adapters'

const adapter = huggingface({
  apiKey: process.env.HF_TOKEN!,
  model: 'meta-llama/Meta-Llama-3-70B-Instruct',
})

Options

| Option  | Type           | Default                                |
|---------|----------------|----------------------------------------|
| apiKey  | string         | required                               |
| model   | string         | required                               |
| baseUrl | string         | https://api-inference.huggingface.co   |
| fetch   | typeof fetch   | global                                 |
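Putting the options together, a sketch of a fully configured adapter. Only the option names come from the table above; the endpoint URL is a placeholder for your own dedicated Inference Endpoint, and the logging wrapper is just one example of a fetch-compatible function:

```typescript
import { huggingface } from '@agentskit/adapters'

const adapter = huggingface({
  apiKey: process.env.HF_TOKEN!,
  model: 'meta-llama/Meta-Llama-3-70B-Instruct',
  // Placeholder: point this at your own dedicated Inference Endpoint.
  baseUrl: 'https://my-endpoint.endpoints.huggingface.cloud',
  // Any fetch-compatible function works, e.g. one that logs requests.
  fetch: (input, init) => {
    console.log('HF request:', input)
    return fetch(input, init)
  },
})
```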

Env

| Var      | Purpose                               |
|----------|---------------------------------------|
| HF_TOKEN | Read token from hf.co/settings/tokens |

Notes

  • The serverless tier has cold starts: the first request after idle time can take tens of seconds while the model loads. Pin a dedicated Inference Endpoint for predictable production latency.
  • To run open weights locally, see ollama · vllm · llamacpp.
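One way to soften cold starts on the serverless tier is a retrying wrapper passed via the `fetch` option, since the serverless API answers 503 while a model is loading. This is a sketch under that assumption; the function name, retry count, and delays are illustrative choices, not library defaults:

```typescript
// Wrap a fetch-compatible function so that 503 "model loading"
// responses are retried with a growing delay before giving up.
function withColdStartRetry(
  baseFetch: typeof fetch,
  retries = 2,
  delayMs = 1000,
): typeof fetch {
  return async (input, init) => {
    for (let attempt = 0; ; attempt++) {
      const res = await baseFetch(input, init)
      // Return anything that isn't a 503, or the last attempt's response.
      if (res.status !== 503 || attempt >= retries) return res
      await new Promise((r) => setTimeout(r, delayMs * (attempt + 1)))
    }
  }
}
```

You would then pass `withColdStartRetry(fetch)` as the adapter's `fetch` option.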
