agentskit.js
Cookbook

Build an Ask-the-docs chat

The exact grounded RAG chat in this site's corner — index your docs, ground a free model, stream cited answers. Copy it, or scaffold it with one command.

This is the recipe for the chat you're using right now: a grounded RAG assistant over your own docs that streams concise, cited answers — running at $0 on the OpenRouter free pool, built entirely on @agentskit/*.

Two ways in: scaffold it with the CLI, or wire it by hand.

#Fastest path — the CLI

npx agentskit add docs-chat

docs-chat is a UI component, not just an agent file — agentskit add now installs it via the component registry flow defined in RFC-0006.

#What the command actually does

Scan — if .agentskit/components.json is absent, the CLI scans your project: UI binding (react, svelte, vue, …), meta-framework (next-app, next-pages, sveltekit, nuxt, remix, tanstack-start, vite, and more), package manager, TypeScript vs JavaScript, monorepo root. Ambiguous signals surface as validation errors — never a silent guess. With components.json committed (written once by agentskit init), the scan is skipped entirely and every add runs non-interactively.

Validate — before anything is written the CLI checks compatibility and refuses with a concrete error plus a pointer to a supported alternative if the detected environment can't satisfy the component's runtimeRequirement or embeddingBackend (for example, onnx-node is blocked on edge or Expo targets). Peer-range conflicts across all resolved dependencies are surfaced in aggregate, not on first hit. Pass --dry-run to see the full plan — files, targets, deps, env, conflicts — without writing anything.

Place — files land in framework-correct locations. The server handler is placed per serverTargetByMeta[metaFramework]: a Next.js App Router route handler (app/api/ask/route.ts), a SvelteKit +server.ts, a Nuxt server/api/*.post.ts, a Remix resource route, and so on. The client component composes the matching @agentskit/* binding. You own every file — edit guardrails, styling tokens, and adapter choices freely.

Record — the installer appends a tamper-evident entry to .agentskit/install-log.jsonl and updates the installed marker in components.json with per-file SHA-256 and the pinned registry ref, enabling agentskit diff docs-chat and agentskit update docs-chat later.

#Safety guarantees

  • Per-file SHA-256 verification — every fetched file is verified against the signed manifest before any write. A mismatch aborts the entire install.
  • Path-containment guardpath.resolve(dest) is asserted to stay inside the target directory on every file, both on the write path and on diff reads. A ../../.env-style path escape is an IntegrityError.
  • Transactional (all-or-nothing) — files are staged in a sibling temp directory, verified, then moved atomically. Any pre-commit failure rolls back all partial writes and reports "rolled back N files." Your tree is never left dirty.
  • Append-only audit log.agentskit/install-log.jsonl chains entries via prevEntryHash (SHA-256 of the prior entry); a future agentskit audit command walks the chain and fails on any gap or mismatch.

#Framework support

The first shipping port is React × Next.js (app router). Other frameworks (sveltekit, nuxt, remix, tanstack-start, angular, expo, ink) are rolling out port-by-port, each gated by the binding stability requirements in RFC-0004. The CLI will refuse to install a port that has not shipped — it will not copy broken source into an unsupported framework.

#Zero-prompts via agentskit init

npx agentskit init        # writes .agentskit/components.json — commit this file
npx agentskit add docs-chat

After init, every subsequent add reads the committed config and runs non-interactively. In CI pass --yes to exit non-zero on any blocker instead of hanging.

#After install

The ready output prints a per-framework usage snippet wiring createAskHandler to your retriever and adapter, the required env vars (written to .env.example), and a "run the indexer before first use" step. The installer copies an indexer (agentskit ask index ./docs) and an empty index stub — it never ships AgentsKit's own corpus.

Point the handler at your retriever and adapter as shown in the sections below, then run the indexer. A full runnable version lives in apps/example-rag-chat — swap the sample docs for yours.

#1. Index your docs (RAG)

Chunk + embed your docs once, into any @agentskit/memory vector store. Embedding stays free + local with an ONNX model:

import { createRAG } from '@agentskit/rag'
import { fileVectorMemory } from '@agentskit/memory'
import { pipeline } from '@huggingface/transformers'

// Local ONNX embedder — $0, no API key.
let extractor: Awaited<ReturnType<typeof pipeline>> | null = null
const embed = async (text: string): Promise<number[]> => {
  extractor ??= await pipeline('feature-extraction', 'Xenova/bge-small-en-v1.5')
  const out = await extractor(text, { pooling: 'mean', normalize: true })
  return Array.from(out.data as Float32Array)
}

const rag = createRAG({ embed, store: fileVectorMemory({ path: './docs-index' }), topK: 6 })
await rag.ingest(docs) // docs: { id, content, metadata: { path, title } }[]

For a committed, read-only index (great for serverless), generate it at build time and ship the JSON — see the docs-site's scripts/gen-ask-index.mjs.

#2. Stream grounded, cited answers

createRAG returns a Retriever — drop it straight into useChat (or createRuntime). Two design choices make it reliable on free models:

  • Co-locate the context with the question (free models attend to recent tokens far better than a long system prompt).
  • Emit citations from what you retrieved — don't depend on a weak model to call a cite tool.
import { useChat } from '@agentskit/react'
import { openrouter, createFallbackAdapter } from '@agentskit/adapters'

const FREE = ['meta-llama/llama-3.3-70b-instruct:free', 'qwen/qwen3-next-80b-a3b-instruct:free']
const adapter = createFallbackAdapter(
  FREE.map((model) => ({ id: model, adapter: openrouter({ apiKey, model }) })),
)

const chat = useChat({
  adapter,
  retriever: rag,
  systemPrompt: 'Answer ONLY from the provided docs. Be concise. Decline + name the nearest page when uncovered. Never mention other frameworks.',
})

createFallbackAdapter cascades across the free pool when one is rate-limited (429). Keep answers short and AgentsKit-specific with a tight system prompt.

#3. The widget (optional, fancy)

The floating chat is a headless, slotted component — open by default, branded logo, animated loading, and a "build this" link, all overridable:

<AskDocsWidget
  logo={<YourMark />}          // header logo slot
  loadingState={<YourSpinner />} // loading slot
  title="Ask our docs"
  docsHref="/docs/cookbook/ask-the-docs"
/>

Generative UI (option buttons, forms, runnable code) activates with ASK_RICH_UI=1 and a capable model — free models stay on reliable markdown text.

#Guardrails (built in)

Before any model call, a cheap triage runs — it saves your free-tier quota on trivia and blocks the obvious attacks:

  • Greetings ("hi", "oi") and noise ("test", empty, gibberish) → an instant canned reply, no LLM.
  • Prompt injection ("ignore previous instructions", "reveal your system prompt", "you are now…", "jailbreak") → a firm decline that never changes role or leaks the prompt — no LLM.
  • Real questions (even one word like "memory") fall through.

It's additive and extensible — keep the defenses, add your own:

import { triageMessage } from './lib/ask/guard'

const triage = triageMessage(userText, {
  greetings: ['salut', 'ciao'],
  noise: ['blah'],
  injectionPatterns: [/give me the raw config/i],
  replies: { greeting: 'Hey! Ask me about our product docs.' },
})
if (triage.kind === 'canned') return stream(triage.reply) // skip the model

This sits on top of the other layers: client system messages are stripped, the retrieved context is fenced as untrusted data, and the grounded prompt keeps answers on-topic — defense-in-depth.

#Going further

  • Durable rate limit — front the route with Upstash (in-memory fallback for dev).
  • Eval it — measure retrieval recall@k / MRR deterministically + an LLM judge; gate in CI.
  • Run code in-browser@agentskit/sandbox/web webWorkerBackend (zero-vendor) powers runnable snippets.

That's the whole thing: index → ground → stream → cite. Grounding quality scales with model quality — start free, bring your own key when you want the rich UI.

Explore nearby

✎ Edit this page on GitHub·Found a problem? Open an issue →·How to contribute →

On this page

Ask the docs

Ask anything about AgentsKit. Answers come from the docs corpus via OpenRouter free-tier models. Rate limited per IP.