Build an Ask-the-docs chat

Recipe for a grounded RAG docs chat using AgentsKit Chat on the AgentsKit foundation. For the product chat framework home, start at chat.agentskit.io.

Ownership. Product chat applications and the shared Ask shell live in AgentsKit Chat. This cookbook is a foundation-side recipe that dogfoods Chat on top of @agentskit/* (adapters, memory, RAG). Prefer the Chat docs when you want the versioned multi-surface application layer.

This is the recipe for the chat you're using right now: a grounded RAG assistant over your own docs that streams concise, cited answers — running at $0 on the OpenRouter free pool, composed with AgentsKit Chat over the AgentsKit foundation.

The production widget dogfoods the public AgentsKit Chat framework. AgentChat owns the canonical message timeline, streaming lifecycle, cancellation, retry/edit/regenerate behavior, persistence through AgentsKit memory, component validation, and accessible rendering diagnostics. The docs host keeps only its corpus branding, Markdown slot, generated local knowledge, and a small adapter that converts the existing Ask NDJSON boundary into ordered assistant content. Citations use the standard source-list contract rather than a site-private citation component. Known exact questions now use the deterministic local answer plane before this RAG path, so they require no backend request.

import { createAssistantContentEncoder } from '@agentskit/chat/protocol'

const content = createAssistantContentEncoder()
yield { type: 'text', content: content.encode({ kind: 'text', text: answerChunk }) }
yield { type: 'text', content: content.encode({
  kind: 'component',
  frame: { componentKey: 'source-list', /* validated sources + fallback */ },
}) }

Every text delta passes through the encoder, so model output cannot manufacture component framing. Unknown tools stay inert, and malformed sources never reach the standard renderer.

Two ways in: scaffold it with the CLI, or wire it by hand.

#Fastest path — the CLI

npx agentskit add docs-chat

docs-chat is a UI component, not just an agent file — agentskit add now installs it via the component registry flow defined in RFC-0006.

#What the command actually does

Scan — if .agentskit/components.json is absent, the CLI scans your project: UI binding (react, svelte, vue, …), meta-framework (next-app, next-pages, sveltekit, nuxt, remix, tanstack-start, vite, and more), package manager, TypeScript vs JavaScript, monorepo root. Ambiguous signals surface as validation errors — never a silent guess. With components.json committed (written once by agentskit init), the scan is skipped entirely and every add runs non-interactively.

Validate — before anything is written the CLI checks compatibility and refuses with a concrete error plus a pointer to a supported alternative if the detected environment can't satisfy the component's runtimeRequirement or embeddingBackend (for example, onnx-node is blocked on edge or Expo targets). Peer-range conflicts across all resolved dependencies are surfaced in aggregate, not on first hit. Pass --dry-run to see the full plan — files, targets, deps, env, conflicts — without writing anything.

Place — files land in framework-correct locations. The server handler is placed per serverTargetByMeta[metaFramework]: a Next.js App Router route handler (app/api/ask/route.ts), a SvelteKit +server.ts, a Nuxt server/api/*.post.ts, a Remix resource route, and so on. The client component composes the matching @agentskit/* binding. You own every file — edit guardrails, styling tokens, and adapter choices freely.

Record — the installer appends a tamper-evident entry to .agentskit/install-log.jsonl and updates the installed marker in components.json with per-file SHA-256 and the pinned registry ref, enabling agentskit diff docs-chat and agentskit update docs-chat later.

#Safety guarantees

Per-file SHA-256 verification — every fetched file is verified against the signed manifest before any write. A mismatch aborts the entire install.
Path-containment guard — path.resolve(dest) is asserted to stay inside the target directory on every file, both on the write path and on diff reads. A ../../.env-style path escape is an IntegrityError.
Transactional (all-or-nothing) — files are staged in a sibling temp directory, verified, then moved atomically. Any pre-commit failure rolls back all partial writes and reports "rolled back N files." Your tree is never left dirty.
Append-only audit log — .agentskit/install-log.jsonl chains entries via prevEntryHash (SHA-256 of the prior entry); a future agentskit audit command walks the chain and fails on any gap or mismatch.

#Framework support

The first shipping port is React × Next.js (app router). Other frameworks (sveltekit, nuxt, remix, tanstack-start, angular, expo, ink) are rolling out port-by-port, each gated by the binding stability requirements in RFC-0004. The CLI will refuse to install a port that has not shipped — it will not copy broken source into an unsupported framework.

#Zero-prompts via `agentskit init`

npx agentskit init        # writes .agentskit/components.json — commit this file
npx agentskit add docs-chat

After init, every subsequent add reads the committed config and runs non-interactively. In CI pass --yes to exit non-zero on any blocker instead of hanging.

#After install

The ready output prints a per-framework usage snippet wiring createAskHandler to your retriever and adapter, the required env vars (written to .env.example), and a "run the indexer before first use" step. The installer copies an indexer (agentskit ask index ./docs) and an empty index stub — it never ships AgentsKit's own corpus.

Point the handler at your retriever and adapter as shown in the sections below, then run the indexer. A full runnable version lives in apps/example-rag-chat — swap the sample docs for yours.

#1. Index your docs (RAG)

Chunk + embed your docs once, into any @agentskit/memory vector store. Embedding stays free + local with an ONNX model:

import { createRAG } from '@agentskit/rag'
import { fileVectorMemory } from '@agentskit/memory'
import { pipeline } from '@huggingface/transformers'

// Local ONNX embedder — $0, no API key.
let extractor: Awaited<ReturnType<typeof pipeline>> | null = null
const embed = async (text: string): Promise<number[]> => {
  extractor ??= await pipeline('feature-extraction', 'Xenova/bge-small-en-v1.5')
  const out = await extractor(text, { pooling: 'mean', normalize: true })
  return Array.from(out.data as Float32Array)
}

const rag = createRAG({ embed, store: fileVectorMemory({ path: './docs-index' }), topK: 6 })
await rag.ingest(docs) // docs: { id, content, metadata: { path, title } }[]

For a committed, read-only index (great for serverless), generate it at build time and ship the JSON — see the docs-site's scripts/gen-ask-index.mjs.

#2. Stream grounded, cited answers

createRAG returns a Retriever — drop it straight into an AgentsKit adapter consumed by AgentChat (or into useChat / createRuntime for a lower-level surface). Two design choices make it reliable on free models:

Co-locate the context with the question (free models attend to recent tokens far better than a long system prompt).
Emit citations from what you retrieved — don't depend on a weak model to call a cite tool.

import { useChat } from '@agentskit/react'
import { openrouter, createFallbackAdapter } from '@agentskit/adapters'

const FREE = ['meta-llama/llama-3.3-70b-instruct:free', 'qwen/qwen3-next-80b-a3b-instruct:free']
const adapter = createFallbackAdapter(
  FREE.map((model) => ({ id: model, adapter: openrouter({ apiKey, model }) })),
)

const chat = useChat({
  adapter,
  retriever: rag,
  systemPrompt: 'Answer ONLY from the provided docs. Be concise. Decline + name the nearest page when uncovered. Never mention other frameworks.',
})

createFallbackAdapter cascades across the free pool when one is rate-limited (429). Keep answers short and AgentsKit-specific with a tight system prompt.

The floating chat is a headless, slotted component — open by default, branded logo, animated loading, and a "build this" link, all overridable:

<AskDocsWidget
  logo={<YourMark />}          // header logo slot
  loadingState={<YourSpinner />} // loading slot
  title="Ask our docs"
  docsHref="/docs/cookbook/ask-the-docs"
/>

Generative UI (option buttons, forms, runnable code) activates with ASK_RICH_UI=1 and a capable model — free models stay on reliable markdown text.

#Guardrails (built in)

Before any model call, a cheap triage runs — it saves your free-tier quota on trivia and blocks the obvious attacks:

Greetings ("hi", "oi") and noise ("test", empty, gibberish) → an instant canned reply, no LLM.
Prompt injection ("ignore previous instructions", "reveal your system prompt", "you are now…", "jailbreak") → a firm decline that never changes role or leaks the prompt — no LLM.
Real questions (even one word like "memory") fall through.

It's additive and extensible — keep the defenses, add your own:

import { triageMessage } from './lib/ask/guard'

const triage = triageMessage(userText, {
  greetings: ['salut', 'ciao'],
  noise: ['blah'],
  injectionPatterns: [/give me the raw config/i],
  replies: { greeting: 'Hey! Ask me about our product docs.' },
})
if (triage.kind === 'canned') return stream(triage.reply) // skip the model

This sits on top of the other layers: client system messages are stripped, the retrieved context is fenced as untrusted data, and the grounded prompt keeps answers on-topic — defense-in-depth.

#Going further

Durable rate limit — front the route with Upstash (in-memory fallback for dev).
Eval it — measure retrieval recall@k / MRR deterministically + an LLM judge; gate in CI.
Run code in-browser — @agentskit/sandbox/web webWorkerBackend (zero-vendor) powers runnable snippets.

That's the whole thing: index → ground → stream → cite. Grounding quality scales with model quality — start free, bring your own key when you want the rich UI.

Explore nearby

✎ Edit this page on GitHub·Found a problem? Open an issue →·How to contribute →

Build an Ask-the-docs chat

#Fastest path — the CLI

#What the command actually does

#Safety guarantees

#Framework support

#Zero-prompts via `agentskit init`

#After install

#1. Index your docs (RAG)

#2. Stream grounded, cited answers

#3. The widget (optional, fancy)

#Guardrails (built in)

#Going further

Explore nearby

On this page

Build an Ask-the-docs chat

#Fastest path — the CLI

#What the command actually does

#Safety guarantees

#Framework support

#Zero-prompts via agentskit init

#After install

#1. Index your docs (RAG)

#2. Stream grounded, cited answers

#3. The widget (optional, fancy)

#Guardrails (built in)

#Going further

Explore nearby

On this page

#Zero-prompts via `agentskit init`