Retriever

One narrow interface that serves RAG, BM25, web search, code search, and memory recall through composition.

A Retriever answers one question: "given a query and the conversation so far, what documents are relevant?"

It is the inversion-of-control seam between agent talking and agent reading. The runtime calls retrieve() once per turn before generating, hands the results to the model as context, and the model responds.

The contract is intentionally narrow — one method — so RAG, BM25, hybrid search, web search, code search, and conversational memory recall all fit the same shape. Composition handles the variety.

The interface

import type { Retriever, RetrieverRequest, RetrievedDocument } from '@agentskit/core'

export interface Retriever {
  retrieve: (request: RetrieverRequest) => MaybePromise<RetrievedDocument[]>
}

export interface RetrieverRequest {
  query: string
  messages: Message[]
}

export interface RetrievedDocument {
  id: string
  content: string
  source?: string
  score?: number
  metadata?: Record<string, unknown>
}

query is the focused string. messages is the surrounding conversation, available for retrievers that want to do query rewriting or conversational compression. Most retrievers ignore it.

Using RAG

import { createRAG } from '@agentskit/rag'
import { fileVectorMemory } from '@agentskit/memory'
import { openaiEmbed } from '@agentskit/adapters'

const rag = createRAG({
  store: fileVectorMemory({ path: './embeddings.json' }),
  embed: openaiEmbed({ apiKey: KEY, model: 'text-embedding-3-small' }),
  topK: 5,
})

await rag.ingest([
  { content: 'AgentsKit core is 10KB gzipped.' },
  { content: 'Adapters are interchangeable per ADR 0001.' },
])

createRuntime({ adapter, retriever: rag })

RAG extends Retriever, so it plugs directly into the runtime. The runtime calls retrieve() on every turn.

Web search as a retriever

Anything that returns ranked documents fits:

const webRetriever: Retriever = {
  async retrieve({ query }) {
    const res = await fetch(`https://api.example-search.com?q=${query}`)
    const hits = await res.json()
    return hits.map((h, i) => ({
      id: h.url,
      content: h.snippet,
      source: h.url,
      score: hits.length - i,    // rank-as-score is fine
    }))
  },
}

Composing retrievers

A composite retriever is itself a Retriever. This is what makes reranking, hybrid search, and ensembles fall out naturally.

function hybrid(vectors: Retriever, keywords: Retriever): Retriever {
  return {
    async retrieve(request) {
      const [v, k] = await Promise.all([
        vectors.retrieve(request),
        keywords.retrieve(request),
      ])
      return mergeAndRerank([...v, ...k])   // your choice of reranker
    },
  }
}

createRuntime({ adapter, retriever: hybrid(vectorRAG, bm25) })

The runtime never sees the difference. The composer is responsible for renormalizing scores from heterogeneous sources.

Empty results are success, not failure

const docs = await retriever.retrieve({ query: 'something we don't know about', messages: [] })
// docs is [], no error

[] means "no relevant documents" — a valid answer. Errors throw (Promise rejection); they are not encoded in the result.

This is intentional asymmetry vs Adapter (errors are chunks) and Tool (errors are status). Retrievers are typically called once per turn, synchronously from the runtime's perspective. Wrapping every retrieve in error chunks would add ceremony for no gain.

When to write your own

You have a custom search backend (internal docs, code search, ticket search, etc.) — implement Retriever and you're done.
You're building a reranker, ensemble, or hybrid — wrap N retrievers in your own.
You're testing — return a fixed array.

Common pitfalls

Pitfall	What to do instead
Returning `null` instead of `[]` on no results	Always return an array. Empty is success.
Mixing scored and unscored documents in one result	Either every doc has `score`, or none do (R6)
Inconsistent score scale across calls	Pick one (cosine, BM25, normalized rank) and keep it for the lifetime of the retriever instance
Mutating the input request	Treat `RetrieverRequest` as read-only
Configuring topK per call	Configure at construction; the contract has no per-call options in v1

Going deeper

The full list of invariants (eleven of them, R1–R11) is in ADR 0004 — Retriever contract.

✎ Edit this page on GitHub·Found a problem? Open an issue →·How to contribute →

Retriever

On this page