# From LlamaIndex
Side-by-side migration guide. Map LlamaIndex retrieval and agent abstractions to AgentsKit RAG + runtime.
LlamaIndex is the king of "RAG over your documents". AgentsKit treats RAG as one piece of a larger agent runtime: adapters, tools, memory, skills, and observability all share the same contracts. Migrate when:
- You want one runtime for "agent that uses RAG" instead of two libraries (LlamaIndex + your own loop).
- You need swappable providers in one line: adapters are the seam.
- You want TypeScript-first ergonomics. AgentsKit is TS, not Python.
- You want first-class tool calling, durable execution, and topologies alongside RAG.
Stay with LlamaIndex when:
- You're 100% Python and don't want a TS dependency.
- You depend on the LlamaIndex agent recipes (`OpenAIAssistantAgent`, `ReActAgent` with very specific prompt structures).
- You need exotic indices (`TreeIndex`, `ListIndex`, `KeywordTableIndex`) that AgentsKit doesn't have.
## Quick reference
| LlamaIndex | AgentsKit | Notes |
|---|---|---|
| `VectorStoreIndex.from_documents(docs)` | `createRAG({ embed, store }).ingest(docs)` | One factory; pluggable embedder + vector store. |
| `index.as_retriever()` | The RAG instance itself implements `Retriever` | Use it as `retriever:` on the runtime. |
| `index.as_query_engine()` | `runtime.run(query)` with `retriever` set | Retrieval per turn is automatic. |
| `Document(text, metadata)` | `{ content, metadata }` | Same shape, plain object. |
| `SimpleNodeParser`, `SentenceSplitter` | `chunkText({ chunkSize, chunkOverlap })` | Lower-level helper if you want to chunk yourself; see the sketch after this table. |
| `OpenAIEmbedding` | `openaiEmbedder({ apiKey, model })` | Embedders are first-class adapters. |
| `Chroma`, `Pinecone`, `Qdrant`, `Weaviate` | `chroma`, `pinecone`, `qdrant`, `weaviateVectorStore` | All in `@agentskit/memory`. |
| `OpenAIAgent.from_tools(tools)` | `createRuntime({ adapter, tools })` | Same idea; one runtime for every model. |
| `ReActAgent` | `createRuntime` (the loop is ReAct by default) | No separate class. |
| `Workflow` (LlamaIndex 0.10+) | `compileFlow` for DAGs, topologies for multi-agent | YAML DAG vs. explicit code; pick per use case. |
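If you'd rather chunk yourself before ingesting, here is a minimal sketch using `chunkText`; the exact call shape (text first, options second) and the ingest record fields are assumptions based on the signatures in this guide:

```ts
import { chunkText } from '@agentskit/rag'
import { readFile } from 'node:fs/promises'

const longDocument = await readFile('./docs/handbook.md', 'utf8')

// Pre-chunk so retrieval returns focused passages instead of whole files.
// Assumed call shape: text first, options second.
const chunks = chunkText(longDocument, { chunkSize: 512, chunkOverlap: 64 })

// rag is a createRAG instance, built as in section 1 below.
await rag.ingest(
  chunks.map((content, i) => ({
    content,
    metadata: { source: 'handbook.md', chunk: i },
  })),
)
```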
## 1. Basic RAG over local files
### Before: LlamaIndex (Python)
```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.embeddings.openai import OpenAIEmbedding

documents = SimpleDirectoryReader("./docs").load_data()
index = VectorStoreIndex.from_documents(
    documents,
    embed_model=OpenAIEmbedding(model="text-embedding-3-small"),
)
query_engine = index.as_query_engine()
print(query_engine.query("How do I deploy?"))
```

### After: AgentsKit
```ts
import { createRAG } from '@agentskit/rag'
import { fileVectorMemory } from '@agentskit/memory'
import { openaiEmbedder, openai } from '@agentskit/adapters'
import { createRuntime } from '@agentskit/runtime'
import { readdir, readFile } from 'node:fs/promises'
import { join } from 'node:path'

const KEY = process.env.OPENAI_API_KEY!

const rag = createRAG({
  embed: openaiEmbedder({ apiKey: KEY, model: 'text-embedding-3-small' }),
  store: fileVectorMemory({ path: './embeddings.json' }),
})

// Read every file in ./docs into the { content, source } loader shape.
const docs = await Promise.all(
  (await readdir('./docs')).map(async f => ({
    content: await readFile(join('./docs', f), 'utf8'),
    source: f,
  })),
)
await rag.ingest(docs)

const runtime = createRuntime({
  adapter: openai({ apiKey: KEY, model: 'gpt-4o-mini' }),
  retriever: rag,
})

console.log(await runtime.run('How do I deploy?'))
```

The shape is the same: ingest → retrieve → answer. `runtime.run()` calls the retriever on every turn automatically.
## 2. Reranking
LlamaIndex has `LLMRerank` and `SentenceTransformerRerank`. AgentsKit ships pluggable rerankers:
```ts
import { createRerankedRetriever, voyageReranker } from '@agentskit/rag'

const reranked = createRerankedRetriever(rag, {
  candidatePool: 30,
  topK: 5,
  rerank: voyageReranker({ apiKey: VOYAGE_KEY, model: 'rerank-2' }),
})

createRuntime({ adapter, retriever: reranked })
```

Built-in: BM25 (default), `voyageReranker`, `jinaReranker`. Custom: pass any `RerankFn` (see the sketch below). See RAG reranking.
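A minimal custom-reranker sketch. The `RerankFn` signature here (query plus scored candidates in, reordered candidates out) and the candidate fields are assumptions based on the built-ins above:

```ts
import { createRerankedRetriever, type RerankFn } from '@agentskit/rag'

// Hypothetical reranker: keep the score order, but break ties in
// favour of shorter chunks. Candidate shape ({ content, score }) is assumed.
const preferConcise: RerankFn = async (query, candidates) =>
  [...candidates].sort(
    (a, b) => b.score - a.score || a.content.length - b.content.length,
  )

const reranked = createRerankedRetriever(rag, {
  candidatePool: 30,
  topK: 5,
  rerank: preferConcise,
})
```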
## 3. Hybrid retrieval
LlamaIndex 0.10+ has `QueryFusionRetriever` for vector + keyword fusion. The AgentsKit equivalent:
```ts
import { createHybridRetriever } from '@agentskit/rag'

const hybrid = createHybridRetriever(rag, {
  vectorWeight: 0.7,
  bm25Weight: 0.3,
})
```

Wire it in like any other retriever: `createRuntime({ adapter, retriever: hybrid })`.

## 4. Document loaders
LlamaIndex has hundreds of LlamaHub loaders. AgentsKit ships the common ones in `@agentskit/rag`:
```ts
import {
  loadUrl, loadGitHubFile, loadGitHubTree,
  loadNotionPage, loadConfluencePage, loadGoogleDriveFile,
  loadPdf, loadS3, loadGcs, loadDropbox, loadOneDrive,
} from '@agentskit/rag'

await rag.ingest(await loadGitHubTree('agentskit-io', 'agentskit', { token }))
await rag.ingest(await loadNotionPage('PAGE_ID', { token: NOTION_KEY }))
```

For everything else, the loader contract is `() => Promise<Array<{ content, source?, metadata? }>>`: write your own in 10 lines.
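For example, a sketch of a custom loader against that contract; the endpoint and response fields are hypothetical:

```ts
// Hypothetical loader for an internal changelog API, matching the
// contract: () => Promise<Array<{ content, source?, metadata? }>>.
const loadChangelog = async () => {
  const res = await fetch('https://example.com/api/changelog')
  const entries: Array<{ id: string; title: string; body: string }> = await res.json()
  return entries.map(e => ({
    content: `${e.title}\n\n${e.body}`,
    source: `changelog/${e.id}`,
    metadata: { kind: 'changelog' },
  }))
}

await rag.ingest(await loadChangelog())
```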
## 5. Tool-using agent
### Before: LlamaIndex
```python
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
from llama_index.llms.openai import OpenAI

def get_weather(city: str) -> str:
    ...

agent = ReActAgent.from_tools(
    [FunctionTool.from_defaults(fn=get_weather)],
    llm=OpenAI(model="gpt-4o"),
)
agent.chat("What's the weather in Lisbon?")
```

### After: AgentsKit
```ts
import { createRuntime } from '@agentskit/runtime'
import { openai } from '@agentskit/adapters'
import type { ToolDefinition } from '@agentskit/core'

const KEY = process.env.OPENAI_API_KEY!

const weather: ToolDefinition = {
  name: 'weather',
  description: 'Get the weather for a city',
  schema: { type: 'object', properties: { city: { type: 'string' } }, required: ['city'] },
  execute: async ({ city }) => fetch(`https://wttr.in/${city}?format=j1`).then(r => r.json()),
}

const runtime = createRuntime({
  adapter: openai({ apiKey: KEY, model: 'gpt-4o' }),
  tools: [weather],
})

await runtime.run("What's the weather in Lisbon?")
```

JSON Schema 7 is the input format. Convert from Zod with `zod-to-json-schema` if you want Zod as your source of truth.
## 6. Combining RAG + tools + memory
```ts
import { sqliteChatMemory } from '@agentskit/memory'

// adapter, weather, and rag come from the previous sections.
const runtime = createRuntime({
  adapter,
  tools: [weather, ...filesystem({ basePath: './workspace' })],
  retriever: rag,
  memory: sqliteChatMemory({ path: './sessions/user-42.db' }),
  maxSteps: 10,
})
```

Memory load + save, retrieval per turn, tool resolution: all handled. The runtime contract is locked in ADR 0006.
## 7. Vector store choice
| Backend | Import | Notes |
|---|---|---|
| In-memory | `createInMemoryMemory` (core) | Tests, demos. |
| File JSON | `fileVectorMemory({ path })` | Vectra-backed; one process. |
| Postgres | `pgvector(...)` | Bring your own client (pg `Pool`, Neon, Supabase RPC). |
| Pinecone | `pinecone({ indexUrl, apiKey })` | HTTP only; no SDK dep. |
| Qdrant | `qdrant({ url, apiKey?, collection })` | HTTP only. |
| Chroma | `chroma({ url, collection })` | HTTP only. |
| Upstash | `upstashVector({ url, token })` | HTTP only. |
| Weaviate | `weaviateVectorStore({ url, apiKey?, className })` | HTTP only. |
| Milvus / Zilliz | `milvusVectorStore({ url, token, collection })` | HTTP only. |
| MongoDB Atlas | `mongoAtlasVectorStore({ collection, ... })` | BYO client. |
| Supabase | `supabaseVectorStore({ url, serviceRoleKey, table? })` | Wraps pgvector via RPC. |
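Because the store is just the `store:` argument to `createRAG`, swapping backends is a one-line change. A sketch moving from the file store in section 1 to Qdrant, using the options from the table above:

```ts
import { createRAG } from '@agentskit/rag'
import { qdrant } from '@agentskit/memory'
import { openaiEmbedder } from '@agentskit/adapters'

// Same pipeline as section 1; only the store line changes.
const rag = createRAG({
  embed: openaiEmbedder({ apiKey: process.env.OPENAI_API_KEY!, model: 'text-embedding-3-small' }),
  store: qdrant({ url: 'http://localhost:6333', collection: 'docs' }),
})
```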
## Where LlamaIndex still wins
- Python ecosystem. If your team and your data tooling are 100% Python, the migration cost might not be worth it.
- Tree / list / keyword indices for non-embedding retrieval shapes.
- Document loaders. LlamaHub has hundreds; AgentsKit ships ~10. If you need an exotic source (Slack archive, Salesforce object), you'd write it.
- Multimodal indices (image / video). AgentsKit's RAG is text-first today.
## Incremental migration
- Embed once with LlamaIndex; retrieve with AgentsKit. Vector backends are wire-compatible: point AgentsKit at your existing Pinecone / Qdrant index.
- Move the agent loop to AgentsKit's runtime; keep your existing LlamaIndex retriever behind the `Retriever` contract (see the sketch below).
- Port loaders incrementally as you re-ingest.
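A sketch of that bridging step. Because LlamaIndex is Python and AgentsKit is TS, this assumes you expose the existing retriever over HTTP; the `Retriever` shape used here (an object with a `retrieve` method returning `{ content, score }` records) and its `@agentskit/core` export are assumptions, so check the Concepts: Retriever page for the real contract:

```ts
import type { Retriever } from '@agentskit/core'

// Hypothetical bridge to a small Python service that wraps your
// existing LlamaIndex retriever. Endpoint and response shape are made up.
const llamaIndexRetriever: Retriever = {
  retrieve: async (query: string) => {
    const res = await fetch('http://localhost:8000/retrieve', {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({ query, top_k: 5 }),
    })
    const nodes: Array<{ text: string; score: number }> = await res.json()
    return nodes.map(n => ({ content: n.text, score: n.score }))
  },
}
```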
## Related
- Recipe: Chat with RAG – end-to-end.
- RAG reranking – voyage / jina / BM25.
- Vector adapters – backend-by-backend.
- Concepts: Retriever – the contract LlamaIndex retrievers can implement.