agentskit.js

Chunking

Split documents into chunks before embedding them. The defaults are sensible for most text; override them per document type.

import { chunkText } from '@agentskit/rag'

const chunks = chunkText(longDoc, {
  chunkSize: 800,
  chunkOverlap: 120,
  split: 'paragraph',
})

Options

Option        Type                                                      Default
chunkSize     number                                                    1000
chunkOverlap  number                                                    100
split         'sentence' | 'paragraph' | 'markdown' | 'code' | 'char'   paragraph
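
To make the relationship between chunkSize and chunkOverlap concrete, here is a minimal sketch of a plain sliding window. It is an illustration only, not the implementation inside @agentskit/rag: it assumes sizes are counted in characters (the options table does not state the unit), and `slidingWindowChunks` is a hypothetical name. It ignores sentence and paragraph boundaries, so it is closest in spirit to a character-level split.

```typescript
// Hypothetical helper: NOT part of @agentskit/rag.
// Assumption: chunkSize and chunkOverlap are measured in characters.
function slidingWindowChunks(
  text: string,
  chunkSize: number,
  chunkOverlap: number,
): string[] {
  if (chunkSize <= 0) {
    throw new RangeError('chunkSize must be positive')
  }
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new RangeError('chunkOverlap must be in [0, chunkSize)')
  }

  // Each new chunk starts this many characters after the previous one,
  // so consecutive chunks share exactly chunkOverlap characters.
  const step = chunkSize - chunkOverlap
  const chunks: string[] = []

  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize))
    // Stop once a chunk has reached the end of the text, to avoid
    // emitting a trailing chunk made only of overlap.
    if (start + chunkSize >= text.length) break
  }
  return chunks
}
```

Under these assumptions, each chunk advances by chunkSize minus chunkOverlap characters, which is why the overlap has to stay smaller than the chunk size.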

Rules of thumb

  • Prose: split 'paragraph', chunkSize 800, chunkOverlap 120.
  • Markdown: split 'markdown', chunkSize 1200, chunkOverlap 150.
  • Code: split 'code', chunkSize 1500, chunkOverlap 0.
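
The rules of thumb above could be captured as per-type presets. This is a sketch, not part of the library: `DocKind`, `PRESETS`, and `optionsFor` are hypothetical names, and the `ChunkOptions` type simply mirrors the options table rather than being imported from @agentskit/rag.

```typescript
// Mirrors the documented options; declared locally so the sketch is self-contained.
type Split = 'sentence' | 'paragraph' | 'markdown' | 'code' | 'char'

interface ChunkOptions {
  chunkSize: number
  chunkOverlap: number
  split: Split
}

// Hypothetical document categories, one per rule of thumb.
type DocKind = 'prose' | 'markdown' | 'code'

const PRESETS: Record<DocKind, ChunkOptions> = {
  prose: { split: 'paragraph', chunkSize: 800, chunkOverlap: 120 },
  markdown: { split: 'markdown', chunkSize: 1200, chunkOverlap: 150 },
  code: { split: 'code', chunkSize: 1500, chunkOverlap: 0 },
}

// Returns the preset for a document kind, with optional per-call overrides.
function optionsFor(
  kind: DocKind,
  overrides: Partial<ChunkOptions> = {},
): ChunkOptions {
  return { ...PRESETS[kind], ...overrides }
}

// Intended usage with the documented API (left as a comment so the
// sketch runs without the package installed):
//   import { chunkText } from '@agentskit/rag'
//   const chunks = chunkText(readme, optionsFor('markdown'))
```

Keeping the presets in one table makes it easy to tune a single document type without touching call sites.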