Auto-summarization
Wrap any ChatMemory so it compacts the oldest messages into a summary whenever stored tokens exceed a budget.
compileBudget trims per request. createAutoSummarizingMemory
trims at rest: once a session grows past maxTokens, the oldest
(non-summary) messages get folded into a single summary message via
your summarizer, then persisted. Compaction is idempotent: summaries
are tagged and never re-summarized.
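The compaction step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's implementation: the `Msg` type, the `countTokens` estimate (about 4 characters per token), and the `compact` function are all hypothetical names introduced here. Only the `maxTokens` and `keepRecent` options and the `agentskitSummary` tag come from this page.

```typescript
// Hypothetical minimal message shape; the real Message type lives in @agentskit/core.
interface Msg {
  role: 'system' | 'user' | 'assistant'
  content: string
  metadata?: { agentskitSummary?: boolean }
}

// Rough estimate (about 4 characters per token); an assumption, not the library's counter.
const countTokens = (msgs: Msg[]): number =>
  msgs.reduce((sum, m) => sum + Math.ceil(m.content.length / 4), 0)

// Fold the oldest non-summary messages into one summary when over budget.
function compact(
  msgs: Msg[],
  maxTokens: number,
  keepRecent: number,
  summarize: (old: Msg[]) => Msg,
): Msg[] {
  if (countTokens(msgs) <= maxTokens) return msgs // under budget: nothing to do

  const cut = Math.max(0, msgs.length - keepRecent)
  const head = msgs.slice(0, cut)
  const tail = msgs.slice(cut) // the most recent messages stay verbatim

  const existingSummaries = head.filter(m => m.metadata?.agentskitSummary)
  const toFold = head.filter(m => !m.metadata?.agentskitSummary)
  if (toFold.length === 0) return msgs // only summaries and recent messages remain

  // Tag the new summary so later compactions leave it alone.
  const summary = summarize(toFold)
  const tagged: Msg = { ...summary, metadata: { ...summary.metadata, agentskitSummary: true } }
  return [...existingSummaries, tagged, ...tail]
}

// Ten 40-character messages: roughly 100 estimated tokens.
const history: Msg[] = Array.from({ length: 10 }, (_, i): Msg => ({
  role: i % 2 === 0 ? 'user' : 'assistant',
  content: 'x'.repeat(40),
}))
const summarize = (old: Msg[]): Msg => ({ role: 'system', content: `summary of ${old.length} messages` })

const once = compact(history, 50, 4, summarize) // folds the 6 oldest messages
const twice = compact(once, 10, 4, summarize)   // still over budget, but nothing left to fold
```

In this sketch the second call returns its input unchanged even though the budget is still exceeded, because the only message outside the recent tail is already a tagged summary.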
#Install
Ships in @agentskit/core under the subpath:
import { createAutoSummarizingMemory } from '@agentskit/core/auto-summarize'
#Wire it up
import { createAutoSummarizingMemory } from '@agentskit/core/auto-summarize'
import { createInMemoryMemory } from '@agentskit/core'
import { createRuntime } from '@agentskit/runtime'
import { anthropic } from '@agentskit/adapters'

// A separate adapter used only for summarization.
const summarizerAdapter = anthropic({ apiKey: process.env.ANTHROPIC_API_KEY!, model: 'claude-haiku-4-5' })

const memory = createAutoSummarizingMemory(createInMemoryMemory(), {
  maxTokens: 8_000,
  keepRecent: 6,
  summarizer: async messages => {
    // Flatten the messages being compacted into a plain transcript.
    const src = messages.map(m => `${m.role}: ${m.content}`).join('\n')
    // runOnce is not defined in this snippet: supply a helper that sends one prompt and returns the reply text.
    const result = await runOnce(summarizerAdapter, `Summarize the following chat transcript in 3 bullet points:\n\n${src}`)
    return {
      id: crypto.randomUUID(),
      role: 'system',
      content: result,
      status: 'complete',
      createdAt: new Date(),
    }
  },
  onCompact: info => {
    console.log(`compacted ${info.droppedCount} msgs: ${info.beforeTokens} → ${info.afterTokens} tokens`)
  },
})

// mainAdapter is the adapter for your primary model, defined elsewhere.
const runtime = createRuntime({ adapter: mainAdapter, memory })
#Options
| Option | Default | Purpose |
|---|---|---|
| maxTokens | (required) | Token budget; compaction runs once stored tokens exceed it |
| keepRecent | 4 | Number of most recent messages always kept verbatim |
| counter | approximateCounter | Token counter; swap in tiktoken for exact counts |
| summarizer | (required) | (messages) => Message; your compaction prompt |
| onCompact | (none) | Observability hook called after each compaction |
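To give a feel for what an approximate counter might do, here is a minimal character-based estimator. This page does not show the signature that `counter` expects, so the shape below (a function from a message list to a number) is an assumption, as are the names `CountableMessage`, `PER_MESSAGE_OVERHEAD`, and `estimateTokens`. Check the library's types before passing anything like this as `counter`.

```typescript
// Hypothetical message shape, for illustration only.
interface CountableMessage {
  role: string
  content: string
}

// Assumed per-message overhead for role and formatting tokens; tune for your model.
const PER_MESSAGE_OVERHEAD = 4

// Character-based estimate: roughly 4 characters per token for English text.
function estimateTokens(messages: CountableMessage[]): number {
  return messages.reduce(
    (total, m) => total + PER_MESSAGE_OVERHEAD + Math.ceil(m.content.length / 4),
    0,
  )
}

const estimate = estimateTokens([
  { role: 'user', content: 'Hello there' }, // 11 chars: 3 tokens + 4 overhead
  { role: 'assistant', content: 'Hi!' },    // 3 chars: 1 token + 4 overhead
])
```

An estimate like this is cheap but can drift from a tokenizer's real count, especially for code or non-English text, which is the usual reason to swap in an exact counter.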
#Summary message shape
Every summary emitted by summarizer gets
metadata.agentskitSummary = true attached automatically. That tag
is what prevents re-summarization: subsequent compactions leave
existing summaries alone.
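Because the tag is ordinary metadata, application code can also read it, for example to render summaries differently from verbatim messages. The sketch below assumes a simplified message shape (`StoredMessage`) and a hypothetical `isSummary` helper. Only the `metadata.agentskitSummary` field name comes from this page.

```typescript
// Hypothetical message shape; only the metadata tag name comes from the text above.
interface StoredMessage {
  role: string
  content: string
  metadata?: Record<string, unknown>
}

// True when a message carries the summary tag.
const isSummary = (m: StoredMessage): boolean => m.metadata?.agentskitSummary === true

const stored: StoredMessage[] = [
  { role: 'system', content: 'Earlier: user asked about pricing.', metadata: { agentskitSummary: true } },
  { role: 'user', content: 'What about annual plans?' },
  { role: 'assistant', content: 'Annual plans are billed once per year.' },
]

// Split history into summaries and verbatim messages.
const summaries = stored.filter(isSummary)
const verbatim = stored.filter(m => !isSummary(m))
```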
#See also
- Token budget compiler: per-request trimming
- Virtualized memory: cap count, not tokens
- Hierarchical memory: tiered long-term storage