
41.4 Context packing: selecting the best chunks under budget


The Knapsack Problem

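You have a context budget of 30k tokens. You retrieved 50 documents totaling 100k tokens. Which ones do you keep? This is a knapsack problem: each chunk has a cost (its token count) and a value (its relevance to the query), and you want the most total value that still fits the budget.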
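A minimal sketch of the simplest approach, greedy packing by score, is shown here. It is an approximation, not an exact knapsack solver. The `Chunk` type and `pack_greedy` function are hypothetical names for illustration, and the token counts and scores are made-up values that would normally come from your tokenizer and your retriever or reranker.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    tokens: int   # token count, assumed precomputed with your model's tokenizer
    score: float  # relevance score from retrieval or reranking (higher is better)


def pack_greedy(chunks, budget):
    """Walk chunks from highest to lowest score, keeping each one that still fits."""
    selected, used = [], 0
    for chunk in sorted(chunks, key=lambda c: c.score, reverse=True):
        if used + chunk.tokens <= budget:
            selected.append(chunk)
            used += chunk.tokens
        # A chunk that does not fit is skipped, so a smaller, lower-scoring
        # chunk can still fill the remaining space.
    return selected, used


# Illustrative data only: four chunks competing for a 30k-token budget.
chunks = [
    Chunk("a", 12_000, 0.9),
    Chunk("b", 15_000, 0.8),
    Chunk("c", 10_000, 0.7),
    Chunk("d", 3_000, 0.6),
]
selected, used = pack_greedy(chunks, budget=30_000)
print([c.text for c in selected], used)
```

Here chunk "c" scores higher than "d" but no longer fits once "a" and "b" are in, so the loop skips it and takes "d" instead. Greedy selection can miss the optimal packing; if that matters, ranking by score per token or using a dynamic-programming knapsack are possible alternatives.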

Packing Strategies

  1. Rerank: Use a cross-encoder to score each chunk's relevance to the query. Keep the top N that fit the budget.
  2. Maximal Marginal Relevance (MMR): Pick the most relevant result first. Then pick a result 2 that is relevant but different from result 1, then a result 3 that is different from both 1 and 2, and so on. This prevents "5 versions of the same doc" from clogging the context.
  3. Summarize: Keep the top 5 docs in full. Ask a cheap model to summarize the bottom 20 docs into 1k tokens.
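As a rough illustration of strategy 2, MMR can be sketched in a few lines. This assumes you already have embedding vectors for the query and for each document; the vectors below are toy values, and `mmr_select` and its `lam` trade-off parameter are hypothetical names, not any particular library's API.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mmr_select(query_vec, doc_vecs, k, lam=0.5):
    """Return indices of up to k docs, trading relevance against redundancy.

    lam=1.0 ranks by relevance only; lower values penalize similarity
    to already-selected docs more heavily.
    """
    relevance = [cosine(query_vec, d) for d in doc_vecs]
    remaining = list(range(len(doc_vecs)))
    selected = []
    while remaining and len(selected) < k:
        def mmr_score(i):
            # Redundancy is the similarity to the closest already-selected doc.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy embeddings: doc 1 is a near-duplicate of doc 0; doc 2 covers another aspect.
query = [1.0, 0.5, 0.0]
docs = [
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.05],
    [0.0, 1.0, 0.0],
]
print(mmr_select(query, docs, k=2))           # diversity-aware choice
print(mmr_select(query, docs, k=2, lam=1.0))  # relevance-only choice
```

With `lam=0.5` the near-duplicate is passed over in favor of the less relevant but distinct doc 2; with `lam=1.0` the selection falls back to plain relevance ranking. To respect a token budget rather than a fixed `k`, the loop condition could check accumulated token counts instead.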
