25. Building a RAG App (Project 2)

Overview and links for this section of the guide.

What this project builds

Project 2 is an end-to-end RAG (retrieval-augmented generation) app: a system that answers questions about your documents and cites its sources.

Unlike a “chat over docs” demo, this project is designed to be:

  • grounded: answers are constrained to sources,
  • auditable: you can see which chunks influenced the answer,
  • maintainable: indexes update as docs change,
  • measurable: evaluation catches regressions.

Scope choice

This project focuses on the core RAG pipeline and guardrails. You can wrap it in a CLI or web UI later; the backend contract is the hard part.

Project deliverables (minimum viable)

By the end of Section 25, you should have:

  • A spec: clear “done” criteria and non-goals.
  • An indexer: ingest → chunk → embed → store (repeatable, idempotent); see the sketch after this list.
  • A query path: retrieve → compose prompt → answer (with citations).
  • Validation: schema validation and “not found” behavior.
  • Evaluation harness: a small eval set + regression detection.
  • Maintenance plan: updates, deletions, and re-embedding strategy.
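
To make the indexer deliverable concrete, here is a minimal sketch in Python. The fixed-size chunker, the placeholder embed() function, and the chunks.jsonl layout are illustrative assumptions; swap in your own splitter, embedding model, and store. Idempotency comes from using content hashes as chunk ids, so re-running the indexer skips chunks it has already embedded.

```python
# Minimal indexer sketch: ingest -> chunk -> embed -> store (idempotent).
# The chunker, embed() placeholder, and JSONL layout are illustrative assumptions.
import hashlib
import json
from pathlib import Path

CHUNK_SIZE = 800      # characters per chunk; tune for your corpus
CHUNK_OVERLAP = 100   # overlap reduces evidence being split across chunk boundaries


def chunk_text(text: str) -> list[str]:
    """Naive fixed-size chunking; replace with a structure-aware splitter."""
    step = CHUNK_SIZE - CHUNK_OVERLAP
    return [text[i:i + CHUNK_SIZE] for i in range(0, len(text), step)]


def embed(text: str) -> list[float]:
    """Placeholder embedding; replace with a real embedding model call."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:16]]


def index_docs(doc_dir: str, index_path: str) -> None:
    """Repeatable: chunk ids are content hashes, so re-runs skip unchanged chunks."""
    out = Path(index_path)
    seen = set()
    if out.exists():
        seen = {json.loads(line)["chunk_id"] for line in out.read_text().splitlines()}

    with out.open("a", encoding="utf-8") as f:
        for doc in sorted(Path(doc_dir).glob("*.txt")):
            for pos, chunk in enumerate(chunk_text(doc.read_text(encoding="utf-8"))):
                chunk_id = hashlib.sha256(chunk.encode("utf-8")).hexdigest()[:16]
                if chunk_id in seen:
                    continue  # already indexed; keeps the pipeline idempotent
                f.write(json.dumps({
                    "chunk_id": chunk_id,
                    "doc": doc.name,
                    "position": pos,
                    "text": chunk,
                    "embedding": embed(chunk),
                }) + "\n")


if __name__ == "__main__":
    index_docs("docs/", "chunks.jsonl")
```

Because chunk ids are derived from content, an edited paragraph produces a new id; the stale chunk can then be removed in a later cleanup pass, which is where the maintenance plan hooks in.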

Reference architecture (simple but real)

A minimal but production-shaped architecture includes:

  • Document store: source docs and extracted text.
  • Chunk store: chunks with ids + metadata + text.
  • Vector index: embeddings for chunks (with metadata filters).
  • Retrieval layer: query embedding + filters + top-k search + rerank (optional).
  • Prompt composer: context packing + grounding rules.
  • Answer validator: JSON/schema checks + citation checks.
  • Audit log: store question, retrieved chunk ids, answer, model versions.

In the beginning, “stores” can be files on disk. The key is the interfaces and artifacts: ids, metadata, and reproducible pipelines.
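
To show what those interfaces look like end to end, here is a minimal file-backed query path, assuming the chunks.jsonl layout from the indexer sketch above. Retrieval is brute-force cosine similarity (fine for a small corpus; swap in a real vector index later), and generate() is a stub standing in for your model client.

```python
# Minimal query path sketch: retrieve -> compose prompt -> answer -> audit log.
# Assumes the chunks.jsonl layout from the indexer sketch; generate() is a placeholder.
import json
import math
from pathlib import Path


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec: list[float], index_path: str, top_k: int = 5) -> list[dict]:
    """Brute-force top-k over the on-disk chunk store."""
    chunks = [json.loads(line) for line in Path(index_path).read_text().splitlines()]
    chunks.sort(key=lambda c: cosine(query_vec, c["embedding"]), reverse=True)
    return chunks[:top_k]


def compose_prompt(question: str, chunks: list[dict]) -> str:
    """Grounding rules live in the prompt: cite chunk ids, refuse when unsupported."""
    context = "\n\n".join(f"[{c['chunk_id']}] {c['text']}" for c in chunks)
    return (
        "Answer using ONLY the sources below. Cite chunk ids in square brackets.\n"
        "If the sources do not contain the answer, reply exactly: NOT_FOUND.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


def generate(prompt: str) -> str:
    """Placeholder for the model call; wire up your LLM client of choice here."""
    return "NOT_FOUND"


def answer(question: str, query_vec: list[float], index_path: str, log_path: str) -> str:
    chunks = retrieve(query_vec, index_path)
    reply = generate(compose_prompt(question, chunks))
    # Audit log: enough to reconstruct why the system said what it said.
    with Path(log_path).open("a", encoding="utf-8") as f:
        f.write(json.dumps({
            "question": question,
            "retrieved": [c["chunk_id"] for c in chunks],
            "answer": reply,
        }) + "\n")
    return reply
```

Note that the query embedding must come from the same embed() used at indexing time, otherwise the similarity scores are meaningless.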

Common project failure modes

  • No spec: the system “works” but you can’t tell what “correct” means.
  • No eval set: changes feel better until a user finds a bad answer.
  • Bad chunking: retrieval can’t find the evidence even though it exists.
  • Weak grounding: the model answers from vibes when sources are missing (see the validator sketch after this list).
  • No maintenance: stale indexes break trust as docs evolve.
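
A cheap guardrail against weak grounding is a validator that fails closed: if the model signalled "not found", or its citations do not match the retrieved chunk ids, return a safe fallback instead of the raw answer. This sketch assumes the bracketed-chunk-id citation format used in the prompt composer above.

```python
# Minimal answer-validation sketch: enforce citations and "not found" behavior.
# Assumes answers cite 16-hex-character chunk ids in square brackets.
import re

NOT_FOUND = "NOT_FOUND"
FALLBACK = "I couldn't find a well-supported answer in the indexed documents."


def validate_answer(reply: str, retrieved_ids: set[str]) -> str:
    """Pass the answer through only if every citation points at a retrieved chunk."""
    if reply.strip() == NOT_FOUND:
        return FALLBACK
    cited = set(re.findall(r"\[([0-9a-f]{16})\]", reply))
    if not cited or not cited <= retrieved_ids:
        return FALLBACK  # uncited or mis-cited answer: fail closed
    return reply
```

Schema validation works the same way: ask the model for a fixed JSON shape, parse it, and fall back when parsing or validation fails.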

Section 25 map (25.1–25.5)

Where to start