34. Project 3: "Vibe Coder" Assistant for Your Own Repo
The Goal
We are going to build a CLI tool that understands your project. You can ask it questions and get context-aware answers grounded in your actual codebase:
- "Where is the user authentication logic?"
- "Refactor the `User` class to add a `phoneNumber` field and update all call sites."
- "Write a unit test for the `calculateTax` function."
This isn't a generic coding assistant. Because it reads your actual files, it is far less likely to hallucinate APIs that don't exist in your project, and it proposes changes as diffs you can review before applying.
Why Build This
Generic AI coding assistants have a fundamental limitation: they don't know your code. They can write a `UserService` class, but they don't know you already have one at `src/services/user.ts` with specific methods and conventions.
By building a repo-aware assistant, you learn the core patterns that power tools like GitHub Copilot Workspace, Cursor, and Cody:
- Retrieval-Augmented Generation (RAG) for code: Fetching relevant files before generating.
- Tool use: Giving the model the ability to read files, search, and propose edits.
- Human-in-the-loop: Ensuring the AI never makes changes without approval.
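The human-in-the-loop pattern can be sketched as a gate that every proposed edit must pass through. This is a minimal illustration, not the guide's implementation: `ProposedEdit`, `Approver`, and `applyIfApproved` are hypothetical names, and an in-memory `Map` stands in for the file system so the example runs without touching disk.

```typescript
// Hypothetical shapes for illustration; not an API from any library.
interface ProposedEdit {
  path: string;
  before: string; // file content the model saw when it proposed the edit
  after: string; // file content the model wants to write
}

type Approver = (edit: ProposedEdit) => boolean;

// Applies an edit to an in-memory file map only if the approver says yes.
function applyIfApproved(
  files: Map<string, string>,
  edit: ProposedEdit,
  approve: Approver,
): boolean {
  const current = files.get(edit.path);
  // Refuse stale edits: the file must still match what the model saw.
  if (current !== edit.before) return false;
  if (!approve(edit)) return false;
  files.set(edit.path, edit.after);
  return true;
}

const files = new Map([["src/user.ts", "class User {}"]]);
const edit: ProposedEdit = {
  path: "src/user.ts",
  before: "class User {}",
  after: "class User { phoneNumber?: string }",
};

// A rejecting approver leaves the file untouched; an accepting one applies it.
const rejected = applyIfApproved(files, edit, () => false);
const accepted = applyIfApproved(files, edit, () => true);
```

In a real CLI the approver would likely print the diff and wait for a yes/no answer from the user; the point of the sketch is only that the write path has no way around that check.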
High-Level Architecture
The assistant follows a four-stage pipeline:
┌─────────────────────────────────────────────────────────────────┐
│                     VIBE CODER ARCHITECTURE                     │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐   │
│  │ INDEXER  │───▶│RETRIEVER │───▶│ CONTEXT  │───▶│GENERATOR │   │
│  │          │    │          │    │ BUILDER  │    │          │   │
│  └──────────┘    └──────────┘    └──────────┘    └──────────┘   │
│       │               │               │               │         │
│       ▼               ▼               ▼               ▼         │
│  Walks your      Finds files     Packages        Produces       │
│  repo, builds    relevant to     query +         answer or      │
│  a "map"         the query       files into      code diff      │
│                                  a prompt                       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
- Indexer: A script that walks your directory, respects `.gitignore`, and creates a "map" of your codebase. This map contains file paths and, optionally, a summary of each file's purpose.
- Retriever: When you ask a question, it consults the map (or uses embeddings) to find which files are relevant. For small repos, it might just include everything.
- Context Builder: Combines the user's query with the retrieved file contents into a structured prompt that fits the context window.
- Generator: The LLM produces a response: either an answer to a question or a proposed code change in diff format.
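To make the middle two stages concrete, here is a rough sketch of a Retriever and a Context Builder working over an in-memory index. It assumes plain keyword matching rather than embeddings and a character budget rather than a real token count; `IndexedFile`, `retrieve`, `buildPrompt`, and the prompt wording are all hypothetical choices made for illustration.

```typescript
// Hypothetical types and names; none of these come from a library.
interface IndexedFile {
  path: string;
  content: string;
}

// Retriever: rank files by how many query terms appear in the path or content.
function retrieve(index: IndexedFile[], query: string, limit = 3): IndexedFile[] {
  const terms = query.toLowerCase().split(/\W+/).filter((t) => t.length > 2);
  return index
    .map((file) => {
      const haystack = (file.path + "\n" + file.content).toLowerCase();
      const score = terms.filter((t) => haystack.includes(t)).length;
      return { file, score };
    })
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((entry) => entry.file);
}

// Context Builder: pack files into a prompt until a character budget is reached.
function buildPrompt(query: string, files: IndexedFile[], maxChars = 8000): string {
  const sections: string[] = [];
  let used = 0;
  for (const file of files) {
    const section = `--- ${file.path} ---\n${file.content}\n`;
    if (used + section.length > maxChars) break;
    sections.push(section);
    used += section.length;
  }
  return `Answer using only these files.\n\n${sections.join("\n")}\nQuestion: ${query}`;
}

// A two-file stand-in for the Indexer's output.
const index: IndexedFile[] = [
  { path: "src/services/user.ts", content: "export class UserService { login() {} }" },
  { path: "src/tax.ts", content: "export function calculateTax(amount: number) { return amount * 0.2; }" },
];
const query = "Write a unit test for the calculateTax function";
const hits = retrieve(index, query);
const prompt = buildPrompt(query, hits);
```

Here only `src/tax.ts` matches any query term, so it is the only file packed into the prompt. The Generator stage would then send `prompt` to the model; that call is left out because it depends on the SDK you choose.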
Technology Stack
For this project, we'll use:
| Component | Technology | Why |
|---|---|---|
| Language | TypeScript/Node.js | Easy file system access, good async support |
| LLM | Gemini 1.5 Pro | 2M token context fits most repos entirely |
| CLI Framework | Commander.js or Inquirer | Interactive prompts and command parsing |
| File Watching | chokidar (optional) | Re-index on file changes |
You can build a working prototype in under 200 lines of code. Don't over-engineer the first version—get to "it works on my repo" as fast as possible, then iterate.
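Before relying on the large context window, it helps to check whether your repo actually fits. The sketch below uses a rough rule of thumb of about 4 characters per token, which is an assumption (real tokenizers vary by model and by content), and the reply reserve of 8,192 tokens is an arbitrary illustrative value. `estimateTokens` and `fitsInContext` are hypothetical helper names.

```typescript
// Rough heuristic, assumed for illustration: about 4 characters per token.
const CHARS_PER_TOKEN = 4;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Returns true if all file contents plus a reserved reply budget fit the window.
function fitsInContext(
  contents: string[],
  contextWindow = 2_000_000, // the 2M-token window from the table above
  reservedForReply = 8_192, // arbitrary reserve for the model's answer
): boolean {
  const total = contents.reduce((sum, text) => sum + estimateTokens(text), 0);
  return total + reservedForReply <= contextWindow;
}

// 50,000 characters is roughly 12,500 tokens: fits easily.
const smallFits = fitsInContext(["a".repeat(40_000), "b".repeat(10_000)]);
// 9,000,000 characters is roughly 2,250,000 tokens: too large.
const hugeFits = fitsInContext(["x".repeat(9_000_000)]);
```

When the check fails, that is the signal to switch from "include everything" to a real retrieval step.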
What You'll Learn
By the end of this project, you'll have hands-on experience with:
- Repo traversal: Walking directories, respecting `.gitignore`, handling binary files.
- Context management: Deciding what to include when you can't fit everything.
- Structured diff generation: Getting the model to output parseable code changes.
- Safety patterns: Preventing the assistant from running dangerous commands or leaking secrets.
- Tool design: Defining clean interfaces for "read file," "write file," and "run command" tools.
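As a taste of the traversal and safety items above, here is a minimal walker built on Node's `fs` module. It makes two simplifying assumptions that you would want to replace in a real tool: a fixed skip list (`SKIP_NAMES`) stands in for proper `.gitignore` parsing, and a file is treated as binary if its first 512 bytes contain a NUL byte. `walkRepo` and `looksBinary` are hypothetical names.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Simplified stand-in for .gitignore handling: a fixed skip list (assumption).
// Skipping .env also keeps obvious secrets out of the prompt.
const SKIP_NAMES = new Set([".git", "node_modules", "dist", ".env"]);

// Heuristic: treat a file as binary if its first bytes contain a NUL byte.
function looksBinary(filePath: string): boolean {
  const fd = fs.openSync(filePath, "r");
  try {
    const buffer = Buffer.alloc(512);
    const bytesRead = fs.readSync(fd, buffer, 0, buffer.length, 0);
    return buffer.subarray(0, bytesRead).includes(0);
  } finally {
    fs.closeSync(fd);
  }
}

// Recursively collects repo-relative paths of text files, sorted.
function walkRepo(root: string, dir = root): string[] {
  const found: string[] = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    if (SKIP_NAMES.has(entry.name)) continue;
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      found.push(...walkRepo(root, fullPath));
    } else if (entry.isFile() && !looksBinary(fullPath)) {
      found.push(path.relative(root, fullPath).split(path.sep).join("/"));
    }
  }
  return found.sort();
}

// Demo on a throwaway directory so the sketch can run anywhere.
const demoRoot = fs.mkdtempSync(path.join(os.tmpdir(), "vibe-"));
fs.mkdirSync(path.join(demoRoot, "src"));
fs.mkdirSync(path.join(demoRoot, "node_modules"));
fs.writeFileSync(path.join(demoRoot, "src", "tax.ts"), "export const rate = 0.2;\n");
fs.writeFileSync(path.join(demoRoot, "node_modules", "dep.js"), "module.exports = {};\n");
fs.writeFileSync(path.join(demoRoot, ".env"), "API_KEY=secret\n");
fs.writeFileSync(path.join(demoRoot, "logo.png"), Buffer.from([0x89, 0x50, 0x00, 0x01]));
const repoFiles = walkRepo(demoRoot);
```

In the demo directory, the dependency folder, the `.env` file, and the binary image are all skipped, leaving only `src/tax.ts` in `repoFiles`.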