2.2 When vibe coding is great vs when it's dangerous
The core idea: match method to risk
Vibe coding is a force multiplier. Like any force multiplier, it can amplify both good and bad decisions. The skill is knowing when to use it aggressively, and when to slow down and add guardrails.
The simplest framing:
- When feedback is fast (tests, quick runs), vibe coding is great.
- When feedback is slow or ambiguous (security, production incidents, complex migrations), vibe coding is dangerous unless you add strong controls.
Vibe coding is safe when you can get to truth quickly. If you can’t verify quickly, your risk goes up.
When vibe coding is great
Vibe coding shines when the problem is well-scoped and verifiable.
Great-fit patterns
- Scaffolding and boilerplate: project setup, folder structure, wiring, initial configs.
- CRUD + glue code: predictable patterns where correctness is easy to test.
- Refactors with tests: rename, extract modules, reorganize code, improve readability.
- Debugging with strong evidence: you have a stack trace, repro steps, failing tests.
- Documentation from code: generating drafts you will validate against the repo.
- Consistency work: formatting, lint fixes, repetitive changes across files.
Why it works here
- Small diffs are possible.
- Ground truth exists (tests, output, schemas).
- Failure cost is low and rollback is easy.
- Constraints can be stated clearly (“don’t touch X,” “use schema Y”).
The common thread: these are tasks where you can say, “If this works, this test passes, this command outputs X, or this UI flow succeeds.”
When vibe coding is dangerous
Vibe coding becomes risky when you can’t easily verify correctness, or when mistakes have a high blast radius.
High-risk zones
- Security and auth: permissions, data access control, secrets handling, encryption.
- Payments and money movement: idempotency, retries, fraud, reconciliation.
- Data deletion and migrations: schema changes, irreversible transforms, multi-tenant data.
- Production incident response: incomplete info, high pressure, hidden constraints.
- Complex distributed systems: concurrency, consistency, partial failures, backpressure.
- Regulated domains: compliance, audit trails, safety requirements.
Why it’s dangerous
- Verification is slow: you can’t run the full system locally or reproduce reliably.
- Hidden constraints: production realities aren’t fully captured in the prompt.
- Hallucinations are costly: an invented API or wrong assumption can cause real damage.
- Rollback is hard: data changes and security mistakes can be irreversible or high impact.
The compounding factor is that AI increases throughput in both directions: if your process doesn’t include safety gates, you ship mistakes faster too.
A simple risk matrix you can actually use
Use two axes to decide how strict you should be:
- Blast radius: how bad is it if this change is wrong?
- Time-to-verify: how quickly can you prove it’s correct?
Low blast radius + fast verification
Green zone: vibe code aggressively.
- Small diffs, quick tests, easy rollback.
- Optimize for speed and iteration.
High blast radius + fast verification
Yellow zone: vibe code with guardrails.
- Add tests and explicit acceptance criteria.
- Require diff-only changes, reviews, and a staged rollout where relevant.
Low blast radius + slow verification
Yellow zone: vibe code, but keep scope tiny.
- Prefer exploration, then implement only what you can validate.
- Use spikes or prototypes to make verification faster.
High blast radius + slow verification
Red zone: do not vibe code on autopilot.
- Use the model as a planning assistant, not an implementer.
- Require expert review, strong gates, and explicit rollback plans.
- Break work into smaller, verifiable slices before writing code.
You can still vibe code, but you must add process: tests, checks, reviews, and controlled scope.
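If it helps to make the decision explicit, here is a minimal sketch of the matrix as a lookup table. The zone labels mirror the matrix above; how you score blast radius and time-to-verify for a given change is up to you.

```python
# Sketch of the two-axis matrix as a lookup table.
# Zone labels mirror the matrix above; scoring the axes is your call.

def risk_zone(blast_radius: str, time_to_verify: str) -> str:
    """blast_radius: 'low' | 'high'; time_to_verify: 'fast' | 'slow'."""
    zones = {
        ("low", "fast"): "green",    # vibe code aggressively
        ("high", "fast"): "yellow",  # guardrails: tests, reviews, staged rollout
        ("low", "slow"): "yellow",   # tiny scope; spike to speed up verification
        ("high", "slow"): "red",     # planning mode, gates, rollback plans
    }
    return zones[(blast_radius, time_to_verify)]

assert risk_zone("high", "slow") == "red"
```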
Mitigations (how to vibe code safely)
When you’re in yellow/red zones, the model can still help, but you must change how you use it.
1) Reduce scope and enforce boundaries
- Allowlist files that can change.
- Require “diff-only changes.”
- Freeze interfaces unless explicitly migrating.
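One way to make the allowlist mechanical rather than aspirational is a small check you run before committing. This is a sketch: the `ALLOWED` prefixes are placeholders, and it assumes changes are staged in git.

```python
# Sketch: fail fast if a change touches files outside an allowlist.
# ALLOWED is a placeholder; adapt the prefixes to your repo.
import subprocess
import sys

ALLOWED = ("src/cli/", "tests/")  # hypothetical allowlisted paths

def changed_files() -> list[str]:
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only"],
        capture_output=True, text=True, check=True,
    )
    return [line for line in out.stdout.splitlines() if line]

violations = [f for f in changed_files() if not f.startswith(ALLOWED)]
if violations:
    print("Files outside the allowlist:", *violations, sep="\n  ")
    sys.exit(1)
```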
2) Increase evidence
- Add regression tests before the fix.
- Use schema validation for structured outputs.
- Add logs/metrics when debugging production-like behavior.
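For structured outputs, validation can be as simple as rejecting anything that does not match a schema. A sketch using the jsonschema package, with a made-up schema:

```python
# Sketch: validate structured model output before acting on it.
# The schema is a made-up example; requires the `jsonschema` package.
from jsonschema import ValidationError, validate

SCHEMA = {
    "type": "object",
    "properties": {
        "user_id": {"type": "integer"},
        "action": {"enum": ["create", "update", "delete"]},
    },
    "required": ["user_id", "action"],
}

def check_output(payload: dict) -> bool:
    try:
        validate(instance=payload, schema=SCHEMA)
        return True
    except ValidationError as err:
        print(f"Rejected model output: {err.message}")
        return False

assert check_output({"user_id": 42, "action": "update"})
```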
3) Split roles: planner vs implementer
Use one prompt for planning and one for execution:
- Planner prompt: options, tradeoffs, risks, verification strategy.
- Executor prompt: minimal diff to implement the chosen plan.
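In practice this can be as plain as two prompt templates kept side by side. The wording below is illustrative, not a canonical phrasing:

```python
# Sketch: separate templates so execution is constrained by a plan
# you have already reviewed. Wording is illustrative.

PLANNER_PROMPT = """You are a planning assistant. Do NOT write code.
Task: {task}
Give me 2-3 options with tradeoffs, the main risks of each,
and how I would verify each option."""

EXECUTOR_PROMPT = """Implement ONLY the plan below as a minimal diff.
Do not touch files outside: {allowed_files}
Plan (already reviewed):
{plan}
End with the exact command I should run to verify the change."""
```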
4) Add gates for dangerous actions
- Human approval for data deletion, secrets changes, permission changes.
- Staged rollout (feature flags, canaries) when applicable.
- Explicit rollback plan before deploying.
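A gate does not have to be elaborate; even an explicit confirmation step in front of the dangerous call changes the failure mode. A sketch, where `delete_tenant_data` is a hypothetical stand-in for whatever you are gating:

```python
# Sketch: a manual approval gate plus a dry-run default in front of a
# destructive action. `delete_tenant_data` is hypothetical.

def require_approval(action: str) -> None:
    answer = input(f"About to run: {action!r}. Type 'yes' to proceed: ")
    if answer.strip().lower() != "yes":
        raise SystemExit("Aborted: approval not given.")

def delete_tenant_data(tenant_id: str, dry_run: bool = True) -> None:
    require_approval(f"delete all data for tenant {tenant_id}")
    if dry_run:
        print(f"Dry run: would delete rows for tenant {tenant_id}")
        return
    # Real deletion goes here, behind both the gate and the dry-run default.
```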
5) Make verification explicit and mandatory
- Always ask: “What should I run to verify?”
- Always run it.
- Feed back the exact results.
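Feeding back exact results matters more than summarizing them. A sketch of a tiny helper that runs the verification command and captures its output verbatim; pytest is just an example command:

```python
# Sketch: run the suggested verification command and capture the exact
# output to paste back. pytest is an example; use whatever applies.
import subprocess

def run_and_capture(cmd: list[str]) -> str:
    result = subprocess.run(cmd, capture_output=True, text=True)
    report = (
        f"$ {' '.join(cmd)}\n"
        f"exit code: {result.returncode}\n"
        f"{result.stdout}{result.stderr}"
    )
    print(report)
    return report  # paste this back verbatim, not a summary

run_and_capture(["pytest", "tests/", "-q"])
```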
When in doubt: make the loop smaller and add one verification step. That usually resolves the uncertainty quickly.
Concrete examples (great vs dangerous)
Great: “Add a CLI flag and tests”
- Blast radius: low (local tool).
- Time-to-verify: fast (run the CLI + tests).
- Approach: vibe code aggressively; enforce small diffs.
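For a sense of scale, the whole green-zone task might look like this sketch (names are illustrative):

```python
# Sketch of the green-zone task: a new --verbose flag plus a test that
# verifies it. Names are illustrative.
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="mytool")
    parser.add_argument("--verbose", action="store_true",
                        help="print extra diagnostics")
    return parser

def test_verbose_flag():
    args = build_parser().parse_args(["--verbose"])
    assert args.verbose is True
```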
Yellow: “Refactor auth middleware in a web app”
- Blast radius: high (login breaks = outage).
- Time-to-verify: medium (integration tests, staging).
- Approach: add tests, use diff-only, do a staged rollout, review carefully.
Red: “Write a database migration that deletes data”
- Blast radius: extremely high (irreversible loss).
- Time-to-verify: slow (production data realities).
- Approach: use the model for planning/checklists, but require human review, dry runs, backups, and explicit rollback strategy.
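Structurally, that often means the destructive step is opt-in and the default run only reports what it would do. A sketch assuming a DB-API style connection (table and column names are hypothetical):

```python
# Sketch: destructive migration with a dry-run default. Assumes a
# DB-API connection (e.g. psycopg2); names are hypothetical.

def delete_stale_sessions(conn, cutoff: str, dry_run: bool = True) -> int:
    cur = conn.cursor()
    cur.execute("SELECT count(*) FROM sessions WHERE last_seen < %s", (cutoff,))
    n = cur.fetchone()[0]
    print(f"{'Would delete' if dry_run else 'Deleting'} {n} rows")
    if not dry_run:
        # Only after: backup verified, plan reviewed, rollback documented.
        cur.execute("DELETE FROM sessions WHERE last_seen < %s", (cutoff,))
        conn.commit()
    return n
```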
Red: “Production incident response with partial logs”
- Blast radius: high (customer impact).
- Time-to-verify: variable and stressful.
- Approach: use the model to generate hypotheses and queries; rely on tools/logs as truth; minimize risky changes.
If you can’t articulate how you verified a change, you shouldn’t ship it, especially in yellow/red zones.
Rules of thumb (fast decisions)
- If you can’t test it, don’t automate it. Reduce scope until you can.
- High risk + low evidence = planning mode. Ask for options, not code.
- When stakes are high, make diffs smaller. Big changes hide bugs.
- Prefer adding guardrails over adding prompts. Tests, schemas, budgets beat clever wording.
- When uncertain, create a repro. A minimal failing case is better than a long conversation.
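A minimal failing case can be a single self-contained test. In this sketch, `parse_duration` is a buggy hypothetical stand-in, inlined so the repro runs on its own:

```python
# Sketch: a minimal repro as one failing test. `parse_duration` is a
# hypothetical stand-in, inlined so the repro is self-contained.
import pytest

def parse_duration(text: str) -> int:
    # Buggy stand-in: accepts negative durations it should reject.
    return int(text.rstrip("s"))

def test_rejects_negative_duration():
    with pytest.raises(ValueError):
        parse_duration("-90s")  # fails today: returns -90 instead of raising
```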
Most of your work should be structured as a series of green-zone loops—even if the overall project is risky. Break it down until each step is verifiable.