5.1 A one-prompt micro app: "build me a CLI calculator"

Goal and constraints

You’re going to generate a tiny CLI calculator project in a single prompt. The purpose is not “perfect math”—it’s practicing the loop: clear spec → generated code → runnable output.

Constraints (important)

Language: Python 3.x.
Dependencies: standard library only.
Security: do not use eval, exec, or shelling out.
Behavior: parse and evaluate simple arithmetic expressions.
Error handling: invalid expressions should print a clear error message to stderr and exit with a non-zero code.
Tests: include a small test suite using unittest.
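
The error-handling constraint can be sketched roughly as follows. This is a minimal illustration, not the code the model will generate: `CalcSyntaxError`, `main`, and `fake_evaluate` are hypothetical names, the exit codes (2 for invalid expressions, 1 for other errors) are taken from the prompt later on this page, and treating a wrong argument count as code 2 is an assumption.

```python
import sys


class CalcSyntaxError(ValueError):
    """Hypothetical error type for expressions the parser rejects."""


def main(argv, evaluate):
    """Run the calculator and return an exit code instead of exiting.

    Returning the code (rather than calling sys.exit here) keeps the
    function easy to call from tests.
    """
    if len(argv) != 1:
        # Assumption: a wrong argument count is treated like invalid input.
        print("usage: python -m calc EXPRESSION", file=sys.stderr)
        return 2
    try:
        result = evaluate(argv[0])
    except CalcSyntaxError as exc:
        print(f"error: {exc}", file=sys.stderr)
        return 2  # invalid expression
    except Exception as exc:  # for example ZeroDivisionError
        print(f"error: {exc}", file=sys.stderr)
        return 1  # any other failure
    print(result)
    return 0


def fake_evaluate(expression):
    """Placeholder evaluator so this sketch runs on its own."""
    if expression == "2+2":
        return 4
    if expression == "1/0":
        raise ZeroDivisionError("division by zero")
    raise CalcSyntaxError(f"cannot parse {expression!r}")
```

In a real entrypoint the last step would be something like `sys.exit(main(sys.argv[1:], evaluate))`, with `evaluate` imported from the generated package.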

Why “no eval” matters (even for a toy)

A CLI calculator that uses eval is a code execution vulnerability disguised as a demo. Building the safe habit early keeps your vibe coding from turning into accidental security debt.
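
To make the risk concrete, here is a small illustration of what an eval-based calculator accepts. `unsafe_calc` is a hypothetical name, and the payload is deliberately harmless (it only reads the current working directory).

```python
def unsafe_calc(expression):
    """What NOT to do: eval() runs any Python expression, not just arithmetic."""
    return eval(expression)  # deliberately unsafe, for demonstration only


# For ordinary input it looks like a calculator...
print(unsafe_calc("2+2"))  # prints 4

# ...but the same call runs whatever code the user supplies.
# A hostile payload could delete files or read secrets instead.
payload = "__import__('os').getcwd()"
print(type(unsafe_calc(payload)).__name__)  # prints str
```

Nothing in the second call is arithmetic, yet it runs without complaint. That is the behavior the “no eval” constraint rules out.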

Pick defaults (model, temperature, speed)

For a micro app, you want fast iteration and repeatable output:

Prefer a fast model that can generate code reliably.
Use a low-to-moderate temperature so it doesn’t invent extra features.
Keep the prompt short but strict (constraints + acceptance criteria).

Your goal is “runnable,” not “beautiful”

Beauty comes after SP1. First get something you can execute and verify.

The one prompt (copy/paste)

Paste this as a single request. It is intentionally specific so the output is easy to turn into files.

You are building a tiny Python CLI calculator project.

Requirements:
- Python 3.x, standard library only
- Do NOT use eval/exec or any dynamic code execution
- Support +, -, *, /, parentheses, and unary minus
- Input is a single expression string provided as a CLI argument
- Output: print the numeric result to stdout
- Errors: print a clear message to stderr and exit with code 2 for invalid expressions, code 1 for other errors

Project output format:
1) First, print a short file tree.
2) Then output each file in a fenced code block with the file path in the header.

Project structure:
- README.md (how to run + examples)
- calc/__init__.py
- calc/cli.py (entrypoint)
- calc/parser.py (tokenize + parse)
- calc/eval.py (evaluate AST)
- tests/test_calc.py (unittest)

Acceptance tests (must pass):
- `python -m calc "2+2"` prints `4`
- `python -m calc "2*(3+4)"` prints `14`
- `python -m calc "-3 + 5"` prints `2`
- `python -m calc "1/2"` prints `0.5`
- `python -m calc "2+*3"` exits with code 2 and prints an error to stderr

Implementation notes:
- Keep the parser simple (recursive descent is fine)
- Keep code readable over clever
- Add minimal docstrings where it helps clarity
- Ensure tests are deterministic

Now produce the project.

Why this prompt format works

You’re forcing: (1) constraints, (2) file boundaries, (3) acceptance criteria, (4) output format. That combination reduces hallucinated architecture and makes copy/paste into a repo straightforward.

What a good output looks like

Don’t judge the output by how “smart” it sounds. Judge it by whether it can become a runnable project without interpretation.

Clear files: it creates the listed files with coherent imports, plus calc/__main__.py if it needs one to make python -m calc work.
No mystery deps: it doesn’t introduce third-party libraries.
No hidden behavior: no eval, no executing user input as code.
Tests match the CLI: tests invoke the evaluator/parser directly, or call cli.main in a testable way.
Errors are explicit: invalid syntax becomes a controlled exception and exit code.
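
For orientation, here is a rough sketch of the kind of recursive-descent code a good output might contain. It is an assumption-laden illustration, not the expected output: the names `CalcSyntaxError`, `tokenize`, `Parser`, and `evaluate` are hypothetical, and for brevity it evaluates while parsing instead of building a separate AST in calc/parser.py and calc/eval.py as the prompt requests.

```python
import re


class CalcSyntaxError(ValueError):
    """Raised for input the grammar does not accept (hypothetical name)."""


# A number, or one of the supported operator/parenthesis characters.
_TOKEN = re.compile(r"\s*(\d+\.?\d*|\.\d+|[-+*/()])")


def tokenize(text):
    """Split the input into number and operator tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise CalcSyntaxError(f"unexpected character near position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """Grammar:
    expr   := term (('+' | '-') term)*
    term   := factor (('*' | '/') factor)*
    factor := '-' factor | NUMBER | '(' expr ')'
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            op, rhs = self.take(), self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            op, rhs = self.take(), self.factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def factor(self):
        token = self.take()
        if token == "-":  # unary minus
            return -self.factor()
        if token == "(":
            value = self.expr()
            if self.take() != ")":
                raise CalcSyntaxError("expected ')'")
            return value
        if token is None or token in ("+", "*", "/", ")"):
            raise CalcSyntaxError(f"unexpected token {token!r}")
        return float(token) if "." in token else int(token)


def evaluate(text):
    """Parse and evaluate one expression, rejecting trailing tokens."""
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise CalcSyntaxError(f"unexpected token {parser.peek()!r}")
    return value
```

With this sketch, `evaluate("2*(3+4)")` returns `14` and `evaluate("2+*3")` raises `CalcSyntaxError`, which is the “controlled exception” the CLI layer can map to an exit code.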

Sanity checks before you run it

Before you create files, do a 60-second audit:

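Search the output for calls to eval, exec, or subprocess. (The file name calc/eval.py from the prompt is expected; a call like eval(...) is not.)
Make sure the entrypoint is defined: python -m calc only works if the package contains calc/__main__.py. If it’s missing, you’ll fix it in 5.2.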
Check that test imports match the file tree.
Look for “creative” features you didn’t ask for (config files, extra commands, network calls).
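
The first check can be automated with a few lines. This is a sketch under stated assumptions: `audit` and `SUSPICIOUS` are hypothetical names, and the pattern list is only a starting point that will not catch every form of dynamic execution.

```python
import re

# Assumed patterns for "suspicious" code; extend them to suit your project.
SUSPICIOUS = {
    "eval call": re.compile(r"\beval\s*\("),
    "exec call": re.compile(r"\bexec\s*\("),
    "subprocess": re.compile(r"\bsubprocess\b"),
    "os.system": re.compile(r"\bos\.system\s*\("),
}


def audit(generated_text):
    """Return (line_number, label, line) for every suspicious match."""
    findings = []
    for number, line in enumerate(generated_text.splitlines(), start=1):
        for label, pattern in SUSPICIOUS.items():
            if pattern.search(line):
                findings.append((number, label, line.strip()))
    return findings
```

Paste the model's reply into a string (or read it from a file) and call `audit` on it. The word boundaries mean that `calc/eval.py` and `evaluate(...)` are not flagged, while `eval(user_input)` is.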

If something is missing, that’s normal

The point is not perfection on the first pass. The point is to generate enough structure that your next prompt can be precise: “add this file” or “fix this entrypoint.”

If the model misses (fast fixes)

Common misses and the fastest follow-up prompts:

It didn’t support `python -m calc`

Add whatever is required so that `python -m calc "2+2"` works.
Keep the current structure. Add only the minimal file(s) needed (likely `calc/__main__.py`).
Show only the new file(s) or diffs.
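
For reference, the fix is usually a three-line `calc/__main__.py`. The sketch below writes a throwaway package into a temporary directory and runs it with `python -m`, to show that `__main__.py` is the file the interpreter looks for. The stub `cli.py`, its `main(argv)` signature, and the helper name `run_demo` are assumptions for illustration; the generated project's `cli.py` may expose a different entrypoint.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Contents of calc/__main__.py: hand control to the CLI entrypoint.
MAIN_PY = """\
import sys
from calc.cli import main

sys.exit(main(sys.argv[1:]))
"""

# Stub standing in for the generated calc/cli.py (assumed signature).
CLI_PY = """\
def main(argv):
    print("got:", argv)
    return 0
"""


def run_demo():
    """Build a temporary calc package and run `python -m calc "2+2"`."""
    with tempfile.TemporaryDirectory() as tmp:
        package = Path(tmp) / "calc"
        package.mkdir()
        (package / "__init__.py").write_text("")
        (package / "cli.py").write_text(CLI_PY)
        (package / "__main__.py").write_text(MAIN_PY)
        # Running from tmp puts the package on the module search path.
        return subprocess.run(
            [sys.executable, "-m", "calc", "2+2"],
            cwd=tmp,
            capture_output=True,
            text=True,
        )
```

The subprocess call here is only part of the demonstration harness; the calculator itself still must not shell out.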

It used `eval`

Replace any use of eval/exec with a safe parser + evaluator.
Keep the public CLI behavior the same.
Show a plan, then provide a diff-only patch.

Tests are missing or don’t match

Write `unittest` tests that cover the acceptance tests from the original prompt.
Keep implementation unchanged unless required.
If required, make the smallest behavior-preserving fix and explain it.
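
As a rough guide to what the resulting tests might look like, here is a sketch that mirrors the acceptance tests from the prompt. The `evaluate` function and `CalcSyntaxError` class below are placeholders so the file runs on its own; in the real project you would delete them and import the equivalents from the generated `calc` package, whose names may differ.

```python
import unittest


class CalcSyntaxError(ValueError):
    """Hypothetical error type; use whatever the generated parser raises."""


def evaluate(expression):
    """Placeholder evaluator (a lookup table) so this sketch is runnable."""
    known = {"2+2": 4, "2*(3+4)": 14, "-3 + 5": 2, "1/2": 0.5}
    if expression not in known:
        raise CalcSyntaxError(f"cannot parse {expression!r}")
    return known[expression]


class AcceptanceTests(unittest.TestCase):
    def test_valid_expressions(self):
        cases = [("2+2", 4), ("2*(3+4)", 14), ("-3 + 5", 2), ("1/2", 0.5)]
        for expression, expected in cases:
            # subTest reports each failing expression separately.
            with self.subTest(expression=expression):
                self.assertEqual(evaluate(expression), expected)

    def test_invalid_expression_raises(self):
        with self.assertRaises(CalcSyntaxError):
            evaluate("2+*3")
```

Saved as tests/test_calc.py, a file like this can be run with `python -m unittest` from the project root.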
