14.1 Spec: "Summarize any article into structured bullets"
Goal and non-goals
Goal: build an app that takes article text as input and returns a structured bullet summary that is easy to display and validate.
Non-goals (for this project):
- building a full web scraper (URLs are optional and out of scope initially),
- building a full RAG system (that’s Part VIII later),
- perfect factuality beyond the provided text (we’ll focus on “grounded in input”),
- long-term storage/search of summaries (out of scope for Project 1).
This project is intentionally small so you can learn the pipeline. You can add URL fetching, storage, and search later without redesigning the core.
Inputs (what the app accepts)
Define inputs explicitly so the model doesn’t guess and your app doesn’t become a “works for me” prototype.
Primary input: article text
- Input type: UTF-8 text
- Max length: define a limit (example: 10,000–30,000 characters) and enforce it
- Whitespace: treat as insignificant; trim and normalize
Optional metadata (nice-to-have)
- title: user-provided title (optional)
- source_url: optional string (store as metadata only; do not fetch in v1)
- audience: “general”, “technical”, “executive” (optional)
URL fetching adds networking, parsing, content extraction, and policy concerns. Start with pasted text. Add URL fetching later behind explicit constraints.
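The input rules above can be enforced before any model call. The following is a minimal sketch, not a definitive implementation: `SummarizeRequest`, `InputError`, `build_request`, and the 30,000-character limit are illustrative names and values chosen for this example, and the choice to keep paragraph breaks while collapsing other whitespace is an assumption.

```python
import re
from dataclasses import dataclass
from typing import Optional

MAX_CHARS = 30_000  # assumption: top of the example range; pick your own limit
AUDIENCES = {"general", "technical", "executive"}


class InputError(ValueError):
    """Raised before any model call when the input is unusable."""


@dataclass(frozen=True)
class SummarizeRequest:
    article_text: str
    title: Optional[str] = None
    source_url: Optional[str] = None  # stored as metadata only; never fetched in v1
    audience: Optional[str] = None


def normalize_text(text: str) -> str:
    """Trim, collapse whitespace runs, and keep blank-line paragraph breaks."""
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = (" ".join(p.split()) for p in paragraphs)
    return "\n\n".join(p for p in cleaned if p)


def build_request(article_text, title=None, source_url=None, audience=None):
    """Validate and normalize raw input; raise InputError on any problem."""
    if not isinstance(article_text, str):
        raise InputError("article_text must be a string")
    text = normalize_text(article_text)
    if not text:
        raise InputError("article_text is empty")
    if len(text) > MAX_CHARS:
        raise InputError(f"article_text exceeds {MAX_CHARS} characters")
    if audience is not None and audience not in AUDIENCES:
        raise InputError(f"audience must be one of {sorted(AUDIENCES)}")
    clean_title = title.strip() if isinstance(title, str) and title.strip() else None
    return SummarizeRequest(text, clean_title, source_url, audience)
```

Because `build_request` raises before anything else runs, an empty or oversized input never reaches the model, which keeps the failure cheap and easy to test.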
Outputs (schema-first)
The output should be structured so the app can parse and display it reliably. Start with a schema that is:
- small enough to be hard to break,
- expressive enough to be useful,
- easy to validate.
Suggested v1 schema (practical)
This is a good starting point for a “structured bullets” summary:
{
  "title": "string | null",
  "summary_bullets": ["string", "..."],
  "key_entities": ["string", "..."],
  "claims": [
    {"claim": "string", "support": "string | null"}
  ],
  "caveats": ["string", "..."]
}
Output rules (important)
- Length cap: cap bullet count (example: 5–10 summary bullets).
- No invented facts: claims must be grounded in the input text.
- Use null when unknown: don’t guess missing titles.
- Strings only: keep it simple; no nested complexity beyond what you need.
If your schema can be validated with straightforward rules, your app becomes reliable. The model will still wobble sometimes; your validator makes wobble survivable.
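To make "straightforward rules" concrete, here is a hand-rolled validator for the suggested v1 schema. It is a sketch under stated assumptions: the function names are illustrative, the 5–10 bullet range is the example cap from the output rules, and rejecting unexpected keys is a design choice of this example rather than something the schema requires. A schema library could do the same job.

```python
from typing import Any, List

MIN_BULLETS, MAX_BULLETS = 5, 10  # example cap from the output rules
EXPECTED_KEYS = {"title", "summary_bullets", "key_entities", "claims", "caveats"}


def _nonempty(value: Any) -> bool:
    """True for a string that still has content after trimming."""
    return isinstance(value, str) and value.strip() != ""


def _check_string_list(data: dict, key: str, errors: List[str]) -> None:
    value = data.get(key)
    if not isinstance(value, list) or not all(_nonempty(v) for v in value):
        errors.append(f"{key} must be a list of non-empty strings")


def validate_summary(data: Any) -> List[str]:
    """Return a list of problems; an empty list means the output is valid."""
    if not isinstance(data, dict):
        return ["top level must be a JSON object"]
    errors: List[str] = []
    if set(data) != EXPECTED_KEYS:  # assumption: unknown keys are rejected
        errors.append(f"keys must be exactly {sorted(EXPECTED_KEYS)}")
    title = data.get("title")
    if title is not None and not _nonempty(title):
        errors.append("title must be a non-empty string or null")
    for key in ("summary_bullets", "key_entities", "caveats"):
        _check_string_list(data, key, errors)
    bullets = data.get("summary_bullets")
    if isinstance(bullets, list) and not MIN_BULLETS <= len(bullets) <= MAX_BULLETS:
        errors.append(f"summary_bullets must have {MIN_BULLETS}-{MAX_BULLETS} items")
    claims = data.get("claims")
    if not isinstance(claims, list):
        errors.append("claims must be a list")
    else:
        for i, item in enumerate(claims):
            ok = (
                isinstance(item, dict)
                and set(item) == {"claim", "support"}
                and _nonempty(item["claim"])
                and (item["support"] is None or _nonempty(item["support"]))
            )
            if not ok:
                errors.append(f"claims[{i}] must be {{claim: string, support: string | null}}")
    return errors
```

Returning a list of problems instead of raising on the first one makes the result easy to log and easy to feed into a repair prompt.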
Quality rules (grounding, uncertainty)
Summarization apps fail in predictable ways: hallucinated facts, overconfident tone, and missing caveats. Add explicit quality rules:
- Grounding rule: “Only use information from the provided text.”
- Uncertainty rule: “If the text doesn’t support a claim, don’t include it.”
- Caveat rule: “Include caveats/uncertainties if the text is ambiguous.”
- Blocked-content rule: if the model refuses or the request is blocked, return a status response instead of a summary.
Reliability does not come from “better prompting vibes.” It comes from explicit constraints, validation, and failure-aware UX.
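One way to keep the quality rules explicit is to store them as data and render them into every prompt. The sketch below assumes a single-string prompt; `QUALITY_RULES`, `build_prompt`, the ARTICLE markers, and the surrounding prompt wording are illustrative choices for this example, not a prescribed format.

```python
QUALITY_RULES = (
    "Only use information from the provided text.",
    "If the text doesn't support a claim, don't include it.",
    "Include caveats/uncertainties if the text is ambiguous.",
    "Do not invent facts. Use null when a value is unknown.",
)


def build_prompt(article_text: str, min_bullets: int = 5, max_bullets: int = 10) -> str:
    """Render the rules and the article into one prompt string."""
    rules = "\n".join(f"- {rule}" for rule in QUALITY_RULES)
    return (
        "Summarize the article between the ARTICLE markers.\n"
        "Return only JSON with the keys: title, summary_bullets, "
        "key_entities, claims, caveats.\n"
        f"Write {min_bullets} to {max_bullets} summary bullets.\n"
        "Rules:\n"
        f"{rules}\n"
        "<<<ARTICLE\n"
        f"{article_text}\n"
        "ARTICLE>>>"
    )
```

Keeping the rules in one tuple means a change to a rule shows up in a diff and can be tied to a prompt version.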
Acceptance criteria (mini test suite)
Write acceptance criteria as if you were writing tests. Here is a strong v1 set:
Functional criteria
- Given a non-empty article text input, the app returns JSON matching the schema.
- summary_bullets contains 5–10 bullets (configurable, but bounded).
- All strings are non-empty after trimming whitespace.
Grounding criteria
- The output always includes a caveats array, even when it is empty.
- The prompt explicitly instructs “do not invent facts” and “use null when unknown.”
- If input is too short/empty, the app returns a validation error (no model call).
Error and reliability criteria
- The app has a timeout for the model call.
- The app categorizes outcomes into: ok, blocked, timeout, rate_limit, invalid_output, auth_error, unknown.
- On invalid_output, the app either retries once with a stricter repair prompt or returns a clear error.
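The outcome categories and the retry-once rule can be sketched as a small wrapper around the model call. Everything provider-specific here is an assumption: `call_model` stands for your own client function, it is expected to enforce its own timeout and raise `TimeoutError`, and `BlockedError`, `RateLimitError`, and `AuthError` are hypothetical exceptions that your client wrapper would raise after translating its SDK's errors.

```python
import json
from enum import Enum
from typing import Callable, Optional, Tuple


class Status(str, Enum):
    OK = "ok"
    BLOCKED = "blocked"
    TIMEOUT = "timeout"
    RATE_LIMIT = "rate_limit"
    INVALID_OUTPUT = "invalid_output"
    AUTH_ERROR = "auth_error"
    UNKNOWN = "unknown"


# Hypothetical exceptions, defined here only so the sketch is self-contained.
class BlockedError(Exception): ...
class RateLimitError(Exception): ...
class AuthError(Exception): ...


_ERROR_STATUS = {
    BlockedError: Status.BLOCKED,
    RateLimitError: Status.RATE_LIMIT,
    AuthError: Status.AUTH_ERROR,
    TimeoutError: Status.TIMEOUT,
}

REPAIR_SUFFIX = "\n\nYour previous reply was not valid. Return ONLY JSON matching the schema."


def run_summary(
    call_model: Callable[[str], str],
    prompt: str,
    is_valid: Callable[[dict], bool],
) -> Tuple[Status, Optional[dict]]:
    """Call the model; on invalid output, retry once with a stricter prompt."""
    for attempt_prompt in (prompt, prompt + REPAIR_SUFFIX):
        try:
            raw = call_model(attempt_prompt)
        except Exception as exc:
            for exc_type, status in _ERROR_STATUS.items():
                if isinstance(exc, exc_type):
                    return status, None
            return Status.UNKNOWN, None
        try:
            data = json.loads(raw)
        except (TypeError, ValueError):
            continue  # not JSON: fall through to the repair attempt
        if isinstance(data, dict) and is_valid(data):
            return Status.OK, data
    return Status.INVALID_OUTPUT, None
```

Only invalid output triggers the second attempt; blocked, timeout, rate-limit, and auth outcomes return immediately so the caller can show a clear message.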
UX criteria (CLI or web)
- Success output is clearly displayed (pretty JSON or formatted bullets).
- Failure output is user-friendly (no stack traces to the user).
- Inputs are not stored by default (privacy-first baseline).
Paste these criteria into your prompts. Require the model to explain how each criterion is satisfied before you accept code.
Edge cases you must define now
These are the “silent assumption” areas that cause bugs later if you don’t define them:
- Empty input: error message and exit code / HTTP status?
- Very long input: reject, truncate, or summarize first?
- Non-article text: do you still summarize, or ask for clarification?
- Language mismatch: summarize in same language or force English?
- Profanity/sensitive content in quoted text: how do you avoid accidental safety blocks?
For v1, choose simple answers (reject or handle gracefully) and document them.
How to prompt for this project (spec → plan → scaffold)
Use your Part III patterns:
- plan first (6.1),
- constraints (6.2),
- define done (6.3),
- examples as mini tests (6.4),
- diff-only changes (7.5) once the repo exists.
A practical prompt sequence
- Prompt A: ask for spec feedback and a plan (no code).
- Prompt B: scaffold repo structure (stubs + README + schema file).
- Prompt C: implement the walking skeleton (one end-to-end path).
- Prompt D: add tests + validation + error taxonomy.
Don’t ask for the whole app in a single prompt. That produces long, brittle output and makes review impossible. Keep changes incremental and verifiable.
Ship points (what “done” means per stage)
- SP1: CLI/web app runs locally; returns a summary for a short input.
- SP2: output validated against schema; invalid outputs handled.
- SP3: failure categories implemented; timeouts/retries in place.
- SP4: prompts versioned as files; logs include prompt version.
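SP4 can be as small as one file per prompt version plus a log line that names the version. The sketch below assumes a file layout of `summarize_<version>.txt` inside a prompts directory; that naming scheme, the logger name, and both function names are illustrative.

```python
import logging
from pathlib import Path
from typing import Tuple

logger = logging.getLogger("summarizer")  # hypothetical logger name


def load_prompt(prompts_dir: Path, version: str) -> Tuple[str, str]:
    """Read one versioned prompt file and return (version, prompt text)."""
    path = prompts_dir / f"summarize_{version}.txt"  # assumed naming scheme
    return version, path.read_text(encoding="utf-8")


def log_outcome(version: str, status: str) -> None:
    """Record which prompt version produced which outcome (no input text)."""
    logger.info("summarize prompt_version=%s status=%s", version, status)
```

Logging the version and status, but not the article text, stays consistent with the privacy-first baseline of not storing inputs by default.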
Copy-paste spec templates
Template: authoritative spec block
SPEC (authoritative)
Goal:
Summarize an article into structured bullets.
Inputs:
- article_text: string (max N chars)
- title: optional string
Output (JSON schema):
- schema: summarize/v1.json
Constraints:
- Use only provided text (no invented facts)
- If unknown, use null or omit per schema
- No new dependencies unless approved
- Handle blocked/timeout/rate_limit/invalid_output outcomes
Acceptance criteria:
- Output validates against schema
- summary_bullets length 5–10
- empty input returns validation error (no model call)
END SPEC
Template: mini tests (examples)
Mini tests:
1) Input: short article (2 paragraphs)
Expected: valid JSON, 5–10 bullets
2) Input: empty string
Expected: validation error, no model call
3) Input: very long text
Expected: rejected with clear message (v1)