14.4 Data flow: input → model → structured output → display
Goal: a clean, testable pipeline
AI apps become reliable when the data flow is explicit and validated. The goal here is to design your app so:
- inputs are validated before model calls,
- outputs are validated before your UI trusts them,
- failures are categorized and handled consistently,
- the whole pipeline is testable without real model calls.
Think of your app as moving from “input received” → “validated” → “model called” → “output validated” → “rendered.” Each transition can fail and must be handled.
Pipeline diagram (end-to-end)
User Input
↓
Input validation (empty? too long? encoding?)
↓
Build LLMRequest (prompt_id, prompt_version, schema_version, options)
↓
LLM wrapper call
↓
Parse + schema-validate output
↓
LLMResponse(status, result, metadata, error)
↓
Render result (CLI/web) OR render failure state
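As a concrete reference, here is a minimal Python sketch of the two data shapes the diagram moves between. The field names come from the diagram above; the concrete types and defaults are assumptions, not a prescribed implementation.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

@dataclass
class LLMRequest:
    prompt_id: str                  # e.g. "summarize"
    prompt_version: str             # e.g. "v1"
    schema_version: str             # e.g. "summarize/v1.json"
    inputs: dict[str, Any]          # article text plus optional metadata
    options: dict[str, Any] = field(default_factory=dict)  # timeouts, retry caps, model lane
    request_id: str = ""            # attached for logging and correlation

@dataclass
class LLMResponse:
    status: str                     # ok | blocked | timeout | rate_limit | invalid_output | unknown
    result: Optional[dict] = None   # schema-validated output, present when status == "ok"
    metadata: dict[str, Any] = field(default_factory=dict)  # model, latency, retries, request_id
    error: Optional[str] = None     # human-readable detail for failure states
```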
Step-by-step data flow
1) Receive input
- CLI: read from stdin or file
- Web: read from POST body
Immediately normalize whitespace and enforce max length.
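A minimal sketch of that normalization step, assuming a single character cap (`MAX_INPUT_CHARS` is an illustrative name and value):

```python
MAX_INPUT_CHARS = 20_000  # assumed cap; pick one that fits your prompt budget

def normalize_input(raw: str) -> str:
    """Collapse runs of whitespace/newlines and enforce the max length."""
    text = " ".join(raw.split())
    return text[:MAX_INPUT_CHARS]  # or reject over-long input in step 2 instead of truncating
```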
2) Validate input
Validation should be fast and deterministic. Example rules:
- empty → validation_error
- too long → validation_error (or truncate with explicit behavior)
- non-text/binary → validation_error
Important: if validation fails, do not call the model.
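A sketch of such a check, returning a reason string that the caller maps to the validation_error category (names are illustrative):

```python
from typing import Optional

MAX_INPUT_CHARS = 20_000  # assumed cap, matching the normalization step

def validate_input(text: str) -> Optional[str]:
    """Return a rejection reason, or None if the input is safe to send to the model."""
    if not text.strip():
        return "empty input"
    if len(text) > MAX_INPUT_CHARS:
        return "input too long"
    if "\x00" in text:  # crude binary/encoding guard; tighten as needed
        return "input is not valid text"
    return None
```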
3) Build the model request
This is where you “bind” the spec into an explicit request:
- prompt id and version (e.g., summarize@v1)
- schema version (e.g., summarize/v1.json)
- options (timeouts, retry caps, model selection lane)
- inputs (article text + optional metadata)
This is also where you attach a request id for logging.
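A sketch of that binding step, reusing the `LLMRequest` dataclass from the diagram section above (the prompt id, option values, and helper name are illustrative):

```python
import uuid
from typing import Any, Optional

def build_request(article_text: str, metadata: Optional[dict[str, Any]] = None) -> "LLMRequest":
    # Uses the LLMRequest dataclass from the earlier sketch.
    return LLMRequest(
        prompt_id="summarize",
        prompt_version="v1",
        schema_version="summarize/v1.json",
        inputs={"article": article_text, "metadata": metadata or {}},
        options={"timeout_s": 30, "max_retries": 2, "model_lane": "default"},
        request_id=str(uuid.uuid4()),  # attached here so every log line can be correlated
    )
```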
4) Call the LLM wrapper
The wrapper is responsible for:
- building the final prompt from templates,
- making the provider call,
- timeouts and retry policy,
- parsing and validating structured output,
- returning LLMResponse with a category and metadata.
Your app layer should not implement these details repeatedly.
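One way to keep that contract honest is to code the app layer against a small interface rather than a provider SDK. A sketch using typing.Protocol (the name LLMWrapper is illustrative):

```python
from typing import Protocol

class LLMWrapper(Protocol):
    def complete(self, request: "LLMRequest") -> "LLMResponse":
        """Build the prompt from templates, call the provider, apply timeout and
        retry policy, parse and schema-validate the output, and return a
        categorized LLMResponse. Callers only ever branch on response.status."""
        ...
```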
5) Render the result or the failure state
The UI layer should treat the response as:
- ok: render bullets, claims, caveats
- blocked: show refusal-aware UX (safe alternatives)
- timeout/rate_limit: show retry guidance
- invalid_output: show “try again” or fallback behavior
- unknown: show generic error with request id
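A sketch of that dispatch for a CLI, branching only on status (the result shape with a `bullets` list is an assumption):

```python
def render(response: "LLMResponse") -> None:
    if response.status == "ok":
        for bullet in response.result.get("bullets", []):
            print(f"- {bullet}")
    elif response.status == "blocked":
        print("The model declined this request. Try rephrasing or removing sensitive content.")
    elif response.status in ("timeout", "rate_limit"):
        print("The service is busy right now. Please retry in a moment.")
    elif response.status == "invalid_output":
        print("The model returned something unusable. Try again, or use the fallback view.")
    else:  # unknown
        request_id = response.metadata.get("request_id", "n/a")
        print(f"Something went wrong. Quote request id {request_id} when reporting this.")
```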
Where validation must happen
Two validations are non-negotiable:
- Input validation: prevents garbage-in and reduces safety risk.
- Output validation: prevents garbage-out and makes parsing reliable.
Do validation at boundaries, not inside every caller.
If you accept malformed output and “do your best,” you create hidden correctness bugs. Fail clearly or repair explicitly.
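A sketch of explicit output validation at the boundary, assuming the third-party jsonschema package and an illustrative bullet-summary schema; the caller maps either exception to the invalid_output category:

```python
import json
from jsonschema import validate  # third-party: pip install jsonschema

SUMMARY_SCHEMA = {
    "type": "object",
    "required": ["bullets"],
    "properties": {"bullets": {"type": "array", "items": {"type": "string"}}},
}

def parse_and_validate(raw: str) -> dict:
    """Fail clearly: raise on malformed JSON or a shape mismatch instead of guessing."""
    parsed = json.loads(raw)                          # raises json.JSONDecodeError
    validate(instance=parsed, schema=SUMMARY_SCHEMA)  # raises jsonschema.ValidationError
    return parsed
```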
Outcome handling (status-first)
A strong design choice: make status explicit everywhere. Example:
- Backend returns {status, result, message, request_id}
- CLI returns an exit code per status category
This turns failure modes into predictable branches instead of exceptions.
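A sketch of the CLI side of that contract: one exit code per category (the specific numbers are arbitrary; pick a convention and keep it stable):

```python
import sys

EXIT_CODES = {
    "ok": 0,
    "unknown": 1,
    "validation_error": 2,
    "blocked": 3,
    "timeout": 4,
    "rate_limit": 5,
    "invalid_output": 6,
}

def exit_with_status(status: str) -> None:
    sys.exit(EXIT_CODES.get(status, 1))  # unrecognized categories fall back to a generic failure
```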
Observability points (what to log)
Log at choke points, using metadata rather than raw content:
- input length, validation outcome
- prompt id/version, schema version
- model name, latency, retry count
- outcome category
- request id (for correlation)
If you can’t tell which prompt version produced an output, you can’t debug or reproduce. Log versions consistently.
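A sketch of a metadata-only log line at the post-call choke point, reusing the request/response shapes from the earlier sketches (field names are illustrative; note that no raw input or output text is logged):

```python
import logging

logger = logging.getLogger("llm_pipeline")

def log_outcome(request: "LLMRequest", response: "LLMResponse", input_length: int) -> None:
    logger.info(
        "llm_call input_len=%d prompt=%s@%s schema=%s status=%s latency_s=%s retries=%s request_id=%s",
        input_length,
        request.prompt_id,
        request.prompt_version,
        request.schema_version,
        response.status,
        response.metadata.get("latency_s"),
        response.metadata.get("retries"),
        request.request_id,
    )
```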
Testing the pipeline (without real model calls)
Test the pipeline by substituting the LLM wrapper with a fake:
- fake returns a valid schema object → UI renders correctly
- fake returns invalid output → app returns invalid_output state
- fake returns blocked → refusal UX works
- fake returns timeout → retry guidance works
These tests are fast and deterministic.
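A sketch of such tests in pytest style: FakeWrapper stands in for the real wrapper, and run_pipeline is a hypothetical name for your app's entry point that accepts an injected wrapper.

```python
class FakeWrapper:
    """Returns a canned LLMResponse; no network, fully deterministic."""
    def __init__(self, response: "LLMResponse"):
        self._response = response

    def complete(self, request: "LLMRequest") -> "LLMResponse":
        return self._response

def test_valid_output_reaches_the_renderer():
    fake = FakeWrapper(LLMResponse(status="ok", result={"bullets": ["one", "two"]}))
    result = run_pipeline("some article text", wrapper=fake)
    assert result.status == "ok"
    assert result.result["bullets"] == ["one", "two"]

def test_invalid_output_becomes_a_state_not_an_exception():
    fake = FakeWrapper(LLMResponse(status="invalid_output"))
    result = run_pipeline("some article text", wrapper=fake)
    assert result.status == "invalid_output"
```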
Failure injection (practice the edge cases)
Before you ship, simulate failures deliberately:
- force a timeout (set timeout to 1ms in dev)
- force invalid JSON (fake client returns malformed output)
- force rate limit category (fake client returns rate_limit)
This is how you confirm the product doesn’t fall apart under real-world conditions.
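A sketch of a failure-injecting fake that makes these drills repeatable (the mode names mirror the categories above, and the classes reuse the shapes from the earlier sketches):

```python
class FailureInjectingWrapper:
    """A fake wrapper whose failure mode is chosen per test run or dev session."""
    def __init__(self, mode: str):
        self.mode = mode  # "timeout", "rate_limit", "malformed_output", or anything else

    def complete(self, request: "LLMRequest") -> "LLMResponse":
        if self.mode == "timeout":
            return LLMResponse(status="timeout", error="simulated timeout")
        if self.mode == "rate_limit":
            return LLMResponse(status="rate_limit", error="simulated rate limit")
        if self.mode == "malformed_output":
            # simulates the parse/validate step rejecting unparseable model output
            return LLMResponse(status="invalid_output", error="simulated malformed JSON")
        return LLMResponse(status="unknown", error="simulated unclassified failure")
```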