31.1 Input sanitization and allowlists

Goal: reduce attack surface at the entry point

User input is untrusted. In LLM apps, “input” is more than a single string. It can include:

  • free-form user prompts,
  • uploaded files and documents,
  • copied logs, screenshots, transcripts,
  • retrieved context in RAG systems.

Your goal is to reduce the system’s attack surface by:

  • routing requests to known safe modes,
  • restricting allowed operations,
  • normalizing and validating inputs,
  • applying size and rate limits.

Input sanitization does not “solve” prompt injection

Filtering strings does not defeat injection. Sanitization reduces accidental failures and some low-effort abuse. Your real boundary is permissions, tool design, and output validation.

Principles: sanitize, constrain, and route

Three principles that work together:

  • Sanitize: normalize known quirks and remove obvious hazards.
  • Constrain: restrict what the system is allowed to do.
  • Route: classify the request into a specific mode with a specific prompt/tool set.

“Route” is important: one of the best security controls is not letting arbitrary input choose arbitrary behavior.

Allowlisting: what the system is allowed to do

Allowlisting starts with one question: which tasks are supported?

Examples of allowlisted modes:

  • summarize text,
  • extract structured fields,
  • answer questions from retrieved sources,
  • draft an email response,
  • propose code changes (proposal-only).

Requests outside allowlisted modes should be refused or routed to a safe fallback (“I can’t do that, but I can …”).

Allowlist dimensions you can enforce:

  • Allowed modes: which tasks exist.
  • Allowed tools: which tools are callable per mode.
  • Allowed outputs: which schema shapes are allowed.
  • Allowed data scopes: which tenant/docs a user can access.

Security benefit

Allowlists make behavior predictable. Predictable behavior is testable. Testable behavior is shippable.
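
The allowlist dimensions above can be kept in one explicit registry, so that a mode name is the only thing a request can select. The following is a minimal sketch, not a definitive implementation. The names `ModeSpec`, `MODES`, `resolve_mode`, `is_tool_allowed`, the prompt IDs, the schema names, and the `search_docs` tool are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModeSpec:
    """Everything a mode is allowed to use (all fields are illustrative)."""
    prompt_id: str          # which fixed prompt template this mode uses
    tools: frozenset        # tools callable in this mode (often none)
    output_schema: str      # name of the only output shape accepted


# Hypothetical registry: anything not listed here does not exist.
MODES = {
    "summarize": ModeSpec("summarize_v1", frozenset(), "summary"),
    "extract_json": ModeSpec("extract_v1", frozenset(), "extracted_fields"),
    "grounded_qa": ModeSpec("qa_v1", frozenset({"search_docs"}), "cited_answer"),
    "draft_reply": ModeSpec("reply_v1", frozenset(), "email_draft"),
}


def resolve_mode(requested: str) -> Optional[ModeSpec]:
    """Return the spec for an allowlisted mode, or None for anything else."""
    return MODES.get(requested)


def is_tool_allowed(mode: str, tool: str) -> bool:
    """A tool call is allowed only if the mode exists and lists that tool."""
    spec = MODES.get(mode)
    return spec is not None and tool in spec.tools
```

Because the registry is plain data, it can be exercised directly in unit tests, for example by asserting that `is_tool_allowed("summarize", "search_docs")` is false.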

Sanitization: what to normalize and what to reject

Sanitization is mostly about stability and safe handling, not “clever defense.”

Normalize

  • trim whitespace and normalize line endings,
  • normalize Unicode to a single form where possible and strip invisible characters such as zero-width spaces,
  • remove obviously non-text payload wrappers (if expected),
  • standardize time zones and dates if your app expects them.
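
The first two normalization steps can be sketched with the Python standard library. This is an illustration under assumptions: the choice of NFKC and the exact set of characters stripped are mine, not the guide's. NFKC also folds compatibility characters (ligatures, full-width letters), which may not suit every app, and the zero-width joiners U+200C and U+200D are deliberately kept because some scripts and emoji sequences need them.

```python
import re
import unicodedata

# Assumed set of invisible characters to strip: zero-width space, bidi
# marks/embeddings/isolates, word joiner, and the byte-order mark.
_INVISIBLE = re.compile("[\u200b\u200e\u200f\u202a-\u202e\u2060\u2066-\u2069\ufeff]")


def normalize_text(raw: str) -> str:
    """Normalize Unicode form, line endings, and surrounding whitespace."""
    text = unicodedata.normalize("NFKC", raw)
    # Convert Windows (\r\n) and old Mac (\r) line endings to \n.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    text = _INVISIBLE.sub("", text)
    return text.strip()
```

Running normalization before size checks and secret detection means later steps see one consistent representation of the text.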

Reject or quarantine

  • inputs exceeding size limits,
  • unsupported file types,
  • inputs that contain suspected secrets (optional: block or force redaction),
  • inputs that trigger policy categories you do not support.

For “suspected secrets,” a safe product posture is:

  • stop and ask the user to redact, or
  • run deterministic redaction and proceed only with redacted text.
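
Deterministic redaction can be as simple as a fixed list of patterns applied in order. The sketch below is illustrative only: the three patterns (an AWS-style access key ID, a PEM private key block, and a bearer token) are assumptions about what a given app might care about, and the names `SECRET_PATTERNS` and `redact_secrets` are hypothetical. Pattern matching will miss secrets it has no pattern for, so treat it as a safety net rather than a guarantee.

```python
import re

# Illustrative patterns only; a real deployment would maintain its own list.
SECRET_PATTERNS = [
    ("aws_access_key_id", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("private_key_block", re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL,
    )),
    ("bearer_token", re.compile(r"\bBearer\s+[A-Za-z0-9._~+/-]{20,}=*")),
]


def redact_secrets(text: str):
    """Replace each match with a labeled placeholder.

    Returns (redacted_text, labels_found) so the caller can either proceed
    with the redacted text or stop and ask the user to redact.
    """
    found = []
    for label, pattern in SECRET_PATTERNS:
        text, count = pattern.subn(f"[REDACTED:{label}]", text)
        if count:
            found.append(label)
    return text, found
```

Returning the list of labels lets the product choose between the two postures above: block when the list is non-empty, or continue using only the redacted text.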

Limits: size, rate, and complexity caps

Limits prevent cost spikes and abuse:

  • Length cap: max input size and max context included.
  • Rate limiting: requests per minute per user/tenant.
  • Complexity cap: max retrieved chunks, max tool calls, max output length.
  • Timeouts: end-to-end request deadlines.

Limits are security and reliability controls at the same time.
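
As a rough illustration of the length cap and per-user rate limit, here is an in-memory sketch. The limit value, the class name `SlidingWindowLimiter`, and the function `check_size` are assumptions for the example. A production system would more likely enforce rate limits in a shared store or at the gateway, since this version only works within a single process.

```python
import time
from collections import defaultdict, deque

MAX_INPUT_CHARS = 20_000  # assumed cap; tune per feature


def check_size(text: str, limit: int = MAX_INPUT_CHARS) -> bool:
    """True if the input is within the hard length cap."""
    return len(text) <= limit


class SlidingWindowLimiter:
    """Allow at most max_requests per key within the last window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float,
                 clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so tests can control time
        self._hits = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

The key can be a user ID, a tenant ID, or both, depending on which scope the limit is meant to protect.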

RAG-specific input concerns

In RAG systems, “input” includes retrieved documents.

Controls:

  • Corpus allowlists: limit which sources can be retrieved for high-risk features.
  • Permission filters: enforce user scope before retrieval.
  • Content-type routing: prefer canonical docs over tickets/chats.
  • Injection detection signal: tag suspicious chunks for analysis (but don’t rely on tagging as the only defense).
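
The controls above can be sketched as a single filter over candidate chunks. Everything named here is hypothetical: the `Chunk` fields, the source names in `ALLOWED_SOURCES`, and the phrase list used for tagging. In a real system the tenant and source filters would normally be pushed into the retrieval query itself so that out-of-scope chunks are never fetched; the function below only shows the predicates.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    source: str        # e.g. which corpus the chunk came from
    tenant_id: str     # owner of the underlying document
    suspicious: bool = False


# Assumed corpus allowlist for a high-risk feature.
ALLOWED_SOURCES = {"canonical_docs", "kb_articles"}

# A crude signal for analysis only; easy to evade, so never the sole defense.
_INJECTION_HINTS = re.compile(
    r"ignore (all |any )?(previous|prior|above) instructions"
    r"|you are now|system prompt",
    re.IGNORECASE,
)


def filter_and_tag(chunks, user_tenant_id: str):
    """Keep only in-scope, allowlisted chunks and tag suspicious ones."""
    kept = []
    for chunk in chunks:
        if chunk.tenant_id != user_tenant_id:
            continue  # permission filter
        if chunk.source not in ALLOWED_SOURCES:
            continue  # corpus allowlist
        chunk.suspicious = bool(_INJECTION_HINTS.search(chunk.text))
        kept.append(chunk)
    return kept
```

Tagged chunks are kept rather than dropped in this sketch, so downstream code can decide whether to exclude them, log them, or route them for review.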

Copy-paste prompts (classification and routing)

Prompt: classify request into a safe mode

Classify this user request into one of the allowed modes.

Allowed modes:
- summarize
- extract_json
- grounded_qa
- draft_reply
- not_supported

Rules:
- If ambiguous, choose "not_supported" and ask one clarifying question.
- Do not execute anything.

Return JSON:
{ "mode": string, "reason": string, "clarifying_question": string|null }

User request:
"""..."""

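The classifier's reply is itself model output, so it should be validated before it selects a mode. The sketch below assumes the JSON shape and mode names from the prompt above; the function name `parse_route` and the fallback reason text are hypothetical. Any reply that fails to parse or names an unknown mode is treated as "not_supported".

```python
import json

ALLOWED_MODES = {
    "summarize", "extract_json", "grounded_qa", "draft_reply", "not_supported",
}


def _fallback() -> dict:
    return {
        "mode": "not_supported",
        "reason": "invalid classifier output",
        "clarifying_question": None,
    }


def parse_route(raw: str) -> dict:
    """Parse and validate the classifier reply; fall back on any problem."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return _fallback()
    if not isinstance(data, dict):
        return _fallback()
    mode = data.get("mode")
    reason = data.get("reason")
    question = data.get("clarifying_question")
    if not isinstance(mode, str) or mode not in ALLOWED_MODES:
        return _fallback()
    if not isinstance(reason, str):
        return _fallback()
    if question is not None and not isinstance(question, str):
        return _fallback()
    # Return only the expected keys so extra fields cannot leak downstream.
    return {"mode": mode, "reason": reason, "clarifying_question": question}
```

Failing closed here means a manipulated or malformed classification cannot invent a mode that the rest of the system never defined.
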
Practical checklist

  • Allowlist modes: define supported tasks and reject the rest.
  • Normalize input: handle whitespace, Unicode, and expected formats.
  • Cap size: hard limits on input/context/output.
  • Rate limit: per user/tenant.
  • Redact: detect and remove secrets/PII where required.
  • Route: map each mode to a specific prompt and tool set.
