Home/
Part XII — Building Real Products (End-to-End Projects)/36. Project 5: Data-to-Insights Analyst Tool/36.5 Guardrails: avoid confident math errors
36.5 Guardrails: avoid confident math errors
Overview and links for this section of the guide.
On this page
The LLM Math Problem
If you ask an LLM "What is 1234 * 5678?", it might get the answer wrong. LLMs are language models, not calculators. They predict tokens, not compute arithmetic.
Common failure modes:
- Wrong last digit in multiplication
- Incorrect carry operations
- Rounding errors mentioned incorrectly
- Percentage calculations off by factors
The Code Solution
Never let the LLM do the math. Instead:
- LLM writes code that does the math
- Python executes the code (100% accurate)
- Display the computed result
// Good vs Bad approach
// ❌ BAD: Asking LLM to calculate
Prompt: "Calculate the sum of the sales column."
LLM: "The sum is approximately $1,234,567." // Might be wrong!
// ✅ GOOD: LLM writes code
Prompt: "Write code to calculate the sum of the sales column."
LLM: "df['sales'].sum()"
Python: 1234567.89 // Exactly correct
The Golden Rule of Analytics
Code is the calculator. The model is the operator. The model tells the code what to do; it never does the math itself.
Output Verification
Even with code execution, verify outputs make sense:
// output-verification.ts
interface VerificationResult {
passed: boolean;
warnings: string[];
errors: string[];
}
export function verifyNumericOutput(
result: any,
context: { columnName?: string; expectedRange?: [number, number]; rowCount?: number }
): VerificationResult {
const warnings: string[] = [];
const errors: string[] = [];
// Check for common impossible values
if (typeof result === 'number') {
if (isNaN(result)) {
errors.push('Result is NaN - calculation may have divided by zero');
}
if (!isFinite(result)) {
errors.push('Result is infinite - check for overflow');
}
if (result < 0 && context.columnName?.toLowerCase().includes('count')) {
errors.push('Negative count - this should not be possible');
}
}
// Check aggregation sanity
if (Array.isArray(result)) {
const numericValues = result.filter(r => typeof r === 'number' && isFinite(r));
const max = Math.max(...numericValues);
const min = Math.min(...numericValues);
if (max / min > 1000000) {
warnings.push('Very large range in results - verify this is expected');
}
}
// Check against expected range
if (context.expectedRange) {
const [min, max] = context.expectedRange;
if (result < min || result > max) {
warnings.push(`Result ${result} outside expected range [${min}, ${max}]`);
}
}
return {
passed: errors.length === 0,
warnings,
errors
};
}
Implementation
// math-guardrails.ts
export class MathGuardrails {
// Ensure the LLM never attempts direct calculation
sanitizePrompt(question: string): string {
const calculationPatterns = [
/calculate\s+the\s+(sum|average|mean|total)/gi,
/what\s+is\s+\d+\s*[\+\-\*\/]\s*\d+/gi,
/how\s+much\s+is/gi,
];
let sanitized = question;
for (const pattern of calculationPatterns) {
if (pattern.test(question)) {
// Rephrase to code generation
sanitized = question.replace(pattern, 'write code to calculate the $1');
}
}
return sanitized;
}
// Add verification step after execution
async analyzeWithVerification(
analyst: DataAnalyst,
schema: string,
question: string,
dataPath: string
): Promise {
// Generate and execute
const result = await analyst.analyze(schema, question, dataPath);
// Verify the output
const verification = verifyNumericOutput(result.output.data, {
columnName: this.extractTargetColumn(question),
});
if (!verification.passed) {
throw new VerificationError(verification.errors.join('; '));
}
return {
...result,
verification
};
}
// Double-check percentages
verifyPercentage(value: number, description: string): void {
if (value < 0) {
throw new Error(`${description} is negative (${value}%) - check calculation`);
}
if (value > 100 && !description.includes('growth')) {
throw new Error(`${description} exceeds 100% (${value}%) - might need normalization`);
}
}
}
Common Math Errors to Catch
| Error Type | Example | Detection |
|---|---|---|
| Division by zero | NaN or Infinity result | isNaN() or !isFinite() |
| Percentage > 100% | Market share = 150% | Range check |
| Negative counts | -5 orders | Sign check on count columns |
| Future dates | Data from 2030 | Date range validation |
| Null aggregation | sum() on all nulls | Check for None/NaN result |
// Sanity checks for common analytics outputs
function sanityCheckAnalytics(
query: string,
result: any
): string[] {
const warnings: string[] = [];
// Revenue/sales should be positive
if (query.match(/revenue|sales|income/i) && result < 0) {
warnings.push('Negative revenue detected - verify this is intentional');
}
// Counts should be integers
if (query.match(/count|number of|how many/i) && !Number.isInteger(result)) {
warnings.push('Non-integer count - rounding may have occurred');
}
// Rates should be between 0 and 1 (or 0 and 100 for percentages)
if (query.match(/rate|ratio|percentage/i)) {
if (result > 100 || result < 0) {
warnings.push('Rate/percentage outside normal range');
}
}
return warnings;
}
Always Show the Code
When displaying results, always show the calculation code. If users see df['revenue'].sum(), they can
verify it themselves. If they just see "$1.2M", they have to trust blindly.