45.4 It works once but not reliably

Symptom

A prompt that passes a single manual test can still fail intermittently in production. Model output is non-deterministic by default, and real traffic includes inputs that one test never exercised.
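Before applying a fix, it can help to confirm that the failures really are run-to-run variation rather than a consistent failure on particular inputs. The sketch below sends the same input several times and reports how often the outputs agree. The names `summarizeRuns`, `measureConsistency`, and `RunSummary` are hypothetical, and `run` stands in for whatever function calls your model and returns its text output.

```typescript
// Hypothetical summary of repeated runs on one input.
interface RunSummary {
  distinct: number;    // number of different outputs seen
  mostCommon: string;  // the output that occurred most often
  agreement: number;   // share of runs that produced mostCommon (0..1)
}

// Tally identical outputs and report how dominant the most common one is.
function summarizeRuns(outputs: string[]): RunSummary {
  if (outputs.length === 0) throw new Error("no outputs to summarize");
  const counts = new Map<string, number>();
  for (const out of outputs) {
    counts.set(out, (counts.get(out) ?? 0) + 1);
  }
  let mostCommon = outputs[0];
  let best = 0;
  counts.forEach((count, out) => {
    if (count > best) {
      best = count;
      mostCommon = out;
    }
  });
  return { distinct: counts.size, mostCommon, agreement: best / outputs.length };
}

// Calls `run` sequentially with the same input and summarizes the results.
// `run` is an assumption: supply your own model-calling function.
async function measureConsistency(
  run: (input: string) => Promise<string>,
  input: string,
  attempts = 10,
): Promise<RunSummary> {
  const outputs: string[] = [];
  for (let i = 0; i < attempts; i++) {
    outputs.push(await run(input));
  }
  return summarizeRuns(outputs);
}
```

An `agreement` well below 1 on a fixed input points to sampling variation. An `agreement` of 1 with a wrong answer points to a prompt problem on that input instead.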

Fixes

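// Fix 1: Lower temperature
// Request shape assumes the @google/generative-ai SDK, where `contents` is an array of messages.
const response = await model.generateContent({
  contents: [{ role: "user", parts: [{ text: prompt }] }],
  generationConfig: {
    temperature: 0,  // Least random setting; output can still vary slightly between calls
    // or 0.1-0.3 for slight variety with consistency
  }
});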

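// Fix 2: Few-shot examples covering edge cases
// The examples cover all-caps input, an incomplete sentence, and empty input.
// Keep notes like these outside the template literal: everything inside it is sent to the model.
const prompt = `
Examples:
Input: "What time is it?" → {"intent": "time_query", ...}
Input: "WHAT TIME IS IT???" → {"intent": "time_query", ...}
Input: "time?" → {"intent": "time_query", ...}
Input: "" → {"intent": "unclear", ...}

Now classify: ${JSON.stringify(userInput)}
`;
// JSON.stringify adds the surrounding quotes and escapes any quotes inside userInput.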

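// Fix 3: Build a regression test suite
const testCases = [
  { input: "hello", expected: "greeting" },
  { input: "buy product", expected: "purchase" },
  { input: "", expected: "unclear" },
  { input: "!@#$%", expected: "unclear" },
];

// Assumes classify(input) resolves to an object with an `intent` field.
async function runRegressionTests() {
  let passed = 0;
  for (const { input, expected } of testCases) {
    const result = await classify(input);
    if (result.intent === expected) {
      passed++;
    } else {
      // Report which case regressed, not only the total
      console.error(`FAIL ${JSON.stringify(input)}: expected ${expected}, got ${result.intent}`);
    }
  }
  console.log(`${passed}/${testCases.length} passed`);
  return passed === testCases.length;
}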

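// Fix 4: Retry with backoff
// isValid is your own check on the model output.
async function reliableGenerate(prompt: string, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    const result = await model.generateContent(prompt);
    if (isValid(result)) return result;
    // Exponential backoff before the next attempt: 500 ms, 1 s, 2 s, ...
    // (the 500 ms base delay is an arbitrary starting point)
    if (i < maxRetries - 1) {
      await new Promise((resolve) => setTimeout(resolve, 500 * 2 ** i));
    }
  }
  throw new Error(`Failed after ${maxRetries} attempts`);
}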
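
Fix 4 depends on an `isValid` check that this section does not define. For the intent-classification examples above, one possible shape is sketched below: parse the model's text as JSON and accept it only if `intent` is one of the expected labels. The names `ALLOWED_INTENTS`, `parseIntent`, and `isValidIntentJson` are hypothetical, the label list is collected from the examples in this section, and the sketch works on the response text, which you would first extract from the response object.

```typescript
// Hypothetical label set, collected from the examples in this section.
const ALLOWED_INTENTS = ["greeting", "purchase", "time_query", "unclear"] as const;
type Intent = (typeof ALLOWED_INTENTS)[number];

// Returns the intent if `text` is a JSON object with a known "intent" field,
// otherwise null.
function parseIntent(text: string): Intent | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(text);
  } catch {
    return null; // not JSON at all
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const intent = (parsed as { intent?: unknown }).intent;
  if (typeof intent !== "string") return null;
  return (ALLOWED_INTENTS as readonly string[]).indexOf(intent) !== -1
    ? (intent as Intent)
    : null;
}

// Boolean form, usable as the validity check inside a retry loop.
function isValidIntentJson(text: string): boolean {
  return parseIntent(text) !== null;
}
```

Rejecting unknown labels, and not only malformed JSON, means a retry is also triggered when the model invents a category that downstream code cannot handle.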

Where to go next

45.5 Too slow/expensive