Home/
Part XIII — Expert Mode: Systems, Agents, and Automation/42. Fine-Tuning vs Prompting vs Retrieval (Decision Framework)/42.4 Evaluation before and after tuning
42.4 Evaluation before and after tuning
Overview and links for this section of the guide.
On this page
Evaluation Strategy
// Always evaluate BEFORE fine-tuning to establish baseline
const baseline = await evaluateModel(baseModel, testSet);
// Train
const fineTuned = await trainModel(baseModel, trainingData);
// Evaluate after
const afterTuning = await evaluateModel(fineTuned, testSet);
// Compare
console.log('Improvement:', {
accuracy: afterTuning.accuracy - baseline.accuracy,
format: afterTuning.formatCompliance - baseline.formatCompliance
});
Key Metrics
- Task accuracy: Does it get the right answer?
- Format compliance: Does it match the expected schema?
- General capability: Did we regress on other tasks?