32.4 Batch processing vs interactive mode
Comparison
| Aspect | Interactive | Batch |
|---|---|---|
| Latency | Low (seconds, real-time) | High (minutes to hours, acceptable for offline work) |
| Cost | Higher (fixed prompt overhead repeated per request) | Lower (overhead amortized; providers often discount batch jobs) |
| Use case | Chat, real-time suggestions | Reports, bulk classification, background jobs |
| Error handling | Per-request retries | A single failure can force retrying the whole batch |
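The cost row follows from per-request overhead: every interactive call re-sends the fixed prompt (system prompt, instructions), while a batch pays it once. A minimal sketch of that arithmetic; the token counts and price are illustrative assumptions, not provider quotes:

```typescript
// Illustrative cost model: fixed prompt overhead dominates small interactive calls.
interface CostParams {
  items: number;           // work items to process
  tokensPerItem: number;   // payload tokens per item
  overheadTokens: number;  // fixed prompt tokens (system prompt, instructions)
  pricePer1kTokens: number;
}

function interactiveTokens(p: CostParams): number {
  // Every request repeats the fixed overhead.
  return p.items * (p.overheadTokens + p.tokensPerItem);
}

function batchedTokens(p: CostParams): number {
  // One request: overhead paid once, payloads concatenated.
  return p.overheadTokens + p.items * p.tokensPerItem;
}

function cost(tokens: number, pricePer1k: number): number {
  return (tokens / 1000) * pricePer1k;
}

const p: CostParams = { items: 100, tokensPerItem: 10, overheadTokens: 200, pricePer1kTokens: 0.01 };
console.log(interactiveTokens(p)); // 100 × (200 + 10) = 21000 tokens
console.log(batchedTokens(p));     // 200 + 100 × 10 = 1200 tokens
```

With these assumed numbers the batch uses roughly 1/17 of the tokens; the exact ratio depends entirely on how large the fixed overhead is relative to each payload.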
Batching Strategy
```typescript
// Instead of N separate requests, batch into one.
async function batchClassify(items: string[]): Promise<string[]> {
  // Bad: N API calls, so N × latency and N × per-request prompt overhead.
  // for (const item of items) {
  //   await classify(item);
  // }

  // Good: 1 API call that classifies every item at once.
  const prompt = `
Classify each item (output a JSON array of labels):
${items.map((item, i) => `${i + 1}. ${item}`).join('\n')}
`;
  const response = await model.generate(prompt);
  return JSON.parse(response); // assumes the model returns bare JSON
}
```
Efficiency gain: 100 items at ~10 tokens each is ~1,000 payload tokens either way, but the interactive version re-sends the fixed prompt overhead 100 times, while the batch pays it once. With illustrative prices, 100 interactive requests at $0.002 each cost $0.20; one batched request at $0.005 costs $0.005, i.e. 40× cheaper.
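The trade-off noted in the table is that one malformed response can invalidate the whole batch. A common mitigation is to split work into fixed-size chunks so a failure only forces a retry of one chunk. A sketch, assuming a per-chunk call shaped like `batchClassify` above (the `classifyBatch` parameter and chunk size are illustrative):

```typescript
// Split items into fixed-size chunks so one failed call only
// costs one chunk, not the entire workload.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Hypothetical driver: one retry per failed chunk before giving up.
async function classifyAllInChunks(
  items: string[],
  classifyBatch: (chunk: string[]) => Promise<string[]>,
  chunkSize = 20,
): Promise<string[]> {
  const results: string[] = [];
  for (const c of chunk(items, chunkSize)) {
    try {
      results.push(...(await classifyBatch(c)));
    } catch {
      results.push(...(await classifyBatch(c))); // retry this chunk once
    }
  }
  return results;
}
```

Smaller chunks cost more overhead but shrink the blast radius of a bad response; the right size depends on how often your parses fail.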
When to Batch
- Batch: Processing CSV uploads, generating reports, background jobs
- Interactive: Chat, real-time suggestions, user-facing responses
- Hybrid: Buffer requests for 100ms, batch what arrives
```typescript
// Micro-batching for near-real-time: buffer requests briefly,
// then process whatever has accumulated in one call.
interface Request {
  input: string;
  resolve: (result: string) => void;
  reject: (error: unknown) => void;
}

class RequestBatcher {
  private buffer: Request[] = [];
  private timeout: NodeJS.Timeout | null = null;

  add(request: Request) {
    this.buffer.push(request);
    // The first request into an empty buffer opens the 100ms window.
    if (!this.timeout) {
      this.timeout = setTimeout(() => this.flush(), 100);
    }
  }

  async flush() {
    const batch = this.buffer;
    this.buffer = [];
    this.timeout = null;
    try {
      const results = await batchProcess(batch); // one batched model call
      batch.forEach((req, i) => req.resolve(results[i]));
    } catch (err) {
      batch.forEach((req) => req.reject(err)); // surface the failure to every caller
    }
  }
}
```
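Call sites usually want a plain promise rather than hand-building a request object with callbacks. A small wrapper sketch, assuming a batcher exposing `add` as above (the `Request` fields and `Batcher` interface here are illustrative):

```typescript
type Request = { input: string; resolve: (r: string) => void; reject: (e: unknown) => void };

// Minimal interface assumed from the batcher class above.
interface Batcher {
  add(request: Request): void;
}

// Wrap the callback-style add() in a promise so callers can simply await.
function submit(batcher: Batcher, input: string): Promise<string> {
  return new Promise((resolve, reject) => {
    batcher.add({ input, resolve, reject });
  });
}
```

Usage: `const label = await submit(batcher, "some item");` resolves when the batch containing that item flushes.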