36.4 Reproducible "analysis notebooks" from AI output
Why Notebooks?
AI analysis can be a black box. Users see a chart but don't know how it was made. This creates trust issues:
- "Is this calculation correct?"
- "Can I verify this with my data scientist?"
- "How do I reproduce this next month?"
Solution: Export every analysis as a Jupyter notebook (.ipynb) file. This turns the AI from a black box into a draft generator that humans can audit and modify.
The Notebook Artifact
A generated notebook includes:
- Markdown cells explaining the logic ("First, I filtered for Q3...")
- Code cells with the actual Python code
- Outputs (charts and tables) stored with the code cell that produced them
- A data-loading cell showing how to rerun the analysis on fresh data
// notebook-structure.json
{
  "nbformat": 4,
  "nbformat_minor": 4,
  "metadata": {
    "kernelspec": { "name": "python3", "display_name": "Python 3" }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": ["# Sales Analysis Report\n", "Generated: 2024-01-15\n"]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": ["## Question\n", "\"Show me total sales by region for Q3 2023\""]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "execution_count": null,
      "outputs": [],
      "source": [
        "import pandas as pd\n",
        "import matplotlib.pyplot as plt\n",
        "\n",
        "# Load your data\n",
        "df = pd.read_csv('your_data.csv')"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": ["## Analysis\n", "Filtering data to Q3 2023 and aggregating by region."]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "execution_count": null,
      "outputs": [],
      "source": [
        "# Filter to Q3 2023\n",
        "df['date'] = pd.to_datetime(df['date'])\n",
        "q3_data = df[(df['date'] >= '2023-07-01') & (df['date'] < '2023-10-01')]\n",
        "\n",
        "# Aggregate by region\n",
        "sales_by_region = q3_data.groupby('region')['sales'].sum()"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "execution_count": null,
      "source": [
        "# Create visualization\n",
        "plt.figure(figsize=(10, 6))\n",
        "sales_by_region.plot(kind='bar')\n",
        "plt.title('Q3 2023 Sales by Region')\n",
        "plt.ylabel('Total Sales ($)')\n",
        "plt.show()"
      ],
      "outputs": [{"output_type": "display_data", "metadata": {}, "data": {"image/png": "..."}}]
    }
  ]
}
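The data-loading cell is what lets a user rerun the same analysis on next month's file. A minimal sketch of building that cell with a configurable path follows; `buildSetupCell`, `CodeCell`, and the `DATA_PATH` variable are hypothetical names for illustration, not part of the original generator.

```typescript
// Hypothetical helper: builds the data-loading cell for a generated notebook.
interface CodeCell {
  cell_type: 'code';
  metadata: Record<string, unknown>;
  execution_count: number | null;
  outputs: unknown[];
  source: string[];
}

function buildSetupCell(dataPath: string): CodeCell {
  // JSON.stringify yields a double-quoted literal whose escapes (\\, \", \n)
  // are also valid inside a Python string, so unusual paths do not break the cell.
  const pathLiteral = JSON.stringify(dataPath);
  return {
    cell_type: 'code',
    metadata: {},
    execution_count: null, // the cell has not been run yet
    outputs: [],
    source: [
      '# Load your data: point DATA_PATH at a fresh export to rerun the analysis\n',
      'import pandas as pd\n',
      '\n',
      `DATA_PATH = ${pathLiteral}\n`,
      'df = pd.read_csv(DATA_PATH)',
    ],
  };
}
```

Keeping the path in a single variable at the top of the notebook means a reviewer changes one line to reproduce the analysis on new data.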
Implementation
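// notebook-generator.ts
// Field names follow the Jupyter notebook format (nbformat) v4 schema.
interface NotebookCell {
  cell_type: 'markdown' | 'code';
  source: string[];
  metadata: Record<string, unknown>; // required on every cell
  // Code cells only: the schema requires both fields, even when empty.
  execution_count?: number | null;
  outputs?: Record<string, unknown>[];
}

interface JupyterNotebook {
  nbformat: number;
  nbformat_minor: number;
  metadata: {
    kernelspec: { name: string; display_name: string };
  };
  cells: NotebookCell[];
}

export class NotebookGenerator {
  generateNotebook(
    question: string,
    code: string,
    analysisSteps: string[],
    outputs: { type: string; data: any }[]
  ): JupyterNotebook {
    const cells: NotebookCell[] = [];

    // Title
    cells.push({
      cell_type: 'markdown',
      metadata: {},
      source: [
        '# AI-Generated Analysis\n',
        `Generated: ${new Date().toISOString()}\n`,
        '\n',
        '> Edit this notebook to customize or verify the analysis.\n'
      ]
    });

    // Original question
    cells.push({
      cell_type: 'markdown',
      metadata: {},
      source: [
        '## Original Question\n',
        `"${question}"\n`
      ]
    });

    // Setup cell
    cells.push({
      cell_type: 'code',
      metadata: {},
      execution_count: null,
      outputs: [],
      source: [
        '# Setup - modify the path to your data\n',
        'import pandas as pd\n',
        'import numpy as np\n',
        'import matplotlib.pyplot as plt\n',
        '\n',
        'df = pd.read_csv("your_data.csv") # <- Update this path\n'
      ]
    });

    // Analysis steps with explanation
    cells.push({
      cell_type: 'markdown',
      metadata: {},
      source: ['## Analysis Steps\n', ...analysisSteps.map(s => `- ${s}\n`)]
    });

    // Main analysis code: every line except the last keeps its trailing newline
    const codeLines = code.split('\n');
    cells.push({
      cell_type: 'code',
      metadata: {},
      execution_count: null,
      source: codeLines.map((line, i) => (i < codeLines.length - 1 ? line + '\n' : line)),
      outputs: outputs.map(o => this.formatOutput(o))
    });

    return {
      nbformat: 4,
      // nbformat 4.5+ requires a unique `id` on every cell.
      // These cells carry no ids, so the notebook declares 4.4.
      nbformat_minor: 4,
      metadata: {
        kernelspec: { name: 'python3', display_name: 'Python 3' }
      },
      cells
    };
  }

  private formatOutput(output: { type: string; data: any }): Record<string, unknown> {
    if (output.type === 'chart') {
      // Assumes output.data is a base64-encoded PNG string (no `data:` URL prefix)
      return {
        output_type: 'display_data',
        metadata: {},
        data: { 'image/png': output.data }
      };
    }
    if (output.type === 'table') {
      return {
        output_type: 'execute_result',
        execution_count: null,
        metadata: {},
        data: { 'text/html': this.tableToHtml(output.data) }
      };
    }
    return {
      output_type: 'stream',
      name: 'stdout',
      text: [JSON.stringify(output.data)]
    };
  }

  // Assumption: table data is an array of flat row objects,
  // e.g. [{ region: 'West', sales: 1200 }]. Adapt this to your own table shape.
  private tableToHtml(rows: Record<string, unknown>[]): string {
    if (rows.length === 0) return '<table></table>';
    const escape = (v: unknown) =>
      String(v).replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
    const columns = Object.keys(rows[0]);
    const head = columns.map(c => `<th>${escape(c)}</th>`).join('');
    const body = rows
      .map(r => `<tr>${columns.map(c => `<td>${escape(r[c])}</td>`).join('')}</tr>`)
      .join('');
    return `<table><thead><tr>${head}</tr></thead><tbody>${body}</tbody></table>`;
  }

  exportToFile(notebook: JupyterNotebook): Blob {
    const content = JSON.stringify(notebook, null, 2);
    return new Blob([content], { type: 'application/json' });
  }
}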
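Jupyter may refuse to open, or may warn about, a notebook that is missing fields the format requires, so it can help to run a quick structural check before offering the download. The sketch below covers only a handful of nbformat v4 rules, not the full JSON schema; `findNotebookProblems` and the `Loose*` types are hypothetical names introduced here.

```typescript
// Hypothetical pre-export check. It covers a few nbformat v4 rules,
// not the full JSON schema.
interface LooseCell {
  id?: unknown;
  cell_type?: unknown;
  source?: unknown;
  metadata?: unknown;
  outputs?: unknown;
  execution_count?: unknown;
}

interface LooseNotebook {
  nbformat?: unknown;
  nbformat_minor?: unknown;
  metadata?: unknown;
  cells?: unknown;
}

const isObject = (v: unknown): boolean => typeof v === 'object' && v !== null;

function findNotebookProblems(nb: LooseNotebook): string[] {
  const problems: string[] = [];
  if (nb.nbformat !== 4) problems.push('nbformat must be 4');
  if (!isObject(nb.metadata)) problems.push('notebook metadata must be an object');
  if (!Array.isArray(nb.cells)) {
    problems.push('cells must be an array');
    return problems;
  }
  // Cell ids became mandatory in nbformat 4.5.
  const needsIds = typeof nb.nbformat_minor === 'number' && nb.nbformat_minor >= 5;
  (nb.cells as LooseCell[]).forEach((cell, i) => {
    if (!isObject(cell.metadata)) problems.push(`cell ${i}: metadata must be an object`);
    if (typeof cell.source !== 'string' && !Array.isArray(cell.source)) {
      problems.push(`cell ${i}: source must be a string or an array of strings`);
    }
    if (needsIds && typeof cell.id !== 'string') {
      problems.push(`cell ${i}: id is required from nbformat 4.5`);
    }
    if (cell.cell_type === 'code') {
      if (!Array.isArray(cell.outputs)) {
        problems.push(`cell ${i}: code cells need an outputs array`);
      }
      if (cell.execution_count !== null && typeof cell.execution_count !== 'number') {
        problems.push(`cell ${i}: execution_count must be a number or null`);
      }
    }
  });
  return problems;
}
```

An empty result means none of these particular checks failed. For a complete check, validating against the official nbformat JSON schema is the more reliable route.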
Export Options
Provide multiple export formats:
| Format | Use Case | Implementation |
|---|---|---|
| .ipynb | Full reproducibility in Jupyter | Native notebook JSON |
| .py | Run as Python script | Extract code cells only |
| .html | Share with non-technical users | nbconvert or custom render |
| .pdf | Print or email reports | nbconvert with LaTeX |
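For the nbconvert-based rows, one option is to run the conversion on a server where Jupyter is installed. The sketch below only builds the argument list for the `jupyter` command; `nbconvertArgs` and `ConvertOptions` are hypothetical names, and running the conversion server-side is an assumption rather than something the guide specifies.

```typescript
// Hypothetical helper: builds arguments for `jupyter nbconvert`.
// `--to html`, `--to pdf`, `--to script`, and `--no-input` are nbconvert options;
// PDF output additionally needs a TeX installation on the machine.
type ConvertFormat = 'html' | 'pdf' | 'script';

interface ConvertOptions {
  hideCode?: boolean; // omit code cells, useful for non-technical readers
}

function nbconvertArgs(
  format: ConvertFormat,
  notebookPath: string,
  options: ConvertOptions = {}
): string[] {
  const args = ['nbconvert', '--to', format];
  if (options.hideCode) args.push('--no-input');
  args.push(notebookPath);
  return args;
}

// Equivalent command line: jupyter nbconvert --to html --no-input analysis.ipynb
const htmlArgs = nbconvertArgs('html', 'analysis.ipynb', { hideCode: true });
```

In a Node.js backend the array could be passed to `child_process.execFile('jupyter', htmlArgs)`. Passing arguments as an array rather than a concatenated shell string avoids quoting problems with user-supplied file names.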
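// Export button handler
// Assumes `generator` (a NotebookGenerator instance) and a
// `downloadBlob(blob, filename)` helper are defined elsewhere.
function handleExport(format: string, notebook: JupyterNotebook) {
  switch (format) {
    case 'ipynb':
      downloadBlob(generator.exportToFile(notebook), 'analysis.ipynb');
      break;
    case 'py': {
      // Braces scope the `const` to this case
      const pythonCode = notebook.cells
        .filter(c => c.cell_type === 'code')
        .map(c => c.source.join(''))
        .join('\n\n');
      downloadBlob(new Blob([pythonCode], { type: 'text/x-python' }), 'analysis.py');
      break;
    }
    default:
      // .html and .pdf depend on nbconvert or a custom renderer,
      // so this client-side handler does not cover them
      throw new Error(`Unsupported export format: ${format}`);
  }
}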
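The `.py` export above keeps only the code cells, so the explanations in the markdown cells are lost. A possible variation, sketched below, writes markdown cells as comments and separates cells with `# %%` markers, a convention that editors such as VS Code and Spyder recognize. `notebookToScript` and `ScriptCell` are hypothetical names introduced for this example.

```typescript
// Hypothetical alternative to the code-only .py export:
// markdown cells become comments, and `# %%` markers separate cells.
interface ScriptCell {
  cell_type: 'markdown' | 'code';
  source: string[];
}

function notebookToScript(cells: ScriptCell[]): string {
  return cells
    .map(cell => {
      // Join the source lines and drop trailing whitespace
      const text = cell.source.join('').replace(/\s+$/, '');
      if (cell.cell_type === 'code') {
        return `# %%\n${text}\n`;
      }
      // Prefix every markdown line so the script stays valid Python
      const commented = text
        .split('\n')
        .map(line => (line ? `# ${line}` : '#'))
        .join('\n');
      return `# %% [markdown]\n${commented}\n`;
    })
    .join('\n');
}
```

The resulting script still runs top to bottom with plain `python analysis.py`, and the comments preserve the reasoning a reviewer needs to audit each step.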
Trust Through Transparency
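The notebook export is your strongest trust-builder. Users can hand the AI's work to a data scientist, who can rerun each cell, check every calculation, and correct anything that is wrong. A result that can be audited is far easier for a skeptical stakeholder to accept than a chart with no visible source.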