36.4 Reproducible "analysis notebooks" from AI output


Why Notebooks?

AI analysis can be a black box. Users see a chart but don't know how it was made. This creates trust issues:

  • "Is this calculation correct?"
  • "Can I verify this with my data scientist?"
  • "How do I reproduce this next month?"

Solution: Export every analysis as a Jupyter notebook (.ipynb) file. This turns the AI from a black box into a draft generator that humans can audit and modify.

The Notebook Artifact

A generated notebook includes:

  1. Markdown cells explaining the logic ("First, I filtered for Q3...")
  2. Code cells with the actual Python code
  3. Outputs (charts and tables) stored with the code cells that produced them
  4. Data loading cell showing how to reproduce with fresh data
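
// notebook-structure.json (abridged: a valid file also needs top-level nbformat, nbformat_minor, and metadata, a metadata object on every cell, and execution_count and outputs on every code cell)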
{
  "cells": [
    {
      "cell_type": "markdown",
      "source": ["# Sales Analysis Report\n", "Generated: 2024-01-15\n"]
    },
    {
      "cell_type": "markdown", 
      "source": ["## Question\n", "\"Show me total sales by region for Q3 2023\""]
    },
    {
      "cell_type": "code",
      "source": [
        "import pandas as pd\n",
        "import matplotlib.pyplot as plt\n",
        "\n",
        "# Load your data\n",
        "df = pd.read_csv('your_data.csv')"
      ]
    },
    {
      "cell_type": "markdown",
      "source": ["## Analysis\n", "Filtering data to Q3 2023 and aggregating by region."]
    },
    {
      "cell_type": "code",
      "source": [
        "# Filter to Q3 2023\n",
        "df['date'] = pd.to_datetime(df['date'])\n",
        "q3_data = df[(df['date'] >= '2023-07-01') & (df['date'] < '2023-10-01')]\n",
        "\n",
        "# Aggregate by region\n",
        "sales_by_region = q3_data.groupby('region')['sales'].sum()"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "# Create visualization\n",
        "plt.figure(figsize=(10, 6))\n",
        "sales_by_region.plot(kind='bar')\n",
        "plt.title('Q3 2023 Sales by Region')\n",
        "plt.ylabel('Total Sales ($)')\n",
        "plt.show()"
      ],
      "outputs": [{"output_type": "display_data", "data": {"image/png": "..."}}]
    }
  ]
}
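
Because the structure above is abridged, it can help to run a quick structural check before offering a notebook for download. The following is a minimal sketch, not a substitute for validating against the official nbformat JSON schema. `findNotebookProblems` is a hypothetical helper name, and it only checks the fields nbformat 4 requires on the notebook and on each cell.

```typescript
// Hypothetical helper: lists structural problems that would make a
// notebook fail nbformat 4 validation. This is not a full schema check.
type Json = Record<string, unknown>;

const isObject = (v: unknown): v is Json =>
  typeof v === 'object' && v !== null && !Array.isArray(v);

function findNotebookProblems(nb: Json): string[] {
  const problems: string[] = [];
  if (nb.nbformat !== 4) problems.push('nbformat must be 4');
  const minor = nb.nbformat_minor;
  if (typeof minor !== 'number') problems.push('nbformat_minor must be a number');
  if (!isObject(nb.metadata)) problems.push('notebook metadata must be an object');

  const cells = nb.cells;
  if (!Array.isArray(cells)) {
    problems.push('cells must be an array');
    return problems;
  }
  // Cell ids became mandatory in nbformat 4.5
  const needsIds = typeof minor === 'number' && minor >= 5;

  cells.forEach((cell: unknown, i: number) => {
    if (!isObject(cell)) {
      problems.push(`cell ${i}: must be an object`);
      return;
    }
    const kind = cell.cell_type;
    if (kind !== 'markdown' && kind !== 'code' && kind !== 'raw') {
      problems.push(`cell ${i}: unknown cell_type`);
    }
    const src = cell.source;
    const sourceOk =
      typeof src === 'string' ||
      (Array.isArray(src) && src.every(line => typeof line === 'string'));
    if (!sourceOk) problems.push(`cell ${i}: source must be a string or string[]`);
    if (!isObject(cell.metadata)) problems.push(`cell ${i}: missing metadata`);
    if (kind === 'code') {
      if (!Array.isArray(cell.outputs)) problems.push(`cell ${i}: code cell missing outputs`);
      const count = cell.execution_count;
      if (count !== null && typeof count !== 'number') {
        problems.push(`cell ${i}: code cell missing execution_count`);
      }
    }
    if (needsIds && typeof cell.id !== 'string') problems.push(`cell ${i}: missing id`);
  });
  return problems;
}
```

An empty result means the required fields are present. It does not guarantee that the Python code inside the cells runs.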

Implementation

// notebook-generator.ts
// nbformat 4 requires `metadata` on every cell, plus `execution_count`
// and `outputs` on every code cell.
interface NotebookCell {
  cell_type: 'markdown' | 'code';
  source: string[];
  metadata: Record<string, unknown>;
  execution_count?: number | null;
  outputs?: unknown[];
}

interface JupyterNotebook {
  nbformat: number;
  nbformat_minor: number;
  metadata: {
    kernelspec: { name: string; display_name: string };
  };
  cells: NotebookCell[];
}

export class NotebookGenerator {
  generateNotebook(
    question: string,
    code: string,
    analysisSteps: string[],
    outputs: { type: string; data: any }[]
  ): JupyterNotebook {
    const cells: NotebookCell[] = [];

    // Title
    cells.push({
      cell_type: 'markdown',
      metadata: {},
      source: [
        '# AI-Generated Analysis\n',
        `Generated: ${new Date().toISOString()}\n`,
        '\n',
        '> Edit this notebook to customize or verify the analysis.\n'
      ]
    });

    // Original question
    cells.push({
      cell_type: 'markdown',
      metadata: {},
      source: [
        '## Original Question\n',
        `"${question}"\n`
      ]
    });

    // Setup cell (not executed yet, so no outputs and a null execution_count)
    cells.push({
      cell_type: 'code',
      metadata: {},
      execution_count: null,
      outputs: [],
      source: [
        '# Setup - modify the path to your data\n',
        'import pandas as pd\n',
        'import numpy as np\n',
        'import matplotlib.pyplot as plt\n',
        '\n',
        'df = pd.read_csv("your_data.csv")  # <- Update this path\n'
      ]
    });

    // Analysis steps with explanation
    cells.push({
      cell_type: 'markdown',
      metadata: {},
      source: ['## Analysis Steps\n', ...analysisSteps.map(s => `- ${s}\n`)]
    });

    // Main analysis code; every line except the last keeps its newline
    const codeLines = code.split('\n');
    cells.push({
      cell_type: 'code',
      metadata: {},
      execution_count: null,
      source: codeLines.map((line, i) => (i < codeLines.length - 1 ? line + '\n' : line)),
      outputs: outputs.map(o => this.formatOutput(o))
    });

    return {
      nbformat: 4,
      // Minor version 4: nbformat 4.5 and later require a unique `id` on
      // every cell, which this generator does not produce.
      nbformat_minor: 4,
      metadata: {
        kernelspec: { name: 'python3', display_name: 'Python 3' }
      },
      cells
    };
  }
  
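  private formatOutput(output: { type: string; data: any }): any {
    if (output.type === 'chart') {
      return {
        output_type: 'display_data',
        data: { 'image/png': output.data }, // base64-encoded PNG
        metadata: {}
      };
    }
    if (output.type === 'table') {
      return {
        output_type: 'execute_result',
        execution_count: null,
        data: { 'text/html': this.tableToHtml(output.data) },
        metadata: {}
      };
    }
    return {
      output_type: 'stream',
      name: 'stdout',
      text: [JSON.stringify(output.data)]
    };
  }

  // Assumption: table data is an array of flat row objects, e.g.
  // [{ region: 'West', sales: 1200 }]. Adapt to your actual table shape.
  private tableToHtml(rows: Record<string, unknown>[]): string {
    const esc = (v: unknown) =>
      String(v).replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
    const cols = rows.length > 0 ? Object.keys(rows[0]) : [];
    const head = cols.map(c => `<th>${esc(c)}</th>`).join('');
    const body = rows
      .map(r => `<tr>${cols.map(c => `<td>${esc(r[c])}</td>`).join('')}</tr>`)
      .join('');
    return `<table><thead><tr>${head}</tr></thead><tbody>${body}</tbody></table>`;
  }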
  
  exportToFile(notebook: JupyterNotebook): Blob {
    const content = JSON.stringify(notebook, null, 2);
    return new Blob([content], { type: 'application/json' });
  }
}
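
nbformat 4.5 and later require every cell to carry a unique `id` (1 to 64 characters drawn from letters, digits, hyphens, and underscores). A generator that declares `nbformat_minor: 5` therefore needs to add one to each cell. The sketch below shows one way to do that as a post-processing step. `withCellIds`, the `IdCell` and `IdNotebook` types, and the `cell-N` naming scheme are illustrative assumptions that mirror the interfaces above rather than part of the original generator.

```typescript
// Illustrative types mirroring the section's NotebookCell and JupyterNotebook
interface IdCell {
  cell_type: 'markdown' | 'code';
  source: string[];
  id?: string;
  metadata?: Record<string, unknown>;
  execution_count?: number | null;
  outputs?: unknown[];
}

interface IdNotebook {
  nbformat: number;
  nbformat_minor: number;
  metadata: Record<string, unknown>;
  cells: IdCell[];
}

// Returns a copy of the notebook in which every cell has a unique id.
// Existing ids are kept; the input notebook is not mutated.
function withCellIds(notebook: IdNotebook): IdNotebook {
  const used: Record<string, true> = {};
  for (const cell of notebook.cells) {
    if (cell.id) used[cell.id] = true;
  }

  let counter = 0;
  const nextId = (): string => {
    let id = `cell-${counter++}`;
    while (used[id]) id = `cell-${counter++}`; // skip ids already taken
    used[id] = true;
    return id;
  };

  const cells = notebook.cells.map(cell => (cell.id ? cell : { ...cell, id: nextId() }));
  // Cell ids are only part of the format from minor version 5 onward
  return { ...notebook, nbformat_minor: Math.max(notebook.nbformat_minor, 5), cells };
}
```

Sequential ids keep repeated exports of the same analysis stable, which makes diffs between exports easier to read. Random ids would work equally well for validation.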

Export Options

Provide multiple export formats:

| Format | Use Case | Implementation |
|--------|----------|----------------|
| .ipynb | Full reproducibility in Jupyter | Native notebook JSON |
| .py | Run as Python script | Extract code cells only |
| .html | Share with non-technical users | nbconvert or custom render |
| .pdf | Print or email reports | nbconvert with LaTeX |
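
// Export button handler
const generator = new NotebookGenerator();

// One possible download helper for the browser (not defined in the original)
function downloadBlob(blob: Blob, filename: string): void {
  const url = URL.createObjectURL(blob);
  const link = document.createElement('a');
  link.href = url;
  link.download = filename;
  link.click();
  URL.revokeObjectURL(url);
}

function handleExport(format: string, notebook: JupyterNotebook) {
  switch (format) {
    case 'ipynb':
      downloadBlob(generator.exportToFile(notebook), 'analysis.ipynb');
      break;
    case 'py': {
      // Code cells only; the markdown explanations are dropped
      const pythonCode = notebook.cells
        .filter(c => c.cell_type === 'code')
        .map(c => c.source.join(''))
        .join('\n\n');
      downloadBlob(new Blob([pythonCode]), 'analysis.py');
      break;
    }
    default:
      // .html and .pdf are not handled here; they need nbconvert or a custom render
      throw new Error(`Unsupported export format: ${format}`);
  }
}
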
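
For the .html row in the table, nbconvert is the more complete route, but it needs a Python environment. Where that is not available, a custom render can be quite small. The following is a rough sketch under several assumptions: `renderNotebookHtml` and `RenderCell` are hypothetical names, markdown is shown as preformatted text rather than rendered, each output value is assumed to be a single string, and only PNG and HTML outputs are handled.

```typescript
// Hypothetical minimal cell shape for rendering
interface RenderCell {
  cell_type: 'markdown' | 'code';
  source: string[];
  outputs?: { data?: Record<string, string> }[];
}

const escapeHtml = (s: string): string =>
  s
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');

// Builds a standalone HTML page from notebook cells
function renderNotebookHtml(cells: RenderCell[]): string {
  const parts = cells.map(cell => {
    const text = escapeHtml(cell.source.join(''));
    if (cell.cell_type === 'markdown') {
      // Shown as plain text; a real renderer would convert markdown to HTML
      return `<section class="md"><pre>${text}</pre></section>`;
    }
    const outputs = (cell.outputs ?? []).map(out => {
      const png = out.data?.['image/png'];
      if (png) return `<img alt="cell output" src="data:image/png;base64,${png}">`;
      // Inserted unescaped, so only use this with HTML you generated yourself
      return out.data?.['text/html'] ?? '';
    });
    return `<section class="code"><pre><code>${text}</code></pre>${outputs.join('')}</section>`;
  });
  return `<!DOCTYPE html><html><body>${parts.join('\n')}</body></html>`;
}
```

The resulting string can be wrapped in a Blob and downloaded like the other formats. Because HTML outputs are inserted as-is, this approach only suits notebooks whose outputs your own application produced.
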
Trust Through Transparency

The notebook export is likely your strongest trust-builder. Users can hand the AI's work to a data scientist, who can rerun the code and check every calculation instead of taking a chart on faith.
