Claude Code for Data Science: Jupyter and Notebook Workflows

Claude Code works with Jupyter notebooks through two paths: prompting from the CLI to generate or revise cell code, and direct .ipynb file editing that treats notebook JSON as structured data. To run cells, you still use Jupyter Lab/Notebook or jupyter nbconvert --execute. For most data science workflows, the most effective pattern is to run Claude Code in the terminal alongside an open notebook: Claude generates code, you paste and run it, then iterate. This guide covers the patterns that work for EDA, visualization, and model development.

Setup: CLAUDE.md for Data Science Projects

# data-project CLAUDE.md

## Environment
- Python 3.12, Jupyter Lab
- Package manager: uv or pip
- Key packages: pandas, numpy, matplotlib, seaborn, scikit-learn, polars

## Data Conventions
- Raw data: data/raw/ (read-only, never modified)
- Processed: data/processed/
- Outputs: data/outputs/ (figures, reports)
- Column naming: snake_case
- Date columns: always parse as datetime, store UTC

## Code Style
- Type hints on functions
- Docstrings on public functions
- No magic numbers — name your constants
- DataFrames: prefer method chaining over intermediate variables

## Notebook Conventions
- First cell: imports only
- Second cell: configuration/constants
- Each analysis section: one markdown cell explaining what/why, then code cell(s)
- Save figures: always save to data/outputs/ in addition to displaying

## Testing
- Tests for data transformation functions: tests/
- Use pytest
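
The notebook conventions above can also be baked into a starter file. The following is a minimal sketch, assuming the nbformat 4 JSON layout; the helper names `make_cell` and `make_skeleton` and the section titles are hypothetical and not part of Claude Code or Jupyter.

```python
import json


def make_cell(cell_type: str, source: str) -> dict:
    """Build one nbformat-4 style cell dictionary."""
    cell = {"cell_type": cell_type, "metadata": {}, "source": source}
    if cell_type == "code":
        # Code cells additionally carry an execution counter and outputs.
        cell["execution_count"] = None
        cell["outputs"] = []
    return cell


def make_skeleton(sections: list[str]) -> dict:
    """Return a notebook dict that follows the conventions listed above."""
    cells = [
        # First cell: imports only.
        make_cell("code", "import pandas as pd\nimport matplotlib.pyplot as plt"),
        # Second cell: configuration/constants.
        make_cell("code", "RAW_DIR = 'data/raw/'\nOUTPUT_DIR = 'data/outputs/'"),
    ]
    for title in sections:
        # Each section: one markdown cell (what/why), then a code cell.
        cells.append(make_cell("markdown", f"## {title}\n\nWhat this section does and why."))
        cells.append(make_cell("code", ""))
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


notebook = make_skeleton(["Missing values", "Monthly revenue"])
skeleton_json = json.dumps(notebook, indent=1)
```

Writing `skeleton_json` to a file with an .ipynb extension should give Claude Code a consistent structure to fill in, though it is worth opening the file in Jupyter once to confirm it loads in your environment.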

Pattern 1: EDA (Exploratory Data Analysis)

# Generate a complete EDA notebook for a dataset
claude "Write Jupyter notebook cells for EDA on this dataset:
File: data/raw/sales_2026.csv
Known columns: date, product_id, quantity, price, region, customer_id

Generate cells for:
1. Load and basic info (shape, dtypes, head)
2. Missing value analysis (heatmap + counts)
3. Distribution of numeric columns (histograms)
4. Time series: monthly revenue trend
5. Top 10 products by revenue
6. Regional breakdown (bar chart)

Use seaborn for plots, save each figure to data/outputs/.
Each section: markdown explanation cell + code cell."

Generated output (example of one section):

# Cell: Missing Value Analysis
# Assumes df was loaded in an earlier cell (section 1 of the prompt)
import pandas as pd
import missingno as msno
import matplotlib.pyplot as plt

# Count missing values per column, sorted by percentage missing
missing = df.isnull().sum()
missing_df = pd.DataFrame(
    {'count': missing, 'pct': missing / len(df) * 100}
).sort_values('pct', ascending=False)
print(missing_df[missing_df['count'] > 0])

# Visualize
fig, ax = plt.subplots(figsize=(10, 6))
msno.bar(df, ax=ax, color='steelblue', fontsize=12)
plt.title('Missing Value Distribution')
plt.tight_layout()
plt.savefig('data/outputs/missing_values.png', dpi=150, bbox_inches='tight')
plt.show()
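
For comparison, here is a hand-written sketch of what section 5 of the prompt (top 10 products by revenue) might look like. It is not captured Claude output. It assumes revenue is `quantity * price`, which the dataset description does not confirm, and it uses a tiny in-memory sample in place of data/raw/sales_2026.csv.

```python
import pandas as pd

TOP_N = 10  # named constant, per the CLAUDE.md "no magic numbers" rule

# Tiny stand-in for the real file; column names follow the prompt.
df = pd.DataFrame({
    "product_id": ["A", "B", "A", "C"],
    "quantity": [2, 1, 3, 5],
    "price": [10.0, 60.0, 10.0, 4.0],
})

top_products = (
    df.assign(revenue=lambda d: d["quantity"] * d["price"])  # assumed definition
    .groupby("product_id", as_index=False)["revenue"]
    .sum()
    .sort_values("revenue", ascending=False)
    .head(TOP_N)
)
print(top_products)
```

The method chain follows the CLAUDE.md preference for chaining over intermediate variables; a bar chart of `top_products` would be the natural next cell.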

Pattern 2: Debugging DataFrames

When a DataFrame transformation isn't working:

claude "Debug this pandas code:

df_result = (
    df.groupby(['region', 'month'])
    .agg({'revenue': 'sum', 'quantity': 'sum'})
    .reset_index()
    .pivot(index='month', columns='region', values='revenue')
    .fillna(0)
)

Error: 'DataFrame' object has no attribute 'values' on the pivot step.
DataFrame dtypes: region (object), month (object), revenue (float64), quantity (int64)
Sample: [paste df.head() output]"
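
The dtypes, sample, and error text for a prompt like the one above can be gathered in one step. This is a minimal sketch; `frame_context` is a hypothetical helper name, and the sample DataFrame and error string are illustrative only.

```python
import pandas as pd


def frame_context(df: pd.DataFrame, error: str, rows: int = 5) -> str:
    """Collect the error text, dtypes, and a sample into one pasteable string."""
    parts = [
        "Error: " + error,
        "dtypes:\n" + df.dtypes.to_string(),
        f"head({rows}):\n" + df.head(rows).to_string(),
    ]
    numeric = df.select_dtypes("number")
    if not numeric.empty:
        # Summary statistics help most for numeric columns on large frames.
        parts.append("describe():\n" + numeric.describe().to_string())
    return "\n\n".join(parts)


sample = pd.DataFrame({"region": ["EU", "US"], "revenue": [10.5, 20.0]})
context = frame_context(sample, "KeyError: 'month'")
print(context)
```

Pasting the printed string after the failing code gives Claude the structure and a sample without sharing the full dataset.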

Pattern 3: Visualization Generation

claude "Create a visualization function:
- Input: dataframe with columns [date, category, value]
- Output: subplot grid showing:
  1. Line plot per category over time
  2. Stacked bar chart by month
  3. Correlation heatmap between numeric columns
- Use seaborn theme='whitegrid'
- Save to data/outputs/analysis_[timestamp].png
- Return the figure object"

Pattern 4: Reading and Modifying Existing Notebooks

Claude Code can read .ipynb files directly:

# Read a notebook and suggest improvements
claude "@notebooks/sales_analysis.ipynb

Review this notebook and:
1. Identify any cells with deprecated pandas syntax (use pd.concat instead of append, etc.)
2. Find plots missing axis labels or titles
3. Suggest where to add docstrings
4. Note any magic numbers that should be constants

Don't modify anything yet — just report."

Then apply specific fixes:

claude "@notebooks/sales_analysis.ipynb

Fix all the issues you found:
- Update deprecated pandas syntax
- Add missing axis labels (use descriptive labels from column names)
- Add module-level constants for magic numbers

Show me the changed cells only."
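
Because an .ipynb file is plain JSON, a quick pre-scan can show which cells are worth pointing Claude at. The sketch below assumes the nbformat 4 layout (a top-level "cells" list whose entries have "cell_type" and "source"); `flag_cells` and the pattern list are hypothetical, and substring matching is only a rough heuristic.

```python
import json

# Heuristic patterns only: ".append(" also matches list.append, so treat
# hits as candidates to review, not confirmed problems.
SUSPECT_PATTERNS = [".append(", ".ix["]


def cell_source(cell: dict) -> str:
    """nbformat stores source as either a string or a list of lines."""
    src = cell.get("source", "")
    return "".join(src) if isinstance(src, list) else src


def flag_cells(nb: dict, patterns: list[str]) -> list[tuple[int, str]]:
    """Return (cell index, pattern) pairs for code cells containing a pattern."""
    hits = []
    for index, cell in enumerate(nb.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        text = cell_source(cell)
        hits.extend((index, p) for p in patterns if p in text)
    return hits


# Inline stand-in for a notebook loaded from disk with json.load().
nb = json.loads("""
{"cells": [
  {"cell_type": "markdown", "source": ["# Sales"]},
  {"cell_type": "code", "source": ["df = df.append(extra)\\n", "df.head()"]},
  {"cell_type": "code", "source": "df.plot()"}
 ], "nbformat": 4, "nbformat_minor": 4}
""")
flags = flag_cells(nb, SUSPECT_PATTERNS)
print(flags)
```

The indexes are zero-based positions in the cell list, which may differ from the execution counts shown in the Jupyter UI.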

Pattern 5: Model Evaluation Code

claude "Write model evaluation cells for a binary classification problem.
Model: sklearn LogisticRegression (already trained as 'model')
Test data: X_test, y_test (already defined)

Generate cells for:
1. Predictions and probability scores
2. Classification report (precision, recall, F1)
3. Confusion matrix heatmap
4. ROC curve with AUC
5. Precision-Recall curve
6. Feature importance (coefficients) bar chart

Each metric: explain what it means in one markdown sentence.
All plots: save to data/outputs/model_eval/"

Using Claude Code CLI with Jupyter

# Run notebook non-interactively
jupyter nbconvert --to notebook --execute notebooks/analysis.ipynb

# Claude can help debug failed executions
claude "This notebook failed during execution:
Error: KeyError: 'segment' in cell 15
The column was renamed from 'segment' to 'customer_segment' in preprocessing.
Fix all references in the notebook."

# Generate a report from notebook output
claude "Convert the outputs from notebooks/analysis.ipynb into a
markdown report summary. Extract: key metrics, main findings,
and embed the saved figure paths as image references."
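
To find the failing cell and error text for a debugging prompt like the one above, the executed notebook can be inspected as JSON. This sketch assumes the notebook was run with nbconvert's --allow-errors flag, so that error outputs are saved instead of the run aborting, and that it follows the nbformat 4 layout; `first_error` is a hypothetical helper.

```python
import json


def first_error(nb: dict):
    """Return (cell index, error name, error value) for the first failed cell, or None."""
    for index, cell in enumerate(nb.get("cells", [])):
        for output in cell.get("outputs", []):
            if output.get("output_type") == "error":
                return index, output.get("ename"), output.get("evalue")
    return None


# Inline stand-in for an executed notebook loaded with json.load().
executed = json.loads("""
{"cells": [
  {"cell_type": "code", "source": "import pandas as pd", "outputs": []},
  {"cell_type": "code", "source": "df['segment']",
   "outputs": [{"output_type": "error", "ename": "KeyError",
                "evalue": "'segment'", "traceback": []}]}
 ], "nbformat": 4, "nbformat_minor": 4}
""")
failure = first_error(executed)
print(failure)
```

The returned index is the cell's zero-based position in the file, so quote the cell's source alongside it when writing the prompt.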

Frequently Asked Questions

Can Claude Code run notebook cells directly? Claude Code can read .ipynb files and generate/modify cell content. To actually run cells, you still use Jupyter Lab/Notebook or jupyter nbconvert --execute. Claude generates the code, you run it.

How do I share DataFrame context with Claude for debugging? Include df.dtypes, df.head(), and the exact error message. For large DataFrames, also include df.describe() for numeric columns. Claude needs the structure and a sample to debug effectively.

Is Claude better at pandas or polars? Claude generally produces stronger pandas code, likely because pandas is more heavily represented in its training data. Polars is newer, but Claude Code handles it reasonably well when CLAUDE.md states the preference, for example: "Use polars for all DataFrame operations, not pandas."

What's the best way to iterate on visualizations with Claude? Describe what you want, generate the code, run it, then describe the rendered result back to Claude: "The bars are too narrow and the x-axis labels overlap. Fix spacing and rotate labels 45°." Claude iterates well on visualization details.

Related Guides

Claude Code Complete Guide — Full reference
Claude Code for Backend: Python, Go, Node.js — Python backend patterns
Context Engineering for Claude — CLAUDE.md optimization

Go Deeper

Power Prompts 300 — $29 — 20 data science prompts: full EDA workflows, model evaluation templates, visualization generation patterns, and the data science CLAUDE.md template.

30-day money-back guarantee. Instant download.
