How to Use Claude for Data Analysis: Practical Guide
Claude is most useful for data analysis when you paste the data directly into the conversation and ask specific questions — not "analyse this", but "what's the trend in column X", "which customers churned in Q3", or "write Python to calculate the 30-day moving average." Claude can read CSVs, interpret summaries, write analysis code, explain statistical outputs, and generate visualisation code. This guide covers the practical patterns that get useful results.
Pattern 1: Paste CSV data for direct analysis
For datasets under 10,000 rows, paste directly:
Here's my sales data from Q1:
date,product,revenue,units
2026-01-02,Widget A,1250,50
2026-01-02,Widget B,890,35
2026-01-03,Widget A,1100,44
...
1. What's the total revenue by product?
2. Which day had the highest sales?
3. What's the week-over-week growth trend?
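Claude reads the CSV headers, understands the data structure, and answers analytical questions directly. For straightforward aggregations and trends on small datasets, no code is needed; for exact figures across thousands of rows, ask for code and run it so the arithmetic can be checked.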
For larger datasets: paste a sample (first 20 rows + data description) and ask Claude to write pandas code you run yourself.
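When you go the code route, the three questions above might translate into pandas roughly as follows. This is a minimal sketch, assuming the column names from the sample CSV (date, product, revenue, units). The last two data rows are made-up placeholders added so the weekly comparison has something to compare; in practice you would load your own file with pd.read_csv.

```python
import io
import pandas as pd

# Placeholder rows in the same shape as the sample above.
# The 2026-01-09 rows are invented for illustration only.
CSV_TEXT = """date,product,revenue,units
2026-01-02,Widget A,1250,50
2026-01-02,Widget B,890,35
2026-01-03,Widget A,1100,44
2026-01-09,Widget A,1400,56
2026-01-09,Widget B,950,38
"""

df = pd.read_csv(io.StringIO(CSV_TEXT), parse_dates=["date"])

# 1. Total revenue by product
revenue_by_product = df.groupby("product")["revenue"].sum()

# 2. Day with the highest total sales
daily_revenue = df.groupby("date")["revenue"].sum()
best_day = daily_revenue.idxmax()

# 3. Week-over-week growth: weekly totals (weeks ending Sunday), then percentage change
weekly_revenue = daily_revenue.resample("W").sum()
wow_growth = weekly_revenue.pct_change()

print(revenue_by_product)
print("Best day:", best_day.date())
print(wow_growth)
```

The first week has no prior week to compare against, so its growth value is NaN.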
Pattern 2: Ask Claude to write analysis code
For repeatable analysis or large datasets:
I have a pandas DataFrame called `df` with these columns:
- user_id (int)
- signup_date (datetime)
- plan (str: 'free', 'starter', 'pro')
- monthly_revenue (float)
- last_active_date (datetime)
- churned (bool)
Write Python code to:
1. Calculate churn rate by plan
2. Find the median time-to-churn for churned users
3. Identify the cohort month (based on signup_date) with the highest churn rate
Given a schema like this, Claude typically writes pandas code that handles nulls, datetime parsing, and groupby operations sensibly. Review the code and check it against a few rows you can verify by hand, then run it.
Best practice: specify the exact DataFrame schema in your prompt. Claude's code quality is much higher when it knows the column names and types.
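For reference, the kind of code this prompt tends to produce looks something like the sketch below. It assumes the schema listed in the prompt and, since the schema has no explicit churn date, it treats last_active_date as the churn date. The function name churn_summary and the five sample rows are hypothetical, included only so the example runs on its own.

```python
import pandas as pd

def churn_summary(df: pd.DataFrame) -> dict:
    """Summarise churn for a DataFrame with the schema described above."""
    df = df.copy()
    df["signup_date"] = pd.to_datetime(df["signup_date"])
    df["last_active_date"] = pd.to_datetime(df["last_active_date"])

    # 1. Churn rate by plan: the mean of a boolean column is the share of True values
    churn_by_plan = df.groupby("plan")["churned"].mean()

    # 2. Median time-to-churn, ASSUMING last_active_date approximates the churn date
    churned = df[df["churned"]]
    days_to_churn = (churned["last_active_date"] - churned["signup_date"]).dt.days
    median_days = days_to_churn.median()

    # 3. Signup-month cohort with the highest churn rate
    cohort = df["signup_date"].dt.to_period("M")
    churn_by_cohort = df.groupby(cohort)["churned"].mean()
    worst_cohort = churn_by_cohort.idxmax()

    return {
        "churn_by_plan": churn_by_plan,
        "median_days_to_churn": median_days,
        "worst_cohort": worst_cohort,
    }

# Hypothetical sample rows, for illustration only
sample = pd.DataFrame({
    "user_id": [1, 2, 3, 4, 5],
    "signup_date": ["2025-07-05", "2025-07-20", "2025-08-03", "2025-08-15", "2025-08-28"],
    "plan": ["free", "pro", "free", "starter", "starter"],
    "monthly_revenue": [0.0, 99.0, 0.0, 29.0, 29.0],
    "last_active_date": ["2025-08-19", "2026-01-10", "2025-09-02", "2025-12-13", "2026-01-12"],
    "churned": [True, False, True, True, False],
})

result = churn_summary(sample)
print(result["churn_by_plan"])
print("Median days to churn:", result["median_days_to_churn"])
print("Worst cohort:", result["worst_cohort"])
```

If your data records an actual cancellation date, tell Claude so in the prompt; the time-to-churn calculation should use that column instead.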
Pattern 3: Interpret statistical output
After running analysis code, paste the output back for interpretation:
Here's the output from my churn analysis:
Plan Churn Rate Median Days to Churn N
free 0.34 45 2,840
starter 0.18 120 1,230
pro 0.08 280 450
The overall cohort analysis shows August 2025 had a 40% churn rate vs.
the baseline of 18–22%.
What are the likely explanations for the August spike?
What data would I need to confirm or rule out each hypothesis?
Claude interprets statistical patterns, suggests causal hypotheses, and identifies what additional data would confirm them.
Pattern 4: Generate visualisation code
Using the churn DataFrame described above, write matplotlib/seaborn code that produces:
- A line chart showing monthly churn rate over time
- A bar chart comparing churn rate by plan
Use a professional-looking style with clear labels and titles.
I'm using Python 3.12 with matplotlib 3.8 and seaborn 0.13.
Specify your library versions so Claude can target the API you actually have installed and steer clear of deprecated functions.
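A response to this prompt could resemble the following sketch. It uses plain matplotlib (no seaborn) to keep dependencies small, assumes the churn schema from Pattern 2, and interprets "monthly churn rate" as churn rate by signup month, which is an assumption because the schema has no churn date. The function name plot_churn and the sample rows are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line for interactive windows
import matplotlib.pyplot as plt
import pandas as pd

def plot_churn(df: pd.DataFrame):
    """Return a figure with churn rate by signup month (line) and by plan (bar)."""
    # ASSUMPTION: "monthly" means the month a user signed up
    monthly = df.groupby(df["signup_date"].dt.to_period("M"))["churned"].mean()
    by_plan = df.groupby("plan")["churned"].mean().sort_values(ascending=False)

    fig, (ax_line, ax_bar) = plt.subplots(1, 2, figsize=(11, 4))

    # Line chart: churn rate over time
    ax_line.plot(monthly.index.to_timestamp(), monthly.values, marker="o")
    ax_line.set_title("Churn rate by signup month")
    ax_line.set_xlabel("Signup month")
    ax_line.set_ylabel("Churn rate")
    ax_line.set_ylim(0, 1)
    ax_line.tick_params(axis="x", labelrotation=45)

    # Bar chart: churn rate by plan
    ax_bar.bar(list(by_plan.index), by_plan.values)
    ax_bar.set_title("Churn rate by plan")
    ax_bar.set_xlabel("Plan")
    ax_bar.set_ylabel("Churn rate")
    ax_bar.set_ylim(0, 1)

    fig.tight_layout()
    return fig

# Hypothetical sample rows, for illustration only
sample = pd.DataFrame({
    "signup_date": pd.to_datetime(
        ["2025-07-05", "2025-07-20", "2025-08-03", "2025-08-15", "2025-08-28"]
    ),
    "plan": ["free", "pro", "free", "starter", "starter"],
    "churned": [True, False, True, True, False],
})

fig = plot_churn(sample)
```

Calling fig.savefig("churn.png", dpi=150) afterwards writes the charts to disk.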
Pattern 5: SQL query writing
For database analysis:
I have a PostgreSQL database with these tables:
users (id, email, created_at, plan, country)
events (id, user_id, event_name, properties, created_at)
subscriptions (id, user_id, plan, started_at, ended_at, mrr)
Write a SQL query to:
Find users who signed up in January 2026, upgraded from free to starter
within 30 days of signup, and then upgraded to pro within 90 days.
Show their signup date, first upgrade date, second upgrade date, and current MRR.
Claude writes complex joins, window functions, and date arithmetic reliably when given the full schema.
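To make the request concrete, here is one way such a query could look, wrapped in Python and run against an in-memory SQLite database so it can be tried without a server. Several things here are assumptions rather than part of the original schema: dates are stored as ISO text, every user starts on the free plan, the 90-day window is measured from signup, a NULL ended_at marks the current subscription, and all sample rows are invented. The events table is not needed for this question and is left out. In PostgreSQL the julianday() differences would become interval comparisons such as s.starter_at - u.created_at <= INTERVAL '30 days'.

```python
import sqlite3

QUERY = """
WITH jan_signups AS (
    SELECT id, created_at
    FROM users
    WHERE created_at >= '2026-01-01' AND created_at < '2026-02-01'
),
first_starter AS (
    SELECT user_id, MIN(started_at) AS starter_at
    FROM subscriptions WHERE plan = 'starter' GROUP BY user_id
),
first_pro AS (
    SELECT user_id, MIN(started_at) AS pro_at
    FROM subscriptions WHERE plan = 'pro' GROUP BY user_id
),
current_sub AS (
    -- ASSUMPTION: the active subscription is the row with no end date
    SELECT user_id, mrr FROM subscriptions WHERE ended_at IS NULL
)
SELECT u.id, u.created_at AS signup_date, s.starter_at, p.pro_at, c.mrr AS current_mrr
FROM jan_signups u
JOIN first_starter s ON s.user_id = u.id
JOIN first_pro p ON p.user_id = u.id
LEFT JOIN current_sub c ON c.user_id = u.id
WHERE julianday(s.starter_at) - julianday(u.created_at) <= 30
  AND p.pro_at > s.starter_at
  AND julianday(p.pro_at) - julianday(u.created_at) <= 90
ORDER BY u.id;
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, created_at TEXT,
                    plan TEXT, country TEXT);
CREATE TABLE subscriptions (id INTEGER PRIMARY KEY, user_id INTEGER, plan TEXT,
                            started_at TEXT, ended_at TEXT, mrr REAL);
""")

# Hypothetical rows: only user 1 meets every condition
conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?, ?)", [
    (1, "a@example.com", "2026-01-05", "pro", "US"),
    (2, "b@example.com", "2026-01-10", "starter", "DE"),  # never reaches pro
    (3, "c@example.com", "2026-01-15", "pro", "FR"),      # starter after 45 days
    (4, "d@example.com", "2025-12-20", "pro", "US"),      # signed up in December
])
conn.executemany("INSERT INTO subscriptions VALUES (?, ?, ?, ?, ?, ?)", [
    (1, 1, "free", "2026-01-05", "2026-01-20", 0),
    (2, 1, "starter", "2026-01-20", "2026-03-01", 29),
    (3, 1, "pro", "2026-03-01", None, 99),
    (4, 2, "free", "2026-01-10", "2026-01-25", 0),
    (5, 2, "starter", "2026-01-25", None, 29),
    (6, 3, "free", "2026-01-15", "2026-03-01", 0),
    (7, 3, "starter", "2026-03-01", "2026-03-20", 29),
    (8, 3, "pro", "2026-03-20", None, 99),
    (9, 4, "starter", "2025-12-25", "2026-01-15", 29),
    (10, 4, "pro", "2026-01-15", None, 99),
])

rows = conn.execute(QUERY).fetchall()
print(rows)
```

Ambiguities like "90 days from what?" are worth stating in the prompt, since they change the WHERE clause.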
Claude's data analysis capabilities
Strong:
- Aggregations and groupings (sum, count, average by category)
- Time series analysis (trends, seasonality, moving averages)
- Cohort analysis (retention, churn by cohort)
- Statistical interpretation (explaining p-values, confidence intervals, correlation)
- Code generation (pandas, SQL, R, Excel formulas)
Moderate:
- Pattern recognition in complex multi-dimensional data
- Causal inference and A/B test interpretation
- Forecasting (can write code for ARIMA/Prophet, but won't run it)
Weak:
- Running code and observing output directly (unless via Claude Code or computer use)
- Very large datasets (paste samples + have Claude write code instead)
- Domain-specific statistical methods not well-represented in training data
Asking the right questions
Too vague (produces generic analysis):
Analyse this sales data and tell me what you find.
Specific and useful:
For this sales data:
1. Is the decline in Widget A revenue driven by fewer units sold, lower price,
or a mix? Show the calculation.
2. Which sales rep had the best quarter-over-quarter growth from Q4 to Q1?
3. If current trends continue, what's the projected Q2 revenue?
Show your calculation method.
The more specific your questions, the more actionable the analysis.
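The first question above asks for a units-versus-price breakdown, and it helps to know what a sound answer looks like. One common approach splits the revenue change into a volume effect (change in units at the old price) and a price effect (change in price at the new volume), which together sum exactly to the total change. The sketch below uses hypothetical figures and a hypothetical function name; other decompositions that report a separate interaction term are equally valid.

```python
def decompose_revenue_change(units_before, price_before, units_after, price_after):
    """Split a revenue change into volume and price effects.

    volume_effect + price_effect always equals the total change:
    (Q1 - Q0) * P0 + (P1 - P0) * Q1 = P1 * Q1 - P0 * Q0
    """
    volume_effect = (units_after - units_before) * price_before
    price_effect = (price_after - price_before) * units_after
    total_change = units_after * price_after - units_before * price_before
    return {
        "volume_effect": volume_effect,
        "price_effect": price_effect,
        "total_change": total_change,
    }

# Hypothetical figures: 1,000 units at 25.00 last period, 900 units at 24.00 this period
result = decompose_revenue_change(1000, 25.0, 900, 24.0)
print(result)
```

With these figures, most of the 3,400 decline comes from selling fewer units (2,500) and the rest from the lower price (900).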
Data privacy considerations
For sensitive data (PII, financial records, health data):
- Anonymise before pasting: replace names with IDs, generalise ages to buckets
- Consider whether you're allowed to share data with a third-party AI service
- For highly sensitive data, have Claude write analysis code you run locally instead of pasting the data
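The first step in the list above can be done in a few lines before anything is pasted. The following is a minimal sketch, assuming a DataFrame with hypothetical name and age columns; the bucket boundaries, the person_id format, and the sample rows are illustrative choices. Replacing names with IDs is pseudonymisation only, so other columns may still need generalising.

```python
import pandas as pd

def anonymise(df: pd.DataFrame, name_col: str = "name", age_col: str = "age") -> pd.DataFrame:
    """Replace names with opaque IDs and exact ages with age bands."""
    out = df.copy()

    # Same name always maps to the same ID, so per-person analysis still works
    codes, _ = pd.factorize(out[name_col])
    out["person_id"] = [f"P{code + 1:04d}" for code in codes]

    # Generalise exact ages into buckets; right=False makes each bin [low, high)
    bins = [0, 18, 30, 45, 60, 200]
    labels = ["<18", "18-29", "30-44", "45-59", "60+"]
    out["age_band"] = pd.cut(out[age_col], bins=bins, labels=labels, right=False)

    # Drop the original identifying columns
    return out.drop(columns=[name_col, age_col])

# Hypothetical sample rows, for illustration only
raw = pd.DataFrame({
    "name": ["Alice Smith", "Bob Jones", "Alice Smith"],
    "age": [29, 61, 29],
    "spend": [120.0, 80.5, 43.0],
})

safe = anonymise(raw)
print(safe)
```

Keep the name-to-ID mapping locally if you need to join Claude's findings back to real records.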
Frequently asked questions
Can Claude access my database directly? Via the API with custom tools, yes. Claude can call a database query function you define. Via the claude.ai chat interface, no — you need to paste data or results. For production data analysis automation, build a Claude agent with database access.
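How much data can I paste into Claude? Up to the model's context window, which is 200,000 tokens for many Claude models and is shared between your prompt and the response. That's roughly 150,000 words or about 10,000–20,000 rows of typical CSV data, depending on how many columns each row has. For larger datasets, paste a sample and have Claude write code you run on the full dataset.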
Is Claude better than dedicated BI tools (Tableau, Looker)? Different tools for different purposes. BI tools are better for persistent dashboards, scheduled reports, and sharing with non-technical stakeholders. Claude is better for exploratory analysis ("I have a question and need to dig into the data now"), writing analysis code, and interpreting complex patterns.
Can Claude write Excel formulas? Yes, including complex nested formulas. Describe what you want in plain English: "I need a formula in column F that calculates the running total of column C only for rows where column B = 'Active'". Claude writes the exact Excel formula.
What programming languages can Claude write analysis code in? Python (pandas, numpy, scipy, matplotlib, seaborn, plotly), R (dplyr, ggplot2, tidyr), SQL (PostgreSQL, MySQL, SQLite, BigQuery, Snowflake syntax), and others including Julia and MATLAB for more specialised use cases.
Related guides
- Claude JSON Structured Output: Getting Reliable JSON Every Time — structured data extraction from documents
- Claude Agent SDK: Build Your First Agent in 30 Minutes — building automated data analysis agents