Data Visualization · Day 2 of 5 · ~40 minutes

Day 2: Seaborn: Statistical Visualization Made Simple

Seaborn builds on matplotlib and produces polished statistical charts with less code. This lesson covers six chart types that handle most everyday statistical visualization tasks.

What You'll Build

Six seaborn charts on the built-in penguins dataset: a histogram, box plot, correlation heatmap, violin plot, scatter plot with regression line, and count plot. You'll then apply the same approach to real data from a CSV file.

Section 1 · 8 min

Why Seaborn Over Pure Matplotlib

Seaborn doesn't replace matplotlib; it builds on it. The value: a statistical chart that takes dozens of lines of matplotlib (grouping, binning, coloring, and legends by hand) often takes a single seaborn call, with cleaner default styling.

python · seaborn_setup.py
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

# Set the default theme — all charts look this good
sns.set_theme(style="whitegrid", palette="muted")

# Load a built-in seaborn dataset for practice
df = sns.load_dataset("penguins")
print(df.head())
Section 2 · 17 min

Six Essential Seaborn Charts

python · seaborn_charts.py
import seaborn as sns
import matplotlib.pyplot as plt

df = sns.load_dataset("penguins").dropna()
sns.set_theme(style="whitegrid")

# One 2x3 grid of axes holds all six charts
fig, axes = plt.subplots(2, 3, figsize=(15, 10))

# 1. Distribution plot
sns.histplot(data=df, x="flipper_length_mm", hue="species", ax=axes[0,0])
axes[0,0].set_title("Distribution by Species")

# 2. Box plot
sns.boxplot(data=df, x="species", y="body_mass_g", ax=axes[0,1])
axes[0,1].set_title("Body Mass by Species")

# 3. Heatmap
corr = df.select_dtypes("number").corr()
# Correlations range from -1 to 1, so use a diverging colormap centered on 0
sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm", vmin=-1, vmax=1, ax=axes[0,2])
axes[0,2].set_title("Correlation Heatmap")

# 4. Violin plot
sns.violinplot(data=df, x="species", y="bill_length_mm", ax=axes[1,0])
axes[1,0].set_title("Bill Length Distribution")

# 5. Scatter with regression
sns.regplot(data=df, x="flipper_length_mm", y="body_mass_g", ax=axes[1,1])
axes[1,1].set_title("Flipper vs Mass")

# 6. Count plot
sns.countplot(data=df, x="species", hue="island", ax=axes[1,2])
axes[1,2].set_title("Species by Island")

plt.tight_layout()
plt.savefig("seaborn_charts.png", dpi=150, bbox_inches="tight")
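
Because each function used above draws onto a matplotlib Axes, the usual matplotlib methods still apply afterwards. A minimal sketch on a made-up DataFrame (the column names `kind` and `score` are illustrative assumptions):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up data for illustration
toy = pd.DataFrame({
    "kind": ["x", "x", "x", "y", "y", "y"],
    "score": [1.0, 2.0, 3.0, 4.0, 6.0, 8.0],
})

fig, ax = plt.subplots(figsize=(5, 4))

# Axes-level seaborn functions return the Axes they drew on
returned = sns.boxplot(data=toy, x="kind", y="score", ax=ax)

# From here on it is plain matplotlib
returned.set_xlabel("Kind")
returned.set_ylabel("Score (arbitrary units)")
returned.axhline(toy["score"].median(), linestyle="--", color="gray")
fig.tight_layout()
```

This is the same pattern as the `axes[0,1].set_title(...)` calls above: seaborn draws the chart, matplotlib handles the finishing touches.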
Section 3 · 15 min

Working with Real Data

The built-in datasets are for practice. Real work uses CSVs. Here's how to load real data and handle the most common issues:

python · real_data.py
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Load your CSV
df = pd.read_csv("your_data.csv", parse_dates=["date_col"])

# Quick overview before plotting
print(df.describe())    # statistical summary
print(df.isnull().sum())  # missing values per column

# Drop missing for plotting (or fill with median)
df_clean = df.dropna(subset=["key_column"])

# Plot the distribution of your most important metric
fig, ax = plt.subplots(figsize=(10, 6))
sns.histplot(data=df_clean, x="key_column", ax=ax)
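ax.set_title("Distribution of Key Metric")

# Render the figure (use fig.savefig(...) instead when running headless)
plt.tight_layout()
plt.show()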
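
The comment above mentions two ways to handle missing values: dropping the rows or filling them with the median. A small sketch of both on made-up data; `key_column` mirrors the placeholder name used above, and the values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Made-up data with two missing values and one outlier
toy = pd.DataFrame({"key_column": [10.0, 12.0, np.nan, 14.0, 100.0, np.nan]})

# Option 1: drop rows where the plotted column is missing
dropped = toy.dropna(subset=["key_column"])

# Option 2: fill gaps with the median, which is less sensitive
# to outliers (like the 100.0 here) than the mean
median = toy["key_column"].median()
filled = toy.fillna({"key_column": median})

print(len(dropped), len(filled))
print(median)
```

Filling keeps every row but adds an artificial spike at the median in a histogram, so dropping is often the safer choice when the goal is to look at a distribution.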

What You Learned Today

  • How seaborn's set_theme() applies a consistent style and palette to every chart in a script
  • The six seaborn chart types and when to use each: histplot, boxplot, heatmap, violinplot, regplot, countplot
  • How the hue parameter adds a categorical variable as color in plots that support it, such as histplot and countplot
  • How to load real CSV data and handle missing values before plotting
Your Challenge

Go Further on Your Own

  • Load a dataset you actually use at work (or download one from Kaggle). Build a seaborn pair plot using sns.pairplot() — what correlations do you see?
  • Build a correlation heatmap for your real dataset. Are the correlations you'd expect there? Any surprises?
  • Use sns.FacetGrid to create a grid of the same plot broken down by a categorical variable
Day 2 Complete

Nice work. Keep going.

Day 3 is ready when you are.