Introduction
Seaborn is a Python data visualization library built on top of Matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.
It emphasizes statistical relationships and data distributions, which makes it well suited to exploratory analysis and to producing publication-quality charts with little code.
Installation
# Using pip
pip install seaborn
# Using conda
conda install seaborn
Seaborn depends on NumPy, pandas, and Matplotlib; both installers pull these in automatically if they are missing.
Basic Plots
import seaborn as sns
import matplotlib.pyplot as plt
tips = sns.load_dataset("tips")
# Scatter plot
sns.scatterplot(x="total_bill", y="tip", data=tips)
plt.show()
# Histogram
sns.histplot(tips["total_bill"], bins=20)
plt.show()
Pass hue and style to color and mark points by a categorical column, which adds grouping and improves interpretability.
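As a minimal sketch of grouping with hue and style: the small `demo` frame below is a hand-made stand-in for the tips dataset (its values are illustrative, not real data), so the example runs without downloading anything.

```python
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the "tips" dataset; values are made up for illustration
demo = pd.DataFrame({
    "total_bill": [10.3, 21.0, 15.8, 30.4, 18.2, 25.6],
    "tip": [1.7, 3.5, 2.0, 5.0, 3.0, 4.1],
    "time": ["Lunch", "Dinner", "Lunch", "Dinner", "Lunch", "Dinner"],
    "smoker": ["No", "No", "Yes", "Yes", "No", "Yes"],
})

# hue colors the points by one column; style varies the marker by another
ax = sns.scatterplot(data=demo, x="total_bill", y="tip", hue="time", style="smoker")
ax.set_title("Tip vs. total bill")
```

Call plt.show() afterwards to display the figure. The legend lists the levels of both grouping columns.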
Categorical Plots
# Boxplot
sns.boxplot(x="day", y="total_bill", data=tips)
plt.show()
# Violin plot
sns.violinplot(x="day", y="total_bill", data=tips)
plt.show()
# Bar plot
sns.barplot(x="day", y="total_bill", data=tips)
plt.show()
Box and violin plots summarize the distribution within each category. Bar plots show an aggregate per category (the mean by default), with error bars for a bootstrapped 95% confidence interval.
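To make the aggregation concrete, the sketch below compares the bar heights with group means computed in pandas. The `demo` frame is a hypothetical stand-in for the tips dataset with made-up values.

```python
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the "tips" dataset; values are made up for illustration
demo = pd.DataFrame({
    "day": ["Thur", "Thur", "Fri", "Fri", "Sat", "Sat"],
    "total_bill": [12.0, 18.0, 20.0, 26.0, 15.0, 35.0],
})

# The aggregate that barplot computes by default: the mean per category
means = demo.groupby("day", sort=False)["total_bill"].mean()

# Draw the bar plot and read the bar heights back from the axes
ax = sns.barplot(data=demo, x="day", y="total_bill")
heights = [patch.get_height() for patch in ax.patches]
```

Each value in `means` should appear among `heights`, which confirms that the bars are a summary rather than the raw data.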
Regression Plots
# Linear regression
sns.lmplot(x="total_bill", y="tip", data=tips)
plt.show()
# Residual plot
sns.residplot(x="total_bill", y="tip", data=tips)
plt.show()
Regression plots help diagnose trends and variance; residual plots expose model misfit and non-linearity.
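For intuition about what a residual plot displays, the sketch below fits a straight line with NumPy and computes the residuals by hand before drawing the Seaborn version. The `demo` values are invented for illustration, and the NumPy fit is an equivalent ordinary least squares calculation, not a claim about Seaborn's internals.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data; values are made up for illustration
demo = pd.DataFrame({
    "total_bill": [10.0, 15.0, 20.0, 25.0, 30.0, 35.0],
    "tip": [1.5, 2.6, 2.9, 4.2, 4.4, 5.9],
})

# Fit tip = slope * total_bill + intercept by ordinary least squares
slope, intercept = np.polyfit(demo["total_bill"], demo["tip"], deg=1)

# Residual = observed value minus fitted value
residuals = demo["tip"] - (slope * demo["total_bill"] + intercept)

# residplot draws the residuals of a linear fit against x
ax = sns.residplot(data=demo, x="total_bill", y="tip")
```

Residuals scattered evenly around zero suggest a linear model is adequate; a curve or a funnel shape points to non-linearity or non-constant variance.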
Matrix Plots
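# Reshape long-form data into a month-by-year grid (pandas 2.0+ requires keyword arguments here)
flights = sns.load_dataset("flights").pivot(index="month", columns="year", values="passengers")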
# Heatmap
sns.heatmap(flights, annot=True, fmt="d", cmap="YlGnBu")
plt.show()
Heatmaps are excellent for correlation matrices and time-series grids.
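A correlation-matrix heatmap might be sketched as follows. The `demo` frame is a hypothetical stand-in with made-up values, and fixing the color scale to the range -1 to 1 is one reasonable choice rather than a requirement.

```python
import pandas as pd
import seaborn as sns

# Hypothetical numeric data; values are made up for illustration
demo = pd.DataFrame({
    "total_bill": [10.0, 15.0, 20.0, 25.0, 30.0, 35.0],
    "tip": [1.5, 2.6, 2.9, 4.2, 4.4, 5.9],
    "size": [1, 2, 2, 3, 2, 4],
})

# Pairwise Pearson correlations between the numeric columns
corr = demo.corr()

# Pin the color scale to [-1, 1] so colors are comparable across figures
ax = sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm", vmin=-1, vmax=1)
```

Note that fmt=".2f" is used here because correlations are floats; fmt="d" only works for integer grids such as the flights table.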
Pair & Joint Plots
# Pair plot
sns.pairplot(tips, hue="sex")
plt.show()
# Joint plot
sns.jointplot(x="total_bill", y="tip", data=tips, kind="scatter")
plt.show()
Pair plots show pairwise relationships; joint plots combine scatter and marginal distributions.
Customization
# Themes
sns.set_style("whitegrid")
# Palette
sns.set_palette("pastel")
# Figure size
plt.figure(figsize=(8,5))
sns.boxplot(x="day", y="total_bill", data=tips)
plt.show()
Use sns.set_theme() and sns.color_palette() for consistent styling across plots.
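As a short sketch of the two helpers: set_theme applies a style and palette globally, while color_palette returns the colors themselves so they can be reused elsewhere. The choice of three colors is arbitrary.

```python
import seaborn as sns

# Apply a style and a default palette to all subsequent plots
sns.set_theme(style="whitegrid", palette="pastel")

# Retrieve three colors from the same palette as RGB tuples
palette = sns.color_palette("pastel", n_colors=3)

# Hex strings are convenient for passing colors to other tools
hex_codes = palette.as_hex()
```

Calling set_theme once at the top of a script or notebook keeps every figure visually consistent.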
Advanced Topics
- FacetGrid for multi-plot grids
- Relational plots: sns.relplot
- Time series visualization
- Integration with Matplotlib for complex plots
# FacetGrid example
g = sns.FacetGrid(tips, col="sex", row="smoker")
g.map(sns.scatterplot, "total_bill", "tip")
plt.show()
Facet grids scale comparisons across categories, and relational plots handle long-form data elegantly.