Visualisation is how you communicate analytical findings to others and how you discover things you would never see in a table. Python offers two complementary tools: Matplotlib for precise, low-level control over every element of a plot, and Seaborn for high-level statistical graphics that are beautiful by default and integrate directly with Pandas DataFrames.
The Matplotlib Object Model
Every Matplotlib plot has a hierarchy: a Figure contains one or more Axes; each Axes holds the actual plot elements (lines, bars, labels). Understanding this object model is essential for producing multi-panel figures:
```python
import matplotlib.pyplot as plt
import numpy as np

# Explicit object-oriented API: always prefer this over plt.* functions
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(12, 8))
x = np.linspace(0, 2 * np.pi, 200)

# Top-left panel
axes[0, 0].plot(x, np.sin(x), color="#2196F3", linewidth=2)
axes[0, 0].set_title("Sine Wave")
axes[0, 0].set_xlabel("x")
axes[0, 0].set_ylabel("sin(x)")

# Top-right panel
axes[0, 1].plot(x, np.cos(x), color="#E91E63", linestyle="--")
axes[0, 1].set_title("Cosine Wave")

# The bottom row (axes[1, 0] and axes[1, 1]) is left empty in this example

fig.suptitle("Trigonometric Functions", fontsize=16, fontweight="bold")
fig.tight_layout()
fig.savefig("trig_functions.png", dpi=150, bbox_inches="tight")
plt.show()
```
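To make the Figure → Axes → plot elements hierarchy concrete, the following minimal sketch builds a small two-panel figure and then walks the containers. The panel contents and variable names are illustrative assumptions, not part of the example above.

```python
import matplotlib.pyplot as plt
import numpy as np

fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(8, 3))
x = np.linspace(0, 2 * np.pi, 50)

# plot() returns a list of Line2D artists; unpack the single line
line, = axes[0].plot(x, np.sin(x))
axes[1].bar(["a", "b", "c"], [3, 1, 2])

# The Figure keeps a flat list of the Axes it contains
print(len(fig.axes))           # 2
print(fig.axes[0] is axes[0])  # True

# Each Axes keeps the artists that were drawn on it
print(len(axes[0].lines))      # 1 (the sine curve)
print(len(axes[1].patches))    # 3 (one Rectangle per bar)

# Every artist knows which Axes it belongs to
print(line.axes is axes[0])    # True
```

Because each element is reachable through its parent, any part of a multi-panel figure can be restyled after it has been drawn, for example with `line.set_linewidth(3)`.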
Histograms — Distribution Shape
Histograms reveal the shape of a distribution: is it normal, skewed, bimodal, or truncated?
```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# The "tips" example dataset is fetched by Seaborn and returned as a DataFrame
df = sns.load_dataset("tips")

fig, axes = plt.subplots(1, 2, figsize=(12, 4))

# Matplotlib histogram
axes[0].hist(df["total_bill"], bins=30, edgecolor="white", color="#42A5F5", alpha=0.8)
axes[0].set_xlabel("Total Bill ($)")
axes[0].set_ylabel("Count")
axes[0].set_title("Distribution of Total Bill (Matplotlib)")

# Seaborn histogram with KDE overlay
sns.histplot(data=df, x="total_bill", bins=30, kde=True, ax=axes[1])
axes[1].set_title("Distribution of Total Bill (Seaborn)")

fig.tight_layout()
plt.show()
```
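The shape a histogram suggests can also be checked numerically. The sketch below is an illustration on synthetic data: the gamma-distributed sample is a hypothetical stand-in for a right-skewed column such as total_bill, and the Freedman-Diaconis rule is only one of several ways to choose a bin count.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)

# Hypothetical right-skewed sample (an assumption, not the tips data)
values = pd.Series(rng.gamma(shape=2.0, scale=10.0, size=500))

# Equal-width binning with a fixed bin count, as in hist(bins=30)
counts, edges = np.histogram(values, bins=30)
print(counts.sum())   # 500: every observation falls in exactly one bin
print(len(edges))     # 31: n bins are delimited by n + 1 edges

# Data-driven alternative: Freedman-Diaconis bin edges
fd_edges = np.histogram_bin_edges(values, bins="fd")
print(len(fd_edges) - 1)   # number of bins the rule suggests

# Positive skewness and mean > median both indicate a long right tail
print(values.skew())
print(values.mean() > values.median())
```

Too few bins can hide a second mode and too many turn the plot into noise, so it is worth trying more than one setting before drawing conclusions about shape.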
Box Plots — Comparing Distributions
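Box plots show the median, the interquartile range (IQR), whiskers that by default extend to the most extreme data points within 1.5×IQR of the quartiles, and outliers drawn as individual points beyond the whiskers. They are ideal for comparing a numeric variable across categories: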
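```python
# Reuses df (the tips dataset), plt and sns from the histogram example
day_order = ["Thur", "Fri", "Sat", "Sun"]
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Seaborn box plot
# Seaborn 0.13+ warns when palette is passed without hue, so map hue to the
# same column and hide the redundant legend (legend= needs Seaborn 0.13+)
sns.boxplot(data=df, x="day", y="total_bill", hue="day", palette="Set2",
            order=day_order, legend=False, ax=axes[0])
axes[0].set_title("Total Bill by Day")

# Violin plot: shows the full distribution shape, not just quartiles
sns.violinplot(data=df, x="day", y="total_bill", hue="day", palette="Set2",
               order=day_order, inner="quartile", legend=False, ax=axes[1])
axes[1].set_title("Total Bill by Day (Violin)")

fig.tight_layout()
plt.show()
```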
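The quantities a box plot draws can be reproduced by hand, which helps when deciding whether a point really is an outlier. The following sketch uses a small made-up sample (an assumption for illustration) and cross-checks the result against Matplotlib's own helper, `matplotlib.cbook.boxplot_stats`.

```python
import numpy as np
from matplotlib import cbook

# Small illustrative sample with one obviously extreme value
data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 30], dtype=float)

q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1
print(q1, median, q3, iqr)        # 3.25 5.5 7.75 4.5

# Fences: the furthest a whisker is allowed to reach
lower_fence = q1 - 1.5 * iqr      # -3.5
upper_fence = q3 + 1.5 * iqr      # 14.5

# Whiskers stop at the most extreme data points inside the fences
inside = data[(data >= lower_fence) & (data <= upper_fence)]
whisker_low, whisker_high = inside.min(), inside.max()
print(whisker_low, whisker_high)  # 1.0 9.0

# Anything outside the fences is drawn as an individual outlier point
outliers = data[(data < lower_fence) | (data > upper_fence)]
print(outliers)                   # [30.]

# Cross-check against the statistics Matplotlib computes for its box plots
stats = cbook.boxplot_stats(data, whis=1.5)[0]
print(stats["whislo"], stats["whishi"], stats["fliers"])
```

Note that the upper whisker ends at 9, the largest observed value inside the fence, rather than at the fence value of 14.5 itself.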
| Chart | Best for |
|---|---|
| Box plot | Comparing medians and spread; outlier detection |
| Violin plot | Comparing full distribution shapes |
| Strip/swarm plot | Small datasets; show individual points |
| Bar chart | Comparing point estimates (means/totals) |
| Histogram | Single distribution; shape and spread |
Scatter Plots — Relationships Between Variables
```python
fig, ax = plt.subplots(figsize=(8, 6))

# Colour-encode a third variable using hue
sns.scatterplot(
    data=df, x="total_bill", y="tip",
    hue="time", style="smoker", size="size",
    sizes=(40, 200), alpha=0.7, palette="deep", ax=ax
)

# Add a regression line to show the trend
sns.regplot(data=df, x="total_bill", y="tip", scatter=False,
            color="grey", line_kws={"linestyle": "--"}, ax=ax)

ax.set_title("Tip vs Total Bill")
ax.set_xlabel("Total Bill ($)")
ax.set_ylabel("Tip ($)")
plt.show()
```
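A trend line is easier to report when it comes with numbers. By default `regplot` draws a straight least-squares fit, and a comparable fit can be computed directly with NumPy. The sketch below uses synthetic data; the assumed relationship (a tip of about 15% plus a fixed amount) is hypothetical and only there so the result can be sanity-checked.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Hypothetical data: tip is roughly 1.00 + 15% of the bill, plus noise
bill = rng.uniform(5, 50, size=200)
tip = 1.0 + 0.15 * bill + rng.normal(0, 1.0, size=200)

# Degree-1 polynomial fit returns [slope, intercept]
slope, intercept = np.polyfit(bill, tip, deg=1)

# Pearson correlation measures how tightly the points follow the line
r = np.corrcoef(bill, tip)[0, 1]

print(f"tip ≈ {intercept:.2f} + {slope:.3f} × total_bill (r = {r:.2f})")
```

On real data the slope and correlation belong in the figure caption or accompanying text, since a dashed line alone does not tell the reader how strong the relationship is.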
For quick pairwise scatter plots across many numeric columns, Seaborn's sns.pairplot() draws every column against every other in a single grid.
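Heatmaps condense pairwise relationships into a colour-coded grid of numbers, and work equally well for correlation matrices and pivot tables:
```python
# Correlation heatmap
corr = df[["total_bill", "tip", "size"]].corr()
fig, ax = plt.subplots(figsize=(6, 5))
sns.heatmap(
    corr, annot=True, fmt=".2f", cmap="coolwarm",
    vmin=-1, vmax=1, square=True, linewidths=0.5, ax=ax
)
ax.set_title("Correlation Matrix")
plt.show()

# Pivot table heatmap, e.g. average tip by day and time
# observed=False keeps the long-standing behaviour for categorical columns
# and avoids the deprecation warning raised by recent pandas versions
pivot = df.pivot_table(values="tip", index="day", columns="time",
                       aggfunc="mean", observed=False)
fig, ax = plt.subplots(figsize=(6, 4))
sns.heatmap(pivot, annot=True, fmt=".2f", cmap="YlOrRd", ax=ax)
ax.set_title("Average Tip by Day and Time")
plt.show()
```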
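A correlation matrix is symmetric, so half of the cells in a correlation heatmap repeat the other half. One common refinement, sketched here on synthetic data (the columns and their relationships are assumptions for illustration), is to hide the upper triangle with the `mask` argument of `sns.heatmap`.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(seed=3)
n = 200

# Hypothetical numeric columns loosely modelled on the tips data
total_bill = rng.uniform(5, 50, size=n)
demo = pd.DataFrame({
    "total_bill": total_bill,
    "tip": 1.0 + 0.15 * total_bill + rng.normal(0, 1.0, size=n),
    "size": rng.integers(1, 7, size=n),
})
corr = demo.corr()

# True marks cells to hide: everything strictly above the diagonal
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)

fig, ax = plt.subplots(figsize=(5, 4))
sns.heatmap(
    corr, mask=mask, annot=True, fmt=".2f", cmap="coolwarm",
    vmin=-1, vmax=1, square=True, linewidths=0.5, ax=ax
)
ax.set_title("Correlation Matrix (lower triangle)")
fig.tight_layout()
```

Pinning `vmin=-1` and `vmax=1` keeps the diverging colour map centred on zero, so the same colour always means the same correlation strength across different charts.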
Global Styling
Consistent visual style makes analysis reports look professional:
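```python
import matplotlib.pyplot as plt
import seaborn as sns

# Apply a Seaborn theme globally
sns.set_theme(style="whitegrid", palette="deep", font_scale=1.2)

# Alternative: use a Matplotlib style sheet instead of set_theme().
# Pick one approach; whichever call runs last overrides the earlier one.
# The "seaborn-v0_8-*" style names are available in Matplotlib 3.6 and later.
# plt.style.use("seaborn-v0_8-whitegrid")

# Custom palette
custom_colors = ["#264653", "#2A9D8F", "#E9C46A", "#F4A261", "#E76F51"]
sns.set_palette(custom_colors)

# Increase resolution for publication and remove the top and right spines
plt.rcParams.update({
    "figure.dpi": 150,
    "axes.spines.top": False,
    "axes.spines.right": False
})
```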
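Global settings affect every later chart in the session. When only one figure needs a different look, the styling can be scoped with a context manager instead. The sketch below shows two options, `plt.rc_context` and `sns.axes_style`; the specific settings chosen are illustrative assumptions.

```python
import matplotlib.pyplot as plt
import seaborn as sns

before = plt.rcParams["axes.spines.top"]

# Temporarily override rcParams; they are restored when the block exits
with plt.rc_context({"axes.spines.top": False, "axes.spines.right": False}):
    inside = plt.rcParams["axes.spines.top"]
    fig, ax = plt.subplots(figsize=(4, 3))
    ax.plot([0, 1, 2], [0, 1, 4])
    # Spine visibility is read from rcParams when the Axes is created
    top_visible = ax.spines["top"].get_visible()

after = plt.rcParams["axes.spines.top"]
print(inside, top_visible)   # False False
print(after == before)       # True: the global setting is untouched

# Seaborn offers the same pattern for its named styles
grid_before = plt.rcParams["axes.grid"]
with sns.axes_style("whitegrid"):
    fig2, ax2 = plt.subplots(figsize=(4, 3))
    ax2.plot([0, 1, 2], [4, 1, 0])
grid_after = plt.rcParams["axes.grid"]
print(grid_after == grid_before)  # True
```

The Axes must be created inside the `with` block for the temporary style to take effect, because most style settings are applied at creation time rather than at draw time.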
Summary
- Matplotlib's object model (Figure → Axes → plot elements) gives full control; use the explicit OO API (fig, ax = plt.subplots()) rather than plt.* convenience functions in production code.
- Seaborn's statistical chart functions (histplot, boxplot, violinplot, scatterplot) take a DataFrame through the data argument and a hue parameter for automatic colour-encoding by category; heatmap instead takes a 2D matrix such as a correlation table or pivot table.
- Histograms reveal distribution shape; box plots and violin plots compare distributions across categories; scatter plots expose bivariate relationships; heatmaps show correlation matrices and pivot tables.
- Apply sns.set_theme() or plt.style.use() at the top of your notebook to establish a consistent visual identity across all charts.
- Call fig.tight_layout() before plt.show() or fig.savefig() to prevent labels from being clipped or overlapping.