📊 Mastering Data Visualization in Python — Matplotlib, 3D Plots & Pandas Plotting


Hey everyone! 👋
Welcome back to my tech-learning vlog! Today’s post is special because I spent hours learning Matplotlib, 3D plotting, and Pandas visualization, and I’m going to tell you everything I learned — not just highlights. This is the complete journey from start to finish. So if you want to understand data visualization in Python deeply, but in a fun, relatable way… grab a snack 🍿 and enjoy!

🔰 The Beginning: Importing & Understanding Matplotlib

Why Matplotlib?

Matplotlib is the foundation of Python plotting. It’s low-level (so you get full control) and forms the basis of higher-level libraries like Seaborn and Pandas’ plotting API. Start with:

import matplotlib.pyplot as plt

Before plotting, I learned something important:
Not all data is the same. Understand your data first.
Different plots serve different data types:

Data Type | Example | Best Plot
Continuous | Age, Temperature | Line, Scatter, Histogram
Categorical | Country, Gender | Bar, Pie
Binned | Age groups | Histogram

Once that clicked, plotting became a decision — not guessing.

📈 2D Line Plot — When, How & Why

Purpose

Show trends over an ordered dimension (time, index) or continuous variable. Ideal for time-series, fitted model lines, or comparing series.

Minimal example

import numpy as np
x = np.linspace(0, 10, 200)
y = np.sin(x)
plt.plot(x, y)
plt.title("Sine wave")
plt.xlabel("x")
plt.ylabel("sin(x)")
plt.grid(True)
plt.show()

Practical tips

  • Use plt.plot(x, y, label='label', color='C0', linestyle='--', linewidth=2) to style.

  • Add plt.legend() when plotting multiple series.

  • Use ax = plt.gca() and ax.set_xlim() / ax.set_ylim() for precise control.

  • For time-series, convert the x-axis to datetime and use matplotlib.dates formatting.
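Here's a minimal sketch of those tips on a datetime axis. The data is made up, and the monthly tick spacing and "Jan 2024" label format are just my choices for illustration:

```python
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Hypothetical data: 90 daily readings starting 2024-01-01
dates = np.arange("2024-01-01", "2024-03-31", dtype="datetime64[D]")
values = np.sin(np.arange(len(dates)) / 10) + 2

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(dates, values, label="reading", color="C0", linestyle="--", linewidth=2)

# One major tick per month, labelled like "Jan 2024"
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b %Y"))

ax.set_ylim(0, 4)      # precise control over the visible y-range
ax.legend()
fig.autofmt_xdate()    # tilt the date labels so they don't overlap
```

Call plt.show() afterwards to display the figure.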

Common pitfalls

  • Connecting points that shouldn’t be connected (ordinal vs categorical x-axis).

  • Plotting too many series without color/label clarity.

The graph didn’t just show values anymore — it told a story.

➕ Plotting a Function

Concept

Plot functions like y = f(x) directly by computing y over a dense x.

x = np.linspace(-10, 10, 400)
y = 2*x + 5
plt.plot(x, y)

Tips

  • Use dense sampling for smooth curves.

  • If the function has singularities or discontinuities, split domains (avoid vertical connecting lines).
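One way to handle a singularity (a sketch, not the only approach) is to replace the values near it with NaN, since Matplotlib breaks a line wherever it meets a NaN. The function 1/x and the cutoff of 20 are assumptions I picked for the example:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-2, 2, 401)
with np.errstate(divide="ignore"):
    y = 1.0 / x                      # singularity at x = 0

# Values near the singularity become NaN, so no vertical line is drawn across it
y_masked = np.where(np.abs(y) > 20, np.nan, y)

fig, ax = plt.subplots()
ax.plot(x, y_masked)
ax.set_ylim(-20, 20)
ax.set_title("y = 1/x without the spurious vertical line")
```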

It felt like the graph was reacting to the equation I wrote. — Very satisfying 😌

🐼 Plotting from a Pandas DataFrame

Why

Pandas simplifies plotting: it wraps Matplotlib and infers axes from DataFrame/Series.

df.plot(kind="line", x="date", y=["sales", "profit"], figsize=(10,5))

Tips

  • Use df.plot(subplots=True) for multiple axes.

  • For large dataframes, downsample or use rolling averages to reduce clutter.
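A small sketch of both tips together, using synthetic data (the column names and the 30-day window are assumptions for illustration):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
df = pd.DataFrame(
    {"sales": rng.normal(100, 10, 365), "profit": rng.normal(20, 5, 365)},
    index=pd.date_range("2024-01-01", periods=365, freq="D"),
)

# A 30-day rolling mean smooths daily noise; the first 29 rows are NaN and get dropped
smooth = df.rolling(30).mean().dropna()

# subplots=True gives each column its own axes; sharex keeps the dates aligned
axes = smooth.plot(subplots=True, sharex=True, figsize=(10, 5))
```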

At this point, I started feeling like a real data analyst 😂

🔗 Multiple Graphs on One Plot

Why

Compare series directly by overlaying them:

plt.plot(x, y1, label='A')
plt.plot(x, y2, label='B')
plt.legend()

Tips

  • Use contrasting colors and markers.

  • Consider dual axes (with ax.twinx()) only when scales differ, but comparison is meaningful — otherwise it confuses readers.
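For the dual-axis case, a sketch with made-up temperature and rainfall numbers could look like this. Coloring each y-label to match its line is my own habit, not a requirement:

```python
import numpy as np
import matplotlib.pyplot as plt

months = np.arange(1, 13)
temperature = 15 + 10 * np.sin((months - 4) * np.pi / 6)   # degrees C (synthetic)
rainfall = 80 + 40 * np.cos((months - 1) * np.pi / 6)      # mm (synthetic)

fig, ax_temp = plt.subplots()
ax_rain = ax_temp.twinx()            # second y-axis sharing the same x-axis

ax_temp.plot(months, temperature, color="C3", marker="o", label="temperature")
ax_rain.plot(months, rainfall, color="C0", marker="s", label="rainfall")

ax_temp.set_xlabel("month")
ax_temp.set_ylabel("temperature (°C)", color="C3")
ax_rain.set_ylabel("rainfall (mm)", color="C0")

# Each axes keeps its own legend entries, so merge them by hand
lines = list(ax_temp.get_lines()) + list(ax_rain.get_lines())
ax_temp.legend(lines, [line.get_label() for line in lines], loc="upper left")
```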

Seeing multiple trends on a single graph helped me compare patterns instantly.

🎨 Customizing Graphs — The Glow-Up Stage

This was my favorite chapter!
I realized graphs don’t have to be boring.

Feature | Usage
Color | 'red', 'green', or '#FF5733'
Line width | linewidth=4
Line style | '--', '-.', ':'
Marker | 'o', '*', 's' (square)
Marker size | markersize=12

plt.plot(x, y, color="#1f77b4", lw=2, ls='--', marker='o', ms=6)

Best practices

  • Use colorblind-friendly palettes (viridis, cividis) for accessibility.

  • Limit marker frequency on dense data (plot(..., markevery=10)).
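To try both practices at once, here's a sketch that samples line colors from cividis and draws a marker on every 20th point only. The four shifted sine curves are placeholder data:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 200)
fig, ax = plt.subplots()

# Four evenly spaced colors taken from the cividis colormap
colors = plt.get_cmap("cividis")(np.linspace(0, 0.9, 4))
for k, color in enumerate(colors, start=1):
    ax.plot(x, np.sin(x + k / 2), color=color, marker="o", ms=5,
            markevery=20, label=f"shift {k / 2}")
ax.legend()
```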

Suddenly, graphs became beautiful — not just informative.

🏷️ Legends, Titles & Labels — Don’t Skip Them

A graph without context is just a drawing.

  • plt.title(), plt.xlabel(), plt.ylabel() — always provide context.

  • plt.legend(loc='best') — place legend intelligently.

  • Use ax.annotate() for inline annotations.

Example:

plt.title("Population vs Literacy")
plt.xlabel("Country")
plt.ylabel("Percent")
plt.legend()
plt.annotate("Notable spike", xy=(x0, y0), xytext=(x0+1, y0+10),
             arrowprops=dict(arrowstyle="->"))

One small change = huge clarity.

📍 Setting Axis Limits

Plotting began feeling precise:

  • plt.xlim(a, b) and plt.ylim(a, b) to focus.

  • Use plt.yscale('log') or ax.set_xscale('log') for log plots.

  • For percentage axes, set formatter: from matplotlib.ticker import PercentFormatter.

plt.xlim(0, 50)
plt.ylim(10, 100)
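A sketch combining a log x-axis with a percentage y-axis. The curve is invented, and decimals=0 is my choice so the ticks read "50%" rather than "50.0%":

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter

x = np.logspace(0, 3, 50)                 # 1 to 1000
share = x / (x + 50)                      # a fraction between 0 and 1

fig, ax = plt.subplots()
ax.plot(x, share)
ax.set_xscale("log")                      # x spans three orders of magnitude
ax.set_ylim(0, 1)
# xmax=1.0 tells the formatter that a value of 1.0 means 100%
ax.yaxis.set_major_formatter(PercentFormatter(xmax=1.0, decimals=0))
```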

Now I had control.

🌐 Adding Grid & Show

Grid = easy readability
Show = final reveal
Simple but powerful:

  • plt.grid(True) improves readability for numeric axes.

  • plt.show() renders the figure. In notebooks, %matplotlib inline embeds static images, while %matplotlib widget (which needs the ipympl package) gives interactive figures.

plt.grid()
plt.show()

🔵 Scatter Plots — When Relationships Become Visible

Scatter plots became one of my best friends.

plt.scatter(x, y)

I could see correlations clearly.

Then I learned to style them:

plt.scatter(x, y, marker="*", s=120, color="purple")

From DataFrames? Even easier:

df.plot.scatter(x="Height", y="Weight")

Advanced features

  • Color by a variable: c=values, cmap='viridis' and plt.colorbar().

  • Size by variable: s=size_array (note that s is the marker area in points squared, not the diameter).

  • Use alpha for transparency to reduce overplotting.
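Putting the three features together, as a sketch on synthetic height/weight/age data (all numbers are made up):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
height = rng.normal(170, 10, 300)                       # cm
weight = 0.9 * height - 85 + rng.normal(0, 8, 300)      # kg
age = rng.uniform(18, 70, 300)                          # years

fig, ax = plt.subplots()
sc = ax.scatter(height, weight,
                c=age, cmap="viridis",     # color encodes a third variable
                s=age,                     # marker area in points squared
                alpha=0.6)                 # transparency softens overlap
fig.colorbar(sc, ax=ax, label="age (years)")
ax.set_xlabel("height (cm)")
ax.set_ylabel("weight (kg)")
```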

Using plt.plot as a scatter

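plt.plot(x, y, 'o') draws markers only, because a format string with a marker and no line style leaves the line out. The points do get connected if you write 'o-' or pass marker='o' as a keyword while keeping the default line style.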

Pitfalls

  • Overplotting: for dense data use hexbin, 2D histogram, or transparency.

  • Misinterpreting color and size scales — always include legends/colorbars.
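When a plain scatter turns into a blob, hexbin is one option. Here's a sketch with 50,000 synthetic points; gridsize=40 is an arbitrary choice you would tune for your data:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
x = rng.normal(size=50_000)
y = 0.6 * x + rng.normal(scale=0.8, size=50_000)

fig, ax = plt.subplots()
# Each hexagon is colored by how many points fall inside it;
# mincnt=1 leaves empty hexagons blank
hb = ax.hexbin(x, y, gridsize=40, cmap="viridis", mincnt=1)
fig.colorbar(hb, ax=ax, label="points per bin")
```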

So the difference is:

plt.plot | plt.scatter
Connects by default | No connection
Good for trends | Good for correlation

A mental light bulb moment 💡

🟦 Bar Charts — The Categorical Champions

Vertical bars:

plt.bar(categories, values)

Horizontal bars:

plt.barh(categories, values)

Multiple bar charts required side-by-side spacing — once I figured that out, everything clicked.

Grouped / Multiple bars

Compute offsets for categories:

width = 0.35
x = np.arange(len(categories))
plt.bar(x - width/2, vals_a, width, label='A')
plt.bar(x + width/2, vals_b, width, label='B')
plt.xticks(x, categories)

📍 X-Tick Rotation Fix

Labels were overlapping, but the simple fix was:

plt.xticks(rotation=45)
plt.tight_layout()

📚 Stacked Bar Charts

My first “dashboard-level” plot:

plt.bar(x, a)
plt.bar(x, b, bottom=a)

Tips

  • For many categories (>10), consider horizontal bars or reorder by size.

  • Use error bars yerr for uncertainty.
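A sketch of a stacked bar chart with error bars on the upper series. The quarter labels, values and uncertainties are all invented for the example:

```python
import numpy as np
import matplotlib.pyplot as plt

categories = ["Q1", "Q2", "Q3", "Q4"]      # hypothetical labels
a = np.array([20, 35, 30, 35])
b = np.array([25, 32, 34, 20])
b_err = np.array([2, 3, 4, 1])             # made-up uncertainty for series B

x = np.arange(len(categories))
fig, ax = plt.subplots()
ax.bar(x, a, label="A")
# bottom=a stacks B on top of A; yerr draws error bars at the top of the stack
ax.bar(x, b, bottom=a, yerr=b_err, capsize=4, label="B")
ax.set_xticks(x)
ax.set_xticklabels(categories)
ax.legend()
```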

📊 Histograms — Understanding Data Distribution

plt.hist(data, bins=8)

This helped me see how data is spread, not just its average.

Then I discovered log scale:

plt.yscale("log")

Perfect for skewed data.

Key considerations

  • Choose bins carefully — too few hides structure, too many adds noise.

  • Normalize using density=True to overlay PDFs.

  • Use log=True for heavy-tailed distributions.
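To see what density=True buys you, here's a sketch that overlays a normal PDF on a histogram. The data is synthetic, and fitting a normal curve from the sample mean and standard deviation is an assumption that only makes sense for roughly bell-shaped data:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
data = rng.normal(loc=50, scale=10, size=2_000)

fig, ax = plt.subplots()
# density=True rescales bar heights so the total bar area is 1
heights, edges, _ = ax.hist(data, bins=30, density=True, alpha=0.5, label="data")

# Normal PDF built from the sample mean and standard deviation
mu, sigma = data.mean(), data.std()
grid = np.linspace(edges[0], edges[-1], 300)
pdf = np.exp(-0.5 * ((grid - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
ax.plot(grid, pdf, label="normal PDF")
ax.legend()
```

Because both the bars and the curve are on a density scale, they can share one y-axis.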

Alternatives

  • Kernel density estimate (seaborn.kdeplot) for smooth density.

  • Boxplot or violin plot to summarize distribution succinctly.

🥧 Pie Charts — Visualization with Attitude 😎

When appropriate

Show parts of a whole (few categories, not many). Use sparingly — often a bar chart is clearer.

Then came the cool features:

Feature | Purpose
autopct | Show %
explode | Highlight slice
shadow | 3D-like effect
colors | Custom palette

Example

plt.pie(values, labels=labels, autopct='%1.1f%%', explode=[0,0.1,0], shadow=True)

Pitfalls

  • Hard to compare slice angles; avoid many slices (>6).

  • Don’t use for negative values.

Pie charts instantly made reports look premium.

🎨 Styles & Saving Figures

Style presets blew my mind:

plt.style.use("ggplot")

One line = entire theme change.

And saving results made my work feel official:

plt.savefig("graph.png", dpi=300)

  • Use dpi=300 for print quality.

  • bbox_inches='tight' avoids clipped labels.

🧊 3D Plots — The WOW Moment

I felt like I was entering a whole different universe:

fig = plt.figure()
ax = fig.add_subplot(projection='3d')

# On older Matplotlib (before 3.2), import Axes3D first to register the 3D projection:
# from mpl_toolkits.mplot3d import Axes3D
# ax = fig.add_subplot(111, projection='3d')

Then came:

  • 3D scatter

  • 3D line

  • 3D surface

  • Contour

  • Filled contour

  • Heatmaps

I was literally rotating graphs like a game 😭🔥

Deep Dive: 3D Scatter · 3D Line · 3D Surface · Contour · Filled-contour · Heatmap (Matplotlib)

Visualizing data in two dimensions is often enough — but sometimes you need an extra dimension (or a dense grid of values) to truly understand a phenomenon. Below I walk through six closely related visualization types, focusing on how to create them in Matplotlib, how to tune them, and when to use something else.

1) 3D Scatter — Visualizing point clouds & clusters in 3D

What it shows:
Discrete points in three dimensions (x, y, z). Great for inspecting spatial point-clouds, clustering structure, multi-feature relationships, or simply exploring 3D distributions.

Data shape:
Three 1D arrays (x, y, z) of the same length. You can also supply a fourth array for colors or sizes.

Minimal example (Matplotlib):

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # no direct use, but required for older mpl

np.random.seed(0)
x = np.random.randn(300)
y = np.random.randn(300)
z = np.random.randn(300)
c = np.sqrt(x**2 + y**2 + z**2)  # color by magnitude
s = (c - c.min()) / np.ptp(c) * 50 + 10  # size mapped from c (np.ptp works on NumPy 2.x, where ndarray.ptp() was removed)

fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(111, projection='3d')
sc = ax.scatter(x, y, z, c=c, s=s, cmap='viridis', alpha=0.8)
ax.set_xlabel('X'); ax.set_ylabel('Y'); ax.set_zlabel('Z')
fig.colorbar(sc, label='magnitude')
ax.view_init(elev=20, azim=30)  # camera angle
plt.show()

Customization tips:

  • cmap controls the colormap; alpha for transparency.

  • s sets marker size (can be scalar or array).

  • marker supports markers like 'o', '^', etc. (3D markers are still 2D glyphs projected into 3D).

  • ax.view_init(elev, azim) sets the camera elevation and azimuth.

  • Add edgecolors='k' for marker outlines (careful with large scatter sizes — performance hit).

Pitfalls & tips:

  • Overplotting: large point clouds can hide structure. Downsample or use alpha blending.

  • Depth perception: 3D on a 2D screen reduces depth cues; rotate the plot interactively to inspect.

  • Performance: many thousands of points → slow. Consider subsampling or plotting with specialized tools (e.g., datashader or Plotly for faster WebGL rendering).

When to use alternatives:

  • For interactive exploration or huge point clouds, use Plotly (WebGL) or datashader.

  • For static publication-quality 3D, Matplotlib is fine for moderate-sized datasets.

2) 3D Line — Plotting trajectories, time-series in 3D

What it shows:
A connected line through 3D space. Useful for trajectories (movement in space), parametric curves, or linking ordered observations.

Data shape:
Three 1D arrays (x(t), y(t), z(t)), usually ordered by parameter t (time, index, angle).

Minimal example:

import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0, 10, 500)
x = np.sin(t)
y = np.cos(2*t)
z = t  # time as the z-axis

fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(111, projection='3d')
ax.plot(x, y, z, lw=2, label='trajectory')
ax.scatter(x[::50], y[::50], z[::50], color='red')  # sample points
ax.set_xlabel('x'); ax.set_ylabel('y'); ax.set_zlabel('z')
ax.legend()
ax.view_init(30, 120)
plt.show()

Customization tips:

  • lw for line width, linestyle for dashed or dotted lines.

  • Combine with ax.scatter to highlight key points or timestamps.

  • You can color the line by a fourth variable by plotting segments or using Line3DCollection (more advanced).

Pitfalls & tips:

  • If your line intersects itself heavily, perspective may hide parts; rotate to inspect.

  • To color a line along its parameter value, you’ll need to break the line into segments and use a Line3DCollection.
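Here's a sketch of that segment approach, reusing the same curve as the example above. Coloring by t is my choice for illustration; any array with one value per segment would do:

```python
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d.art3d import Line3DCollection

t = np.linspace(0, 10, 500)
x, y, z = np.sin(t), np.cos(2 * t), t

# Build an array of shape (n_segments, 2, 3): each segment joins two neighbours
points = np.column_stack([x, y, z]).reshape(-1, 1, 3)
segments = np.concatenate([points[:-1], points[1:]], axis=1)

lc = Line3DCollection(segments, cmap="viridis", linewidth=2)
lc.set_array(t[:-1])                 # one color value per segment

fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(111, projection="3d")
ax.add_collection3d(lc)
# A collection added this way may not trigger autoscaling, so set limits by hand
ax.set_xlim(x.min(), x.max())
ax.set_ylim(y.min(), y.max())
ax.set_zlim(z.min(), z.max())
fig.colorbar(lc, ax=ax, label="t")
```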

When to use alternatives:

  • For interactive animation of trajectories, consider using matplotlib.animation or Plotly animations.

3) 3D Surface — Continuous surface z = f(x,y)

What it shows:
A continuous surface defined over a grid in the x–y plane with height z. Perfect for visualizing functions of two variables, topography, or response surfaces.

Data shape:
Two 2D grids X, Y (from np.meshgrid) and a matching 2D Z = f(X,Y).

Minimal example:

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

# create grid
x = np.linspace(-3, 3, 150)
y = np.linspace(-3, 3, 150)
X, Y = np.meshgrid(x, y)
Z = np.sin(np.sqrt(X**2 + Y**2)) / (np.sqrt(X**2 + Y**2) + 1e-8)

fig = plt.figure(figsize=(10,7))
ax = fig.add_subplot(111, projection='3d')

# surface plot
surf = ax.plot_surface(X, Y, Z, rstride=2, cstride=2, cmap='coolwarm',
                       linewidth=0, antialiased=True, alpha=0.95)

ax.set_xlabel('X'); ax.set_ylabel('Y'); ax.set_zlabel('Z')
fig.colorbar(surf, shrink=0.6, aspect=10, label='Z value')
ax.view_init(elev=30, azim=230)
plt.show()

Customization tips:

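  • cmap chooses the colormap. Note that shading='gouraud' is a pcolormesh option, not a plot_surface one; for plot_surface, tune the look with antialiased=True/False and shade=True/False.

  • rstride and cstride set the row and column sampling step. The newer rcount and ccount (default 50 each) cap the number of samples instead, so a large grid is downsampled unless you raise them. Use one pair or the other, not both.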

  • facecolors can accept an RGBA array for per-face coloring.

  • Use ax.plot_wireframe for a wireframe view (linewidths parameter).

  • ax.plot_surface(..., cmap='viridis') + fig.colorbar(...) is a common combo.

Pitfalls & tips:

  • Surface shading and lighting are limited in Matplotlib; for realistic lighting and materials, consider Mayavi or Plotly.

  • For large grids (e.g., 1000×1000), rendering will be slow. Downsample, or switch to a 2D view with pcolormesh or imshow.

When to use alternatives:

  • Use Plotly for interactive 3D surfaces (rotate smoothly in the browser).

  • Use specialized 3D packages for volumetric rendering or advanced shading.

4) Contour — Iso-lines of a continuous 2D field

What it shows:
Lines connecting points of equal z (iso-contours) over a 2D plane. Great for topographic maps, response surfaces, or level sets.

Data shape:
Same as surface: X, Y, Z grids.

Minimal example:

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-3, 3, 300)
y = np.linspace(-3, 3, 300)
X, Y = np.meshgrid(x, y)
Z = np.sin(X**2 + Y**2) / (X**2 + Y**2 + 1e-8)

plt.figure(figsize=(8,6))
cs = plt.contour(X, Y, Z, levels=12, cmap='viridis')
plt.clabel(cs, inline=True, fontsize=8)
plt.xlabel('X'); plt.ylabel('Y'); plt.title('Contour plot')
plt.colorbar(cs, label='Z value')
plt.show()

Customization tips:

  • levels can be an integer (auto-levels) or a sequence of explicit iso-values.

  • plt.clabel() prints labels on the contour lines; set fmt to format numbers.

  • linestyles and linewidths control contour appearance.

  • plt.contourf gives filled contours (see next section).

Pitfalls & tips:

  • Contours can be misleading if Z is noisy; smooth or increase grid resolution.

  • If Z has large dynamic range, choose levels or use norm=matplotlib.colors.LogNorm() for logarithmic scaling.
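A sketch of the log-scaled case. The field below is invented so that it stays strictly positive (LogNorm cannot handle zero or negative values), and the log-spaced levels are my own pick:

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm

x = np.linspace(-3, 3, 200)
y = np.linspace(-3, 3, 200)
X, Y = np.meshgrid(x, y)
Z = np.exp(-(X**2 + Y**2)) * 1e4 + 1          # strictly positive, spans about 4 decades

levels = np.logspace(0.5, 3.5, 7)             # explicit iso-values, evenly spaced in log10
fig, ax = plt.subplots()
cs = ax.contour(X, Y, Z, levels=levels, norm=LogNorm(), cmap="viridis")
ax.clabel(cs, inline=True, fontsize=8, fmt="%.0f")
fig.colorbar(cs, ax=ax, label="Z value")
```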

5) Filled Contour (contourf) — Colored regions between contours

What it shows:
A contour plot where areas between contours are filled with color. Use for smooth heat-like visualizations with explicit level regions.

Minimal example:

plt.figure(figsize=(8,6))
cf = plt.contourf(X, Y, Z, levels=15, cmap='plasma')
plt.colorbar(cf, label='Z value')
plt.xlabel('X'); plt.ylabel('Y'); plt.title('Filled Contour')
plt.show()

Customization tips:

  • levels controls how many discrete color bands you get.

  • Add alpha for transparency.

  • Combine contourf with contour(..., colors='k', linewidths=0.5) to outline regions.

  • For continuous-looking plots, use more levels (e.g., levels=100) but be mindful of performance.
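The fill-plus-outline combination can be sketched like this. The sin·cos field and the 11 levels are placeholders; the point is that both calls share one levels array so the outlines sit exactly on the band edges:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-3, 3, 200)
y = np.linspace(-3, 3, 200)
X, Y = np.meshgrid(x, y)
Z = np.sin(X) * np.cos(Y)

levels = np.linspace(-1, 1, 11)                # shared by fill and outline
fig, ax = plt.subplots()
cf = ax.contourf(X, Y, Z, levels=levels, cmap="plasma")
ax.contour(X, Y, Z, levels=levels, colors="k", linewidths=0.5)
fig.colorbar(cf, ax=ax, label="Z value")
```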

Pitfalls & tips:

  • When using categorical colormaps (e.g., 'tab10'), filled contours may look odd; prefer perceptually uniform colormaps (viridis, cividis, plasma).

  • contourf has no interpolation or shading argument; its look is governed by the levels and the grid resolution. If you need shading control, that lives in pcolormesh (shading='nearest' is available from Matplotlib 3.3).

6) Heatmap — Dense grid displayed as a colored image

What it shows:
A dense grid of values as a raster image — typically used for correlation matrices, image-like data, or dense 2D fields.

Variants: plt.imshow, plt.pcolormesh, or seaborn.heatmap (higher-level).

Minimal example (imshow):

plt.figure(figsize=(8,6))
plt.imshow(Z, extent=(x.min(), x.max(), y.min(), y.max()), origin='lower', aspect='auto', cmap='inferno')
plt.colorbar(label='Z value')
plt.xlabel('X'); plt.ylabel('Y'); plt.title('Heatmap (imshow)')
plt.show()

Using pcolormesh (recommended for non-regular meshes):

plt.figure(figsize=(8,6))
pcm = plt.pcolormesh(X, Y, Z, shading='auto', cmap='inferno')
plt.colorbar(pcm, label='Z value')
plt.xlabel('X'); plt.ylabel('Y'); plt.title('Heatmap (pcolormesh)')
plt.show()

Customization tips:

  • origin='lower' places array index [0, 0] at the bottom-left, so rows run upward like a normal y-axis (the default, 'upper', puts it at the top-left).

  • aspect='equal' preserves aspect ratio; 'auto' stretches to fill axes.

  • shading='auto' lets pcolormesh pick 'flat' when X and Y are one element larger than Z in each dimension, and 'nearest' when they have the same shape as Z.

  • Use masked arrays to hide invalid values: Z = np.ma.masked_invalid(Z).
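Here's a sketch of the masked-array tip. The field is a half-sphere that is undefined (NaN) outside a circle, and showing the masked cells in light gray via set_bad is my own addition:

```python
import copy
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-3, 3, 200)
y = np.linspace(-3, 3, 200)
X, Y = np.meshgrid(x, y)
with np.errstate(invalid="ignore"):
    Z = np.sqrt(4 - X**2 - Y**2)               # NaN outside the circle of radius 2

Zm = np.ma.masked_invalid(Z)                   # mask the NaN cells

cmap = copy.copy(plt.get_cmap("inferno"))      # copy so the registered colormap is untouched
cmap.set_bad("lightgray")                      # color used for masked cells
fig, ax = plt.subplots()
im = ax.imshow(Zm, extent=(x.min(), x.max(), y.min(), y.max()),
               origin="lower", cmap=cmap)
fig.colorbar(im, ax=ax, label="Z value")
```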

Pitfalls & tips:

  • imshow expects image-like arrays; watch the orientation and extents.

  • For very large arrays, use a subsampled image for performance.

  • Choose colormaps carefully: avoid jet for scientific data; prefer perceptually uniform ones (viridis, cividis, plasma).

Practical advice — color, scale & perception

  • Colormaps: use perceptually uniform colormaps (viridis, plasma, cividis) for scientific data. Reserve diverging colormaps (RdBu, seismic) for centered data (where zero is meaningful).

  • Colorbars: always include a colorbar for surfaces, contours, and heatmaps to give numeric context.

  • Normalization: use matplotlib.colors.Normalize() or LogNorm() for non-linear scaling.

  • Accessibility: consider color-blind friendly maps (e.g., cmap='cividis').

  • Levels: Pick explicit contour levels when you want consistent comparisons across multiple plots.
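For consistent comparisons across panels, one approach (a sketch with two invented fields of different amplitude) is to share a single Normalize instance and a single level set, using a diverging colormap because zero is meaningful here:

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

x = np.linspace(-3, 3, 100)
X, Y = np.meshgrid(x, x)
fields = [np.sin(X) * np.cos(Y), 0.5 * np.sin(X) * np.cos(Y)]   # different amplitudes

# One Normalize instance and one level set shared by both panels
norm = Normalize(vmin=-1, vmax=1)
levels = np.linspace(-1, 1, 21)

fig, axes = plt.subplots(1, 2, figsize=(10, 4), sharey=True)
for ax, Z in zip(axes, fields):
    cf = ax.contourf(X, Y, Z, levels=levels, norm=norm, cmap="RdBu")
fig.colorbar(cf, ax=axes, label="Z value")    # a single colorbar serves both panels
```

With shared scaling, the second panel looks visibly paler, which is exactly the difference you want readers to see.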

Performance & interactivity

  • Large datasets: Matplotlib can be slow with millions of points or huge grids. Options:

    • Downsample data (random sampling for scatter, coarsen grid for surfaces).

    • Use agg backend for static rendering; use WebGL-backed libraries (Plotly, Bokeh) for interactive, fast rendering.

    • For rasterized large surfaces, consider saving as an image with a lower resolution.

  • Interactivity: Matplotlib provides basic interactive rotation for 3D, but for publishable interactive visualizations, use Plotly (browser, WebGL) or tools like Mayavi for advanced 3D rendering and lighting.

Putting it together — combined figure example

Here is how you can present multiple related plots in one figure (3D surface + contour + heatmap):

import numpy as np
import matplotlib.pyplot as plt

# data
x = np.linspace(-3, 3, 200)
y = np.linspace(-3, 3, 200)
X, Y = np.meshgrid(x, y)
Z = np.sin(X**2 + Y**2) / (X**2 + Y**2 + 1e-8)

fig = plt.figure(constrained_layout=True, figsize=(14,6))

# 3D surface
ax1 = fig.add_subplot(1,3,1, projection='3d')
surf = ax1.plot_surface(X, Y, Z, cmap='coolwarm', linewidth=0, antialiased=True)
ax1.set_title('3D Surface')
ax1.view_init(35, 45)
fig.colorbar(surf, ax=ax1, shrink=0.6)

# filled contour
ax2 = fig.add_subplot(1,3,2)
cf = ax2.contourf(X, Y, Z, levels=30, cmap='viridis')
ax2.set_title('Filled Contour')
fig.colorbar(cf, ax=ax2)

# heatmap
ax3 = fig.add_subplot(1,3,3)
pcm = ax3.pcolormesh(X, Y, Z, shading='auto', cmap='inferno')
ax3.set_title('Heatmap (pcolormesh)')
fig.colorbar(pcm, ax=ax3)

plt.show()

Summary / When to use what

  • 3D Scatter: point clouds, clusters, spatial data. Use when individual points matter.

  • 3D Line: trajectories and parametric curves through 3D space.

  • 3D Surface: continuous fields z=f(x,y) — topography, response surfaces.

  • Contour / Filled Contour: show iso-values and region bands; label the lines with clabel.

  • Heatmap: dense sampled fields or matrices — image-like visualization (imshow/pcolormesh).

🐼 Pandas Plotting — The Fast Lane

Just when I had learned everything manually…
Pandas stepped in like:

df.plot()

One line… and boom — plot ready.

From scatter to bar to pie to stacked to histogram — Pandas did it all effortlessly.

Example 1 — multi-series time plot with smoothing & annotations

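import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'date': pd.date_range('2024-01-01', periods=200),
    'A': np.random.randn(200).cumsum()+50,
    'B': np.random.randn(200).cumsum()+30
}).set_index('date')

ax = df.plot(figsize=(12,5), linewidth=2)
# 7-day rolling mean, with suffixed names so the legend entries stay distinct
df.rolling(7).mean().add_suffix(' (7-day mean)').plot(ax=ax, linestyle='--', linewidth=1)
# annotate the actual maximum of series A
peak_date = df['A'].idxmax()
ax.annotate('peak', xy=(peak_date, df['A'].max()), xytext=(-50, 30),
            textcoords='offset points', arrowprops=dict(arrowstyle='->'))
plt.title('Series A vs B (with 7-day mean)')
plt.show()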

Example 2 — grouped bar + stacked proportions

# df2 is assumed to be a DataFrame with 'category', 'subcat' and 'value' columns
agg = df2.groupby(['category', 'subcat'])['value'].sum().unstack(fill_value=0)
agg.plot(kind='bar', stacked=True, figsize=(10,6))
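The snippet above stacks raw sums. To get stacked proportions, divide each row by its total first. A sketch with a small made-up df2 (the column names match the snippet; the values are invented):

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical data standing in for df2
df2 = pd.DataFrame({
    "category": ["a", "a", "b", "b", "c", "c"],
    "subcat":   ["x", "y", "x", "y", "x", "y"],
    "value":    [10, 30, 20, 20, 45, 5],
})

agg = df2.groupby(["category", "subcat"])["value"].sum().unstack(fill_value=0)
# Divide each row by its total so every stacked bar sums to 1
shares = agg.div(agg.sum(axis=1), axis=0)
ax = shares.plot(kind="bar", stacked=True, figsize=(8, 5))
ax.set_ylabel("share of category total")
```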

Example 3 — scatter with color & size

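# here df is assumed to hold iris-style columns: sepal_length, sepal_width, petal_length, petal_width
df.plot.scatter(x='sepal_length', y='sepal_width', c='petal_length', s=df['petal_width']*20, cmap='viridis', figsize=(8,6))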

Example 4 — histogram + KDE via pandas (the KDE needs SciPy installed)

ax = df['value'].plot(kind='hist', bins=30, density=True, alpha=0.5)
df['value'].plot(kind='kde', ax=ax)
plt.show()

Example 5 — pairwise scatter matrix

pd.plotting.scatter_matrix(df[['A','B','C','D']], figsize=(10,10), diagonal='kde')

This one blew my mind 🧠: next-level visualization 🚀

🔥 Final Thoughts

This journey wasn’t just about graphs.
It was about understanding data visually.

I now think in plots:

  • Trend? → Line plot

  • Categories? → Bar chart

  • Relationship? → Scatter

  • Distribution? → Histogram

  • Share/percentage? → Pie

  • Multi-dimension? → 3D plot

  • Quick results? → Pandas plot

I didn’t just learn data visualization — I learned to communicate with data.